enriched genomic libraries: Topics by Science.gov

Sample records for enriched genomic libraries

Use of RecA protein to enrich for homologous genes in a genomic library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Taidi-Laskowski, B.; Grumet, F.C.; Tyan, D.

1988-08-25

RecA protein-coated probe has been utilized to enrich genomic digests for desired genes in order to facilitate cloning from genomic libraries. Using a previously cloned HLA-B27 gene as the recA-coated enrichment probe, the authors obtained a mean 108x increase in the ratio of specific to nonspecific plaques in lambda libraries screened for B27 variant alleles of estimated 99% homology to the probe. Class I genes of lesser homology were less enriched. Loss of genomic DNA during the enrichment procedure can, however, restrict application of this technique whenever starting genomic DNA is very limited. Nevertheless, the impressive reduction in cloning effort and material makes recA enrichment a useful new tool for cloning homologous genes from genomic DNA.
Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries

PubMed Central

Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.

2013-01-01

Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
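The fold-enrichment figures quoted in the abstract above can be read as the ratio of the endogenous read fraction after capture to the fraction before capture, computed per library. The snippet below is a minimal sketch under that assumption; the function name and the example fractions are hypothetical and are not taken from the paper. Note that the abstract's summary values (a 1.2% average before capture, up to 59% after) describe different libraries and cannot be combined to reproduce the reported 6- to 159-fold range.

```python
def fold_enrichment(pre_fraction: float, post_fraction: float) -> float:
    """Return post-capture / pre-capture fraction of reads mapping to the target genome.

    Both arguments are fractions in [0, 1]; pre_fraction must be greater than zero.
    """
    if not 0 < pre_fraction <= 1:
        raise ValueError("pre_fraction must be in (0, 1]")
    if not 0 <= post_fraction <= 1:
        raise ValueError("post_fraction must be in [0, 1]")
    return post_fraction / pre_fraction


# Hypothetical library: 0.5% of reads map to human before capture, 30% after.
example = fold_enrichment(0.005, 0.30)
print(f"{example:.0f}-fold enrichment")
```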
Developing new microsatellite markers in walnut (Juglans regia L.) from Juglans nigra genomic GA enriched library

Treesearch

Hayat Topcu; Nergiz Coban; Keith Woeste; Mehmet Sutyemez; Salih Kafkas

2015-01-01

We attempted to develop new polymorphic SSR primer pairs in walnut using sequences derived from a Juglans nigra L. genomic library enriched for GA repeats. The 94 designed SSR primer pairs were subjected to gradient PCR in 12 walnut cultivars to determine their optimum annealing temperatures and whether they produce bands. Then, the...
A new comprehensive method for detection of livestock-related pathogenic viruses using a target enrichment system.

PubMed

Oba, Mami; Tsuchiaka, Shinobu; Omatsu, Tsutomu; Katayama, Yukie; Otomaru, Konosuke; Hirata, Teppei; Aoki, Hiroshi; Murata, Yoshiteru; Makino, Shinji; Nagai, Makoto; Mizutani, Tetsuya

2018-01-08

We tested the usefulness of SureSelect, a target enrichment system and comprehensive viral nucleic acid detection method, for rapid identification of viral pathogens in feces samples of cattle, pigs and goats. This system enriches nucleic acids of target viruses in clinical/field samples by using a library of biotinylated RNAs with sequences complementary to the target viruses. The enriched nucleic acids are amplified by PCR and subjected to next generation sequencing to identify the target viruses. In many samples, the SureSelect target enrichment method increased the efficiency of detection of the viruses listed in the biotinylated RNA library. Furthermore, this method enabled us to determine a nearly full-length genome sequence of porcine parainfluenza virus 1 and greatly increased Breadth, a value indicating the ratio of the mapping consensus length in the reference genome, in pig samples. Our data showed the usefulness of the SureSelect target enrichment system for comprehensive analysis of genomic information of various viruses in field samples. Copyright © 2017 Elsevier Inc. All rights reserved.
Development of microsatellite markers from an enriched genomic library for genetic analysis of melon (Cucumis melo L.)

PubMed Central

Ritschel, Patricia Silva; Lins, Tulio Cesar de Lima; Tristan, Rodrigo Lourenço; Buso, Gláucia Salles Cortopassi; Buso, José Amauri; Ferreira, Márcio Elias

2004-01-01

Background Despite the great advances in genomic technology observed in several crop species, the availability of molecular tools such as microsatellite markers has been limited in melon (Cucumis melo L.) and cucurbit species. The development of microsatellite markers will have a major impact on genetic analysis and breeding of melon, especially on the generation of marker saturated genetic maps and implementation of marker assisted breeding programs. Genomic microsatellite enriched libraries can be an efficient alternative for marker development in such species. Results Seven hundred clones containing microsatellite sequences from a Tsp-AG/TC microsatellite enriched library were identified and one-hundred and forty-four primer pairs designed and synthesized. When 67 microsatellite markers were tested on a panel of melon and other cucurbit accessions, 65 revealed DNA polymorphisms among the melon accessions. For some cucurbit species, such as Cucumis sativus, up to 50% of the melon microsatellite markers could be readily used for DNA polymorphism assessment, representing a significant reduction of marker development costs. A random sample of 25 microsatellite markers was extracted from the new microsatellite marker set and characterized on 40 accessions of melon, generating an allelic frequency database for the species. The average expected heterozygosity was 0.52, varying from 0.45 to 0.70, indicating that a small set of selected markers should be sufficient to solve questions regarding genotype identity and variety protection. Genetic distances based on microsatellite polymorphism were congruent with data obtained from RAPD marker analysis. Mapping analysis was initiated with 55 newly developed markers and most primers showed segregation according to Mendelian expectations. Linkage analysis detected linkage between 56% of the markers, distributed in nine linkage groups.
Conclusions Genomic library microsatellite enrichment is an efficient procedure for marker development in melon. One-hundred and forty-four new markers were developed from Tsp-AG/TC genomic library. This is the first reported attempt of successfully using enriched library for microsatellite marker development in the species. A sample of the microsatellite markers tested proved efficient for genetic analysis of melon, including genetic distance estimates and identity tests. Linkage analysis indicated that the markers developed are dispersed throughout the genome and should be very useful for genetic analysis of melon. PMID:15149552
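The expected heterozygosity values reported in the abstract above are per-locus statistics derived from allele frequencies. The abstract does not say which estimator was used; the sketch below assumes the basic form, one minus the sum of squared allele frequencies, without any small-sample correction. The function name and the example allele calls are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity at one locus: 1 - sum(p_i ** 2).

    `alleles` is a flat iterable of allele calls (for example, fragment sizes),
    two per diploid accession. No small-sample correction is applied.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls supplied")
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Hypothetical locus with two equally frequent alleles (sizes in bp).
calls = [150, 150, 152, 152, 150, 152, 150, 152]
print(expected_heterozygosity(calls))
```

Averaging this quantity over loci gives a panel-level value comparable in kind to the mean reported for the 25 markers.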
Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

USDA-ARS's Scientific Manuscript database

Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...
A novel helper phage enabling construction of genome-scale ORF-enriched phage display libraries.

PubMed

Gupta, Amita; Shrivastava, Nimisha; Grover, Payal; Singh, Ajay; Mathur, Kapil; Verma, Vaishali; Kaur, Charanpreet; Chaudhary, Vijay K

2013-01-01

Phagemid-based expression of cloned genes fused to the gIIIP coding sequence and rescue using helper phages, such as VCSM13, has been used extensively for constructing large antibody phage display libraries. However, for randomly primed cDNA and gene fragment libraries, this system encounters reading frame problems wherein only one of 18 phages display the translated foreign peptide/protein fused to phagemid-encoded gIIIP. The elimination of phages carrying out-of-frame inserts is vital in order to improve the quality of phage display libraries. In this study, we designed a novel helper phage, AGM13, which carries trypsin-sensitive sites within the linker regions of gIIIP. This renders the phage highly sensitive to trypsin digestion, which abolishes its infectivity. For open reading frame (ORF) selection, the phagemid-borne phages are rescued using AGM13, so that clones with in-frame inserts express fusion proteins with phagemid-encoded trypsin-resistant gIIIP, which becomes incorporated into the phages along with a few copies of AGM13-encoded trypsin-sensitive gIIIP. In contrast, clones with out-of-frame inserts produce phages carrying only AGM13-encoded trypsin-sensitive gIIIP. Trypsin treatment of the phage population renders the phages with out-of-frame inserts non-infectious, whereas phages carrying in-frame inserts remain fully infectious and can hence be enriched by infection. This strategy was applied efficiently at a genome scale to generate an ORF-enriched whole genome fragment library from Mycobacterium tuberculosis, in which nearly 100% of the clones carried in-frame inserts after selection. The ORF-enriched libraries were successfully used for identification of linear and conformational epitopes for monoclonal antibodies specific to mycobacterial proteins.
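The abstract above states that only one of 18 phages displays the foreign peptide but does not give the derivation. A common way to arrive at that figure, assumed here, is that a random fragment must be cloned in the correct orientation (1 of 2), start in the correct reading frame at the upstream junction (1 of 3), and remain in frame with gIIIP at the downstream junction (1 of 3). The sketch enumerates those equally likely outcomes; all names are illustrative.

```python
from fractions import Fraction
from itertools import product


def in_frame_fraction():
    """Fraction of random inserts that are displayed, under the stated assumptions.

    Each insert independently has one of two orientations, one of three
    reading frames at the 5' junction, and one of three at the 3' junction.
    Only forward inserts in frame at both junctions yield a fusion protein.
    """
    orientations = ("forward", "reverse")
    frames = (0, 1, 2)  # frame offset relative to the vector coding sequence
    outcomes = list(product(orientations, frames, frames))
    displayed = [o for o in outcomes if o == ("forward", 0, 0)]
    return Fraction(len(displayed), len(outcomes))


print(in_frame_fraction())
```

This simple count ignores stop codons inside the insert, which would lower the displayed fraction further.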
Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics

PubMed Central

Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron

2014-01-01

• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. PMID:25225629
Chicken microsatellite markers isolated from libraries enriched for simple tandem repeats.

PubMed

Gibbs, M; Dawson, D A; McCamley, C; Wardle, A F; Armour, J A; Burke, T

1997-12-01

The total number of microsatellite loci is considered to be at least 10-fold lower in avian species than in mammalian species. Therefore, efficient large-scale cloning of chicken microsatellites, as required for the construction of a high-resolution linkage map, is facilitated by the construction of libraries using an enrichment strategy. In this study, a plasmid library enriched for tandem repeats was constructed from chicken genomic DNA by hybridization selection. Using this technique the proportion of recombinant clones that cross-hybridized to probes containing simple tandem repeats was raised to 16%, compared with < 0.1% in a non-enriched library. Primers were designed from 121 different sequences. Polymerase chain reaction (PCR) analysis of two chicken reference pedigrees enabled 72 loci to be localized within the collaborative chicken genetic map, and at least 30 of the remaining loci have been shown to be informative in these or other crosses.
Identification of immunogenic polypeptides from a Mycoplasma hyopneumoniae genome library by phage display.

PubMed

Kügler, Jonas; Nieswandt, Simone; Gerlach, Gerald F; Meens, Jochen; Schirrmann, Thomas; Hust, Michael

2008-09-01

The identification of immunogenic polypeptides of pathogens is helpful for the development of diagnostic assays and therapeutic applications like vaccines. Routinely, these proteins are identified by two-dimensional polyacrylamide gel electrophoresis and Western blot using convalescent serum, followed by mass spectrometry. This technology, however, is limited, because low or differentially expressed proteins, e.g. dependent on pathogen-host interaction, cannot be identified. In this work, we developed and improved an M13 genomic phage display-based method for the selection of immunogenic polypeptides of Mycoplasma hyopneumoniae, a pathogen causing porcine enzootic pneumonia. The fragmented genome of M. hyopneumoniae was cloned into a phage display vector, and the genomic library was packaged using the helper phage Hyperphage to enrich open reading frames (ORFs). Afterwards, the phage display library was screened by panning using convalescent serum. The analysis of individual phage clones resulted in the identification of five genes encoding immunogenic proteins, only two of which had been previously identified and described as immunogenic. This M13 genomic phage display, directly combining ORF enrichment and the presentation of the corresponding polypeptide on the phage surface, complements proteome-based methods for the identification of immunogenic polypeptides and is particularly well suited for use in mycoplasma species.
Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

PubMed Central

2011-01-01

Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. 
Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon usages of melon full-length transcripts were largely similar to those of Arabidopsis coding sequences. Conclusion The collection of melon ESTs generated from full-length enriched and standard cDNA libraries is expected to play significant roles in annotating the melon genome. The ESTs and associated analysis results will be useful resources for gene discovery, functional analysis, marker-assisted breeding of melon and closely related species, comparative genomic studies and for gaining insights into gene expression patterns. PMID:21599934
Undermethylated DNA as a source of microsatellites from a conifer genome.

PubMed

Zhou, Y; Bui, T; Auckland, L D; Williams, C G

2002-02-01

Developing microsatellites from the large, highly duplicated conifer genome requires special tools. To improve the efficiency of developing Pinus taeda L. microsatellites, undermethylated (UM) DNA fragments were used to construct a microsatellite-enriched copy library. A methylation-sensitive restriction enzyme, McrBC, was used to enrich for UM DNA before library construction. Digested DNA fragments larger than 9 kb were then excised and digested with RsaI and used to construct nine dinucleotide and trinucleotide libraries. A total of 1016 microsatellite-positive clones were detected among 11 904 clones and 620 of these were unique. Of 245 primer sets that produced a PCR product, 113 could be developed as UM microsatellite markers and 70 were polymorphic. Inheritance and marker informativeness were tested for a random sample of 36 polymorphic markers using a three-generation outbred pedigree. Thirty-one microsatellites (86%) had single-locus inheritance despite the highly duplicated nature of the P. taeda genome. Nineteen UM microsatellites had highly informative intercross mating type configurations. Allele number and frequency were estimated for eleven UM microsatellites using a population survey. Allele numbers for these UM microsatellites ranged from 3 to 12 with an average of 5.7 alleles/locus. Frequencies for the 63 alleles were mostly in the low-common range; only 14 of the 63 were in the rare allele (q < 0.05) class. Enriching for UM DNA was an efficient method for developing polymorphic microsatellites from a large plant genome.
Consequences of Normalizing Transcriptomic and Genomic Libraries of Plant Genomes Using a Duplex-Specific Nuclease and Tetramethylammonium Chloride

PubMed Central

Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

2013-01-01

Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088
Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

PubMed

Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

2013-01-01

Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.
Effects of methylation-sensitive enzymes on the enrichment of genic SNPs and the degree of genome complexity reduction in a two-enzyme genotyping-by-sequencing (GBS) approach: a case study in oil palm (Elaeis guineensis).

PubMed

Pootakham, Wirulda; Sonthirod, Chutima; Naktang, Chaiwat; Jomchai, Nukoon; Sangsrakru, Duangjai; Tangphatsornruang, Sithichoke

2016-01-01

Advances in next generation sequencing have facilitated large-scale single nucleotide polymorphism (SNP) discovery in many crop species. The genotyping-by-sequencing (GBS) approach couples next generation sequencing with genome complexity reduction techniques to simultaneously identify and genotype SNPs. The choice of enzymes used in GBS library preparation depends on several factors, including the number of markers required, the desired level of multiplexing, and whether enrichment of genic SNPs is preferred. We evaluated various combinations of methylation-sensitive (AatII, PstI, MspI) and methylation-insensitive (SphI, MseI) enzymes for their effectiveness in genome complexity reduction and enrichment of genic SNPs. We discovered that the use of two methylation-sensitive enzymes effectively reduced genome complexity and did not require a size selection step. Conversely, the genome coverage of libraries constructed with methylation-insensitive enzymes was quite high, and an additional size selection step may be required to increase the overall read depth. We also demonstrated the effectiveness of methylation-sensitive enzymes in enriching for SNPs located in genic regions. When two methylation-insensitive enzymes were used, only 16% of SNPs identified were located in genes and 18% in the vicinity (± 5 kb) of the genic regions, while most SNPs resided in the intergenic regions. In contrast, a remarkable degree of enrichment was observed when two methylation-sensitive enzymes were employed. Almost two thirds of the SNPs were located either inside (32-36%) or in the vicinity (28-31%) of the genic regions. These results provide useful information to help researchers choose appropriate GBS enzymes in oil palm and other crop species.
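The complexity-reduction idea in the abstract above can be illustrated with a rough in silico double digest: locate the recognition sites of two enzymes and keep the fragments that are bounded by one site of each. This is a simplified sketch and not the authors' pipeline. It uses the start of each recognition site as an approximate cut position, treats the sequence as fully unmethylated (methylation sensitivity cannot be inferred from sequence alone), and the function names and toy sequence are hypothetical.

```python
import re

# Recognition sequences of the enzymes named in the abstract.
SITES = {
    "AatII": "GACGTC",
    "PstI": "CTGCAG",
    "MspI": "CCGG",
    "SphI": "GCATGC",
    "MseI": "TTAA",
}


def site_positions(seq, enzyme):
    """Start positions of every recognition site, including overlapping ones."""
    site = SITES[enzyme]
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def double_digest_fragments(seq, enzyme_a, enzyme_b, min_len=0, max_len=None):
    """Fragments flanked by one site of each enzyme, as (start, end, length).

    Positions are recognition-site starts, so lengths are approximate.
    """
    cuts = sorted(
        [(p, enzyme_a) for p in site_positions(seq, enzyme_a)]
        + [(p, enzyme_b) for p in site_positions(seq, enzyme_b)]
    )
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        if left != right and length >= min_len and (max_len is None or length <= max_len):
            fragments.append((start, end, length))
    return fragments


# Toy sequence: one PstI site followed by two MspI sites.
toy = "AAAA" + "CTGCAG" + "A" * 20 + "CCGG" + "A" * 10 + "CCGG" + "AAAA"
print(double_digest_fragments(toy, "PstI", "MspI"))
```

The `min_len` and `max_len` arguments stand in for a size selection step, which the abstract reports as unnecessary for the methylation-sensitive enzyme pairs.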
Low-frequency chimeric yeast artificial chromosome libraries from flow-sorted human chromosomes 16 and 21.

PubMed Central

McCormick, M K; Campbell, E; Deaven, L; Moyzis, R

1993-01-01

Construction of chromosome-specific yeast artificial chromosome (YAC) libraries from sorted chromosomes was undertaken (i) to eliminate drawbacks associated with first-generation total genomic YAC libraries, such as the high frequency of chimeric YACs, and (ii) to provide an alternative method for generating chromosome-specific YAC libraries in addition to isolating such collections from a total genomic library. Chromosome-specific YAC libraries highly enriched for human chromosomes 16 and 21 were constructed. By maximizing the percentage of fragments with two ligatable ends and performing yeast transformations with less than saturating amounts of DNA in the presence of carrier DNA, YAC libraries with a low percentage of chimeric clones were obtained. The smaller number of YAC clones in these chromosome-specific libraries reduces the effort involved in PCR-based screening and allows hybridization methods to be a manageable screening approach. PMID:8430075
Multiplexed microsatellite recovery using massively parallel sequencing

USGS Publications Warehouse

Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

2011-01-01

Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
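Screening short reads for di- or trinucleotide microsatellites, as described in the abstract above, can be approximated with a regular expression that looks for a 2- or 3-base motif repeated in tandem. This is a minimal sketch and not the authors' software. The minimum repeat count is an assumed threshold, and the function name is hypothetical.

```python
import re


def find_ssrs(read, min_repeats=6):
    """Find perfect di- and trinucleotide tandem repeats in one read.

    Returns a list of (motif, start, repeat_count) tuples. Homopolymer runs
    such as AAAAAA are skipped because they are not di- or trinucleotide SSRs.
    `min_repeats` is an assumed cutoff, not a value from the paper.
    """
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(read.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer matched as a repeated AA/AAA-style motif
        repeats = len(match.group(0)) // len(motif)
        hits.append((motif, match.start(), repeats))
    return hits


print(find_ssrs("TT" + "GA" * 6 + "CC"))
print(find_ssrs("CAT" * 6))
```

Counting the reads with at least one hit, then collapsing identical reads, would give figures comparable in kind to the unique microreads reported in the abstract.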
Methods comparison for microsatellite marker development: Different isolation methods, different yield efficiency

NASA Astrophysics Data System (ADS)

Zhan, Aibin; Bao, Zhenmin; Hu, Xiaoli; Lu, Wei; Hu, Jingjie

2009-06-01

Microsatellite markers have become one of the most important molecular tools in many research fields. Whole-genome surveys in molecular ecology, quantitative genetics and genomics require a large number of microsatellite markers. It is therefore necessary to identify versatile, low-cost, efficient, and time- and labor-saving methods for developing a large panel of microsatellite markers. In this study, we used Zhikong scallop (Chlamys farreri) as the target species to compare the efficiency of five methods derived from three strategies for microsatellite marker development. The results showed that constructing a small-insert genomic DNA library was inefficient, whereas the microsatellite-enriched strategy greatly improved isolation efficiency. Although mining public databases saves time and cost, it is difficult to obtain a large number of microsatellite markers this way, mainly because limited sequence data for non-model species are deposited in public databases. Based on these results, we recommend two methods, microsatellite-enriched library construction and FIASCO-colony hybridization, for large-scale microsatellite marker development. Both methods are derived from the microsatellite-enriched strategy. The experimental results obtained from Zhikong scallop also provide a reference for microsatellite marker development in other species with large genomes.
Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function.

PubMed

Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J

2007-06-01

As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
Informative genomic microsatellite markers for efficient genotyping applications in sugarcane.

PubMed

Parida, Swarup K; Kalia, Sanjay K; Kaul, Sunita; Dalal, Vivek; Hemaprabha, G; Selvi, Athiappan; Pandit, Awadhesh; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; Srivastava, Prem Shankar; Singh, Nagendra K; Mohapatra, Trilochan

2009-01-01

Genomic microsatellite markers are capable of revealing a high degree of polymorphism. Sugarcane (Saccharum sp.), which has a complex polyploid genome, requires a larger number of such informative markers for various applications in genetics and breeding. With the objective of generating a large set of microsatellite markers designated as Sugarcane Enriched Genomic MicroSatellite (SEGMS), 6,318 clones from genomic libraries of two hybrid sugarcane cultivars enriched with 18 different microsatellite repeat-motifs were sequenced to generate 4.16 Mb of high-quality sequence. Microsatellites were identified in 1,261 of the 5,742 non-redundant clones, which accounted for 22% enrichment of the libraries. Retro-transposon association was observed for 23.1% of the identified microsatellites. The utility of the microsatellite-containing genomic sequences was demonstrated by higher primer designing potential (90%) and PCR amplification efficiency (87.4%). A total of 1,315 markers including 567 class I microsatellite markers were designed and placed in the public domain for unrestricted use. The level of polymorphism detected by these markers among sugarcane species, genera, and varieties was 88.6%, while the cross-transferability rate was 93.2% within the Saccharum complex and 25% to cereals. Cloning and sequencing of size variant amplicons revealed that variation in the number of repeat-units was the main source of SEGMS fragment length polymorphism. The high level of polymorphism and wide range of genetic diversity (0.16-0.82 with an average of 0.44) assayed with the SEGMS markers suggested their usefulness in various genotyping applications in sugarcane.
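The 22% enrichment figure in the abstract above follows directly from the clone counts it reports: 1,261 microsatellite-containing clones out of 5,742 non-redundant clones. The short check below reproduces that arithmetic; the function name is hypothetical, and only those two counts come from the abstract.

```python
def enrichment_percent(positive_clones: int, total_clones: int) -> float:
    """Percentage of non-redundant clones that contain a microsatellite."""
    if total_clones <= 0:
        raise ValueError("total_clones must be positive")
    if not 0 <= positive_clones <= total_clones:
        raise ValueError("positive_clones must be between 0 and total_clones")
    return 100.0 * positive_clones / total_clones


# Clone counts taken from the abstract.
print(round(enrichment_percent(1261, 5742)))
```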

UPIC + GO: Zeroing in on informative markers

USDA-ARS's Scientific Manuscript database

Microsatellites/SSRs (simple sequence repeats) have become a powerful tool in genomic biology because of their broad range of applications and availability. An efficient method recently developed to generate microsatellite-enriched libraries used in combination with high throughput DNA pyrosequencin...
Optimization of design and production strategies for novel adeno-associated viral display peptide libraries.

PubMed

Körbelin, J; Hunger, A; Alawi, M; Sieber, T; Binder, M; Trepel, M

2017-08-01

Libraries displaying random peptides on the surface of adeno-associated virus (AAV) are powerful tools for the generation of target-specific gene therapy vectors. However, for unknown reasons the success rate of AAV library screenings is variable and the influence of the production procedure has not been thoroughly evaluated. During library screenings, the capsid variants with the most favorable tropism are enriched over several selection rounds on a target of choice and identified by subsequent sequencing of the encapsidated viral genomes encoding the library capsids with targeting peptide insertions. Thus, a high capsid-genome correlation is crucial to obtain the correct information about the selected capsid variants. Producing AAV libraries by a two-step protocol with pseudotyped library transfer shuttles has been proposed as one way to ensure such a correlation. Here we show that AAV2 libraries produced by such a protocol via transfer shuttles display an unexpected additional bias in the amino-acid composition which confers increased heparin affinity and thus similarity to wildtype AAV2 tropism. This bias may fundamentally impair the intended use of AAV libraries, discouraging the use of transfer shuttles for the production of AAV libraries in the future.
An optimized methodology for whole genome sequencing of RNA respiratory viruses from nasopharyngeal aspirates.

PubMed

Goya, Stephanie; Valinotto, Laura E; Tittarelli, Estefania; Rojo, Gabriel L; Nabaes Jodar, Mercedes S; Greninger, Alexander L; Zaiat, Jonathan J; Marti, Marcelo A; Mistchenko, Alicia S; Viegas, Mariana

2018-01-01

Over the last decade, the number of viral genome sequences deposited in available databases has grown exponentially. However, sequencing methodologies vary widely and many published works have relied on viral enrichment by viral culture or nucleic acid amplification with specific primers rather than through unbiased techniques such as metagenomics. The genome of RNA viruses is highly variable and these enrichment methodologies may be difficult to achieve or may bias the results. In order to obtain genomic sequences of human respiratory syncytial virus (HRSV) from positive nasopharyngeal aspirates, diverse methodologies were evaluated and compared. A total of 29 nearly complete and complete viral genomes were obtained. The best performance was achieved with a DNase I treatment of the RNA directly extracted from the nasopharyngeal aspirate (NPA), sequence-independent single-primer amplification (SISPA) and library preparation performed with the Nextera XT DNA Library Prep Kit with manual normalization. An average of 633,789 and 1,674,845 filtered reads per library were obtained with the MiSeq and NextSeq 500 platforms, respectively. The higher output of the NextSeq 500 was accompanied by an increase in the percentage of duplicated reads generated during SISPA (from an average of 1.5% duplicated viral reads in MiSeq to an average of 74% in NextSeq 500). HRSV genome recovery was not affected by the presence or absence of duplicated reads, but the computational demand during the analysis was increased. Considering that only samples with viral load ≥ E+06 copies/ml NPA were tested, no correlation was observed between sample viral loads and the number of total filtered reads, nor with the mapped viral reads. The HRSV genomes showed a mean coverage of 98.46% with the best methodology. In addition, genomes of human metapneumovirus (HMPV), human rhinovirus (HRV) and human parainfluenza virus types 1-3 (HPIV1-3) were also obtained with the selected optimal methodology.
General M13 phage display: M13 phage display in identification and characterization of protein-protein interactions.

PubMed

Hertveldt, Kirsten; Beliën, Tim; Volckaert, Guido

2009-01-01

In M13 phage display, proteins and peptides are exposed on one of the surface proteins of filamentous phage particles and become accessible to affinity enrichment against a bait of interest. We describe the construction of fragmented whole genome and gene fragment phage display libraries and interaction selection by panning. This strategy allows the identification and characterization of interacting proteins on a genomic scale by screening the fragmented "proteome" against protein baits. Gene fragment libraries allow a more in depth characterization of the protein-protein interaction site by identification of the protein region involved in the interaction.
Microsatellite Markers for Raspberries and Blackberries

USDA-ARS's Scientific Manuscript database

Twelve microsatellites were isolated from SSR-enriched genomic libraries of Rubus idaeus L. ‘Meeker’ red raspberry (diploid) and R. loganobaccus L. H. Bailey ‘Marion’ blackberry-raspberry hybrid (hexaploid). These primer pairs, with the addition of one developed from a GenBank R. idaeus sequence, w...
Direct cloning from enrichment cultures, a reliable strategy for isolation of complete operons and genes from microbial consortia.

PubMed

Entcheva, P; Liebl, W; Johann, A; Hartsch, T; Streit, W R

2001-01-01

Enrichment cultures of microbial consortia enable the diverse metabolic and catabolic activities of these populations to be studied on a molecular level and to be explored as potential sources for biotechnology processes. We have used a combined approach of enrichment culture and direct cloning to construct cosmid libraries with large (>30-kb) inserts from microbial consortia. Enrichment cultures were inoculated with samples from five environments, and high amounts of avidin were added to the cultures to favor growth of biotin-producing microbes. DNA was extracted from three of these enrichment cultures and used to construct cosmid libraries; each library consisted of between 6,000 and 35,000 clones, with an average insert size of 30 to 40 kb. The inserts contained a diverse population of genomic DNA fragments isolated from the consortia organisms. These three libraries were used to complement the Escherichia coli biotin auxotrophic strain ATCC 33767 Delta(bio-uvrB). Initial screens resulted in the isolation of seven different complementing cosmid clones, carrying biotin biosynthesis operons. Biotin biosynthesis capabilities and growth under defined conditions of four of these clones were studied. Biotin measured in the different culture supernatants ranged from 42 to 3,800 pg/ml/optical density unit. Sequencing the identified biotin synthesis genes revealed high similarities to bio operons from gram-negative bacteria. In addition, random sequencing identified other interesting open reading frames, as well as two operons, the histidine utilization operon (hut), and the cluster of genes involved in biosynthesis of molybdopterin cofactors in bacteria (moaABCDE).
Identification and characterization of dinucleotide repeat (CA)n markers for genetic mapping in dog

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ostrander, E.A.; Sprague, G.F. Jr.; Rine, J.

1993-04-01

A large block of simple sequence repeat (SSR) polymorphisms for the dog genome has been isolated and characterized. Screening of primary libraries by conventional hybridization methods, as well as screening of enriched marker-selected libraries, led to the isolation of a large number of genomic clones that contained (CA)n repeats. The sequences of 101 clones showed that the size and complexity of (CA)n repeats in the dog genome were similar to those reported for these markers in the human genome. Detailed analysis of a representative subset of these markers revealed that most markers were moderately to highly polymorphic, with PIC values exceeding 0.70 for 33% of the markers tested. An association between higher PIC values and markers containing longer (CA)n repeats was observed in these studies, as previously noted for similar markers in the human genome. A list of primer sequences that tag each characterized marker is provided, and a comprehensive system of nomenclature for the dog genome is suggested. 28 refs., 4 figs., 2 tabs.
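
For readers unfamiliar with the PIC values mentioned above, the following Python sketch computes polymorphism information content from allele counts using the commonly cited definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2. That the authors used exactly this formula is an assumption; the abstract does not say. The function name `pic` is hypothetical.

```python
from itertools import combinations

def pic(allele_counts):
    """Polymorphism information content from raw allele counts.

    Uses the commonly cited definition; whether the study above applied
    exactly this formula is an assumption.
    """
    total = sum(allele_counts)
    p = [c / total for c in allele_counts]
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - homozygosity - cross_term

# Two equally frequent alleles versus four equally frequent alleles.
print(pic([50, 50]))
print(pic([25, 25, 25, 25]))
```

Under this definition a marker needs at least four roughly equally frequent alleles to exceed the 0.70 threshold quoted in the abstract, which is consistent with longer, more variable repeats scoring higher.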
Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

PubMed Central

Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

2012-01-01

Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365
Mapping the zebrafish brain methylome using reduced representation bisulfite sequencing

PubMed Central

Chatterjee, Aniruddha; Ozaki, Yuichi; Stockwell, Peter A; Horsfield, Julia A; Morison, Ian M; Nakagawa, Shinichi

2013-01-01

Reduced representation bisulfite sequencing (RRBS) has been used to profile DNA methylation patterns in mammalian genomes such as human, mouse and rat. The methylome of the zebrafish, an important animal model, has not yet been characterized at base-pair resolution using RRBS. Therefore, we evaluated the technique of RRBS in this model organism by generating four single-nucleotide resolution DNA methylomes of adult zebrafish brain. We performed several simulations to show the distribution of fragments and enrichment of CpGs in different in silico reduced representation genomes of zebrafish. Four RRBS brain libraries generated 98 million sequenced reads and had higher frequencies of multiple mapping than equivalent human RRBS libraries. The zebrafish methylome indicates there is higher global DNA methylation in the zebrafish genome compared with its equivalent human methylome. This observation was confirmed by RRBS of zebrafish liver. High coverage CpG dinucleotides are enriched in CpG island shores more than in the CpG island core. We found that 45% of the mapped CpGs reside in gene bodies, and 7% in gene promoters. This analysis provides a roadmap for generating reproducible base-pair level methylomes for zebrafish using RRBS and our results provide the first evidence that RRBS is a suitable technique for global methylation analysis in zebrafish. PMID:23975027
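
The in silico reduced representation genomes described above can be pictured with a small Python sketch that digests a sequence at a restriction site, size-selects the fragments, and counts CpGs in what remains. The enzyme site (CCGG, cut after the first C, as for MspI), the size window, and all function names are assumptions for illustration; the abstract does not specify the parameters of the authors' simulations.

```python
def digest(seq, site="CCGG", cut_offset=1):
    """Cut seq at every occurrence of site; return the fragments in order.

    The CCGG site and offset mimic an MspI-style digest (an assumption).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def size_select(fragments, min_len, max_len):
    """Keep fragments whose length falls inside the chosen window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

def cpg_count(fragments):
    """Total number of CG dinucleotides across the fragments."""
    return sum(f.count("CG") for f in fragments)

toy = "AACCGGTTTTCCGGAA"
frags = digest(toy)
kept = size_select(frags, 5, 8)  # toy window; real windows are far larger
print(frags, kept, cpg_count(kept))
```

Repeating this over a whole genome with different size windows gives the fragment distributions and CpG enrichment that such simulations compare.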
Measuring Sister Chromatid Cohesion Protein Genome Occupancy in Drosophila melanogaster by ChIP-seq.

PubMed

Dorsett, Dale; Misulovin, Ziva

2017-01-01

This chapter presents methods to conduct and analyze genome-wide chromatin immunoprecipitation of the cohesin complex and the Nipped-B cohesin loading factor in Drosophila cells using high-throughput DNA sequencing (ChIP-seq). Procedures for isolation of chromatin, immunoprecipitation, and construction of sequencing libraries for the Ion Torrent Proton high throughput sequencer are detailed, and computational methods to calculate occupancy as input-normalized fold-enrichment are described. The results obtained by ChIP-seq are compared to those obtained by ChIP-chip (genomic ChIP using tiling microarrays), and the effects of sequencing depth on the accuracy are analyzed. ChIP-seq provides similar sensitivity and reproducibility as ChIP-chip, and identifies the same broad regions of occupancy. The locations of enrichment peaks, however, can differ between ChIP-chip and ChIP-seq, and low sequencing depth can splinter broad regions of occupancy into distinct peaks.
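
Input-normalized fold-enrichment, as mentioned above, can be sketched in Python as follows. This is a generic reads-per-million normalization with a pseudocount, offered as an assumption about the general approach rather than the chapter's actual procedure; the function name and parameters are hypothetical.

```python
def fold_enrichment(chip_counts, input_counts, chip_total, input_total,
                    pseudocount=1.0):
    """Per-bin ChIP/input ratio after scaling both to reads per million.

    pseudocount (in RPM units) avoids division by zero in empty input bins.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    ratios = []
    for chip, inp in zip(chip_counts, input_counts):
        chip_rpm = chip * 1e6 / chip_total
        input_rpm = inp * 1e6 / input_total
        ratios.append((chip_rpm + pseudocount) / (input_rpm + pseudocount))
    return ratios

# Two genomic bins; the ChIP library is twice as deep as the input library.
print(fold_enrichment([200, 20], [10, 10], 2_000_000, 1_000_000,
                      pseudocount=0.0))
```

Scaling by library size before taking the ratio is what keeps a deeper ChIP library from looking enriched everywhere.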
Enrichr: a comprehensive gene set enrichment analysis web server 2016 update

PubMed Central

Kuleshov, Maxim V.; Jones, Matthew R.; Rouillard, Andrew D.; Fernandez, Nicolas F.; Duan, Qiaonan; Wang, Zichen; Koplev, Simon; Jenkins, Sherry L.; Jagodnik, Kathleen M.; Lachmann, Alexander; McDermott, Michael G.; Monteiro, Caroline D.; Gundersen, Gregory W.; Ma'ayan, Avi

2016-01-01

Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180 184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr. PMID:27141961
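
To make the idea of gene set enrichment concrete, here is a minimal Python sketch of a generic over-representation test based on the hypergeometric distribution. It does not call Enrichr or reproduce its scoring; the function name and the toy numbers are assumptions for illustration only.

```python
from math import comb

def overrepresentation_pvalue(background, set_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a background containing set_size genes of the annotated set."""
    upper = min(set_size, query_size)
    numerator = sum(
        comb(set_size, i) * comb(background - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return numerator / comb(background, query_size)

# Toy case: 10 background genes, 5 in the set, and all 5 query genes hit it.
print(overrepresentation_pvalue(10, 5, 5, 5))
```

A small p-value means the overlap between the query list and the annotated set is larger than random sampling from the background would usually produce.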
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples

PubMed Central

Quick, Josh; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

2018-01-01

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples without isolation remains challenging for viruses such as Zika, where metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence complete genomes comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimised library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved starting with clinical samples in 1-2 days following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. PMID:28538739
Construction of CRISPR Libraries for Functional Screening.

PubMed

Carstens, Carsten P; Felts, Katherine A; Johns, Sarah E

2018-01-01

Identification of gene function has been aided by the ability to generate targeted gene knockouts or transcriptional repression using the CRISPR/CAS9 system. Using pooled libraries of guide RNA expression vectors that direct CAS9 to a specific genomic site allows identification of genes that are either enriched or depleted in response to a selection scheme, thus linking the affected gene to the chosen phenotype. The quality of the data generated by the screening is dependent on the quality of the guide RNA delivery library with regards to error rates and especially evenness of distribution of the guides. Here, we describe a method for constructing complex plasmid libraries based on pooled designed oligomers with high representation and tight distributions. The procedure allows construction of plasmid libraries of >60,000 members with a 95th/5th percentile ratio of less than 3.5.
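
The 95th/5th percentile ratio quoted above is a measure of how evenly guides are represented in a library. The Python sketch below computes it from per-guide read counts using a nearest-rank percentile; the percentile convention and function names are assumptions, since the abstract does not state how the ratio was calculated.

```python
def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile (pct is an integer from 1 to 100)."""
    n = len(sorted_values)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_values[rank - 1]

def skew_ratio(guide_counts, high=95, low=5):
    """Ratio of the high to the low percentile of per-guide read counts."""
    values = sorted(guide_counts)
    low_value = nearest_rank_percentile(values, low)
    if low_value == 0:
        return float("inf")  # some guides are effectively missing
    return nearest_rank_percentile(values, high) / low_value

# Toy library with counts 1..100, then a perfectly even library.
print(skew_ratio(range(1, 101)))
print(skew_ratio([250] * 1000))
```

A ratio close to 1 indicates a tight distribution; by the abstract's criterion, a library with a ratio below 3.5 would count as evenly represented.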
Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.

PubMed

Kuleshov, Maxim V; Jones, Matthew R; Rouillard, Andrew D; Fernandez, Nicolas F; Duan, Qiaonan; Wang, Zichen; Koplev, Simon; Jenkins, Sherry L; Jagodnik, Kathleen M; Lachmann, Alexander; McDermott, Michael G; Monteiro, Caroline D; Gundersen, Gregory W; Ma'ayan, Avi

2016-07-08

Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180 184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections.

PubMed

Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe; Avarre, Jean-Christophe

2016-01-01

Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.
Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections

PubMed Central

Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe

2016-01-01

Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3. PMID:27703859
Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

PubMed Central

Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

2012-01-01

To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

PubMed Central

Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

2018-01-01

Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139
Genome-Wide Mapping of Furfural Tolerance Genes in Escherichia coli

PubMed Central

Glebes, Tirzah Y.; Sandoval, Nicholas R.; Reeder, Philippa J.; Schilling, Katherine D.; Zhang, Min; Gill, Ryan T.

2014-01-01

Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate. PMID:24489935
Genome-wide mapping of furfural tolerance genes in Escherichia coli.

PubMed

Glebes, Tirzah Y; Sandoval, Nicholas R; Reeder, Philippa J; Schilling, Katherine D; Zhang, Min; Gill, Ryan T

2014-01-01

Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate.

Analysis of genomic regions of Trichoderma harzianum IOC-3844 related to biomass degradation.

PubMed

Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

2015-01-01

Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes.
Analysis of Genomic Regions of Trichoderma harzianum IOC-3844 Related to Biomass Degradation

PubMed Central

Crucello, Aline; Sforça, Danilo Augusto; Horta, Maria Augusta Crivelente; dos Santos, Clelton Aparecido; Viana, Américo José Carvalho; Beloti, Lilian Luzia; de Toledo, Marcelo Augusto Szymanski; Vincentz, Michel; Kuroshu, Reginaldo Massanobu; de Souza, Anete Pereira

2015-01-01

Trichoderma harzianum IOC-3844 secretes high levels of cellulolytic-active enzymes and is therefore a promising strain for use in biotechnological applications in second-generation bioethanol production. However, the T. harzianum biomass degradation mechanism has not been well explored at the genetic level. The present work investigates six genomic regions (~150 kbp each) in this fungus that are enriched with genes related to biomass conversion. A BAC library consisting of 5,760 clones was constructed, with an average insert length of 90 kbp. The assembled BAC sequences revealed 232 predicted genes, 31.5% of which were related to catabolic pathways, including those involved in biomass degradation. An expression profile analysis based on RNA-Seq data demonstrated that putative regulatory elements, such as membrane transport proteins and transcription factors, are located in the same genomic regions as genes related to carbohydrate metabolism and exhibit similar expression profiles. Thus, we demonstrate a rapid and efficient tool that focuses on specific genomic regions by combining a BAC library with transcriptomic data. This is the first BAC-based structural genomic study of the cellulolytic fungus T. harzianum, and its findings provide new perspectives regarding the use of this species in biomass degradation processes. PMID:25836973
Deep Subsurface Life from North Pond: Enrichment, Isolation, Characterization and Genomes of Heterotrophic Bacteria

PubMed Central

Russell, Joseph A.; León-Zayas, Rosa; Wrighton, Kelly; Biddle, Jennifer F.

2016-01-01

Studies of subsurface microorganisms have yielded few environmentally relevant isolates for laboratory studies. In order to address this lack of cultivated microorganisms, we initiated several enrichments on sediment and underlying basalt samples from North Pond, a sediment basin ringed by basalt outcrops underlying an oligotrophic water-column west of the Mid-Atlantic Ridge at 22°N. In contrast to anoxic enrichments, growth was observed in aerobic, heterotrophic enrichments from sediment of IODP Hole U1382B at 4 and 68 m below seafloor (mbsf). These sediment depths, respectively, correspond to the fringes of oxygen penetration from overlying seawater in the top of the sediment column and upward migration of oxygen from oxic seawater from the basalt aquifer below the sediment. Here we report the enrichment, isolation, initial characterization and genomes of three isolated aerobic heterotrophs from North Pond sediments; an Arthrobacter species from 4 mbsf, and Paracoccus and Pseudomonas species from 68 mbsf. These cultivated bacteria are represented in the amplicon 16S rRNA gene libraries created from whole sediments, albeit at low (up to 2%) relative abundance. We provide genomic evidence from our isolates demonstrating that the Arthrobacter and Pseudomonas isolates have the potential to respire nitrate and oxygen, though dissimilatory nitrate reduction could not be confirmed in laboratory cultures. The cultures from this study represent members of abundant phyla, as determined by amplicon sequencing of environmental DNA extracts, and allow for further studies into geochemical factors impacting life in the deep subsurface. PMID:27242705
SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

PubMed Central

2010-01-01

Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value < 0.05). Enrichment ratio 2 calculations showed that > 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped redundant clones together and illustrated that the SSHscreen plots are a useful tool for choosing anonymous clones for sequencing, since redundant clones cluster together on the enrichment ratio plots. Conclusions: We developed the SSHscreen-SSHdb software pipeline, which greatly facilitates gene discovery using suppression subtractive hybridization by improving the selection of clones for sequencing after screening the library on a small number of microarrays. Annotation of the sequence information and collaboration was further enhanced through a web-based SSHdb database, and we illustrated this through identification of drought responsive genes from cowpea, which can now be investigated in gene function studies. SSH is a popular and powerful gene discovery tool, and therefore this pipeline will have application for gene discovery in any biological system, particularly non-model organisms. SSHscreen 2.0.1 and a link to SSHdb are available from http://microarray.up.ac.za/SSHscreen. PMID:20359330
Isolation of human simple repeat loci by hybridization selection.

PubMed

Armour, J A; Neumann, R; Gobert, S; Jeffreys, A J

1994-04-01

We have isolated short tandem repeat arrays from the human genome, using a rapid method involving filter hybridization to enrich for tri- or tetranucleotide tandem repeats. About 30% of clones from the enriched library cross-hybridize with probes containing trimeric or tetrameric tandem arrays, facilitating the rapid isolation of large numbers of clones. In an initial analysis of 54 clones, 46 different tandem arrays were identified. Analysis of these tandem repeat loci by PCR showed that 24 were polymorphic in length; substantially higher levels of polymorphism were displayed by the tetrameric repeat loci isolated than by the trimeric repeats. Primary mapping of these loci by linkage analysis showed that they derive from 17 chromosomes, including the X chromosome. We anticipate the use of this strategy for the efficient isolation of tandem repeats from other sources of genomic DNA, including DNA from flow-sorted chromosomes, and from other species.
Deep subsurface life from North Pond: Enrichment, isolation, characterization and genomes of heterotrophic bacteria

DOE PAGES

Russell, Joseph A.; Leon-Zayas, Rosa; Wrighton, Kelly; ...

2016-05-10

Studies of subsurface microorganisms have yielded few environmentally relevant isolates for laboratory studies. In order to address this lack of cultivated microorganisms, we initiated several enrichments on sediment and underlying basalt samples from North Pond, a sediment basin ringed by basalt outcrops underlying an oligotrophic water column west of the Mid-Atlantic Ridge at 22° N. In contrast to anoxic enrichments, growth was observed in aerobic, heterotrophic enrichments from sediment of IODP Hole U1382B at 4 and 68 m below seafloor (mbsf). These sediment depths, respectively, correspond to the fringes of oxygen penetration from overlying seawater in the top of the sediment column and upward migration of oxygen from oxic seawater from the basalt aquifer below the sediment. Here we report the enrichment, isolation, initial characterization and genomes of three isolated aerobic heterotrophs from North Pond sediments; an Arthrobacter species from 4 mbsf, and Paracoccus and Pseudomonas species from 68 mbsf. These cultivated bacteria are represented in the amplicon 16S rRNA gene libraries created from whole sediments, albeit at low (up to 2%) relative abundance. We provide genomic evidence from our isolates demonstrating that the Arthrobacter and Pseudomonas isolates have the potential to respire nitrate and oxygen, though dissimilatory nitrate reduction could not be confirmed in laboratory cultures. Furthermore, the cultures from this study represent members of abundant phyla, as determined by amplicon sequencing of environmental DNA extracts, and allow for further studies into geochemical factors impacting life in the deep subsurface.
NEBNext Direct: A Novel, Rapid, Hybridization-Based Approach for the Capture and Library Conversion of Genomic Regions of Interest.

PubMed

Emerman, Amy B; Bowman, Sarah K; Barry, Andrew; Henig, Noa; Patel, Kruti M; Gardner, Andrew F; Hendrickson, Cynthia L

2017-07-05

Next-generation sequencing (NGS) is a powerful tool for genomic studies, translational research, and clinical diagnostics that enables the detection of single nucleotide polymorphisms, insertions and deletions, copy number variations, and other genetic variations. Target enrichment technologies improve the efficiency of NGS by only sequencing regions of interest, which reduces sequencing costs while increasing coverage of the selected targets. Here we present NEBNext Direct®, a hybridization-based, target-enrichment approach that addresses many of the shortcomings of traditional target-enrichment methods. This approach features a simple, 7-hr workflow that uses enzymatic removal of off-target sequences to achieve a high specificity for regions of interest. Additionally, unique molecular identifiers are incorporated for the identification and filtering of PCR duplicates. The same protocol can be used across a wide range of input amounts, input types, and panel sizes, enabling NEBNext Direct to be broadly applicable across a wide variety of research and diagnostic needs. © 2017 by John Wiley & Sons, Inc.
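The abstract above mentions unique molecular identifiers (UMIs) for filtering PCR duplicates. A minimal sketch of that general idea follows: reads that share both a mapping position and a UMI are treated as copies of one original molecule, and a single representative is kept. The record layout (the keys `chrom`, `start`, `strand`, `umi`, `mapq`) and the keep-highest-quality rule are assumptions made for illustration, not the vendor's actual pipeline.

```python
def dedup_by_umi(reads):
    """Keep one read per (chrom, start, strand, UMI) group.

    `reads` is an iterable of dicts with HYPOTHETICAL keys
    'chrom', 'start', 'strand', 'umi' and 'mapq'. Within each group
    the read with the highest mapping quality is retained.
    """
    best = {}
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"], read["umi"])
        if key not in best or read["mapq"] > best[key]["mapq"]:
            best[key] = read
    return list(best.values())

reads = [
    {"chrom": "chr1", "start": 100, "strand": "+", "umi": "ACGT", "mapq": 60},
    {"chrom": "chr1", "start": 100, "strand": "+", "umi": "ACGT", "mapq": 42},  # PCR duplicate
    {"chrom": "chr1", "start": 100, "strand": "+", "umi": "TTAG", "mapq": 60},  # distinct molecule
]
unique_reads = dedup_by_umi(reads)
```

Exact UMI matching is the simplest choice. Production tools often also merge UMIs that differ by one base to absorb sequencing errors, which this sketch does not attempt.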
Reexamining Content-Enriched Access: Its Effect on Usage and Discovery

ERIC Educational Resources Information Center

Tosaka, Yuji; Weng, Cathy

2011-01-01

Content-enriched metadata in bibliographic records is considered helpful to library users in identifying and selecting library materials for their needs. The paper presents a study, using circulation data from a medium-sized academic library, of the effect of content-enriched records on library materials usage. The study also examines OPAC search…
htsint: a Python library for sequencing pipelines that combines data through gene set generation.

PubMed

Richards, Adam J; Herrel, Anthony; Bonneaud, Camille

2015-09-24

Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples.

PubMed

Quick, Joshua; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah C; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno R; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

2017-06-01

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 d by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.
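The multiplex PCR enrichment described above relies on amplicons that tile the viral genome. The sketch below shows only the coordinate layout of such a tiling, with neighbouring amplicons assigned to alternating pools so that overlapping products are not amplified in the same reaction. The amplicon length, the overlap, and the two-pool alternation are illustrative assumptions, and the function does not design primers or reproduce the Primal Scheme algorithm.

```python
def tile_amplicons(genome_len, amplicon_len=400, overlap=75):
    """Return (start, end, pool) tuples tiling the interval [0, genome_len).

    amplicon_len and overlap are HYPOTHETICAL defaults. Consecutive
    amplicons overlap, so they alternate between pool 1 and pool 2.
    """
    if overlap >= amplicon_len:
        raise ValueError("overlap must be smaller than amplicon_len")
    step = amplicon_len - overlap
    amplicons = []
    start = 0
    while True:
        end = min(start + amplicon_len, genome_len)
        pool = len(amplicons) % 2 + 1
        amplicons.append((start, end, pool))
        if end == genome_len:
            break
        start += step
    return amplicons

scheme = tile_amplicons(genome_len=1000)
for start, end, pool in scheme:
    print(f"pool {pool}: {start}-{end}")
```

A real scheme would shift each boundary to the nearest position where a suitable primer can be placed, so amplicon lengths would vary around the target rather than being fixed.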
Transcriptomic Changes of Drought-Tolerant and Sensitive Banana Cultivars Exposed to Drought Stress

PubMed Central

Muthusamy, Muthusamy; Uma, Subbaraya; Backiyarani, Suthanthiram; Saraswathi, Marimuthu Somasundaram; Chandrasekar, Arumugam

2016-01-01

In banana, drought responsive gene expression profiles of drought-tolerant and sensitive genotypes remain largely unexplored. In this research, the transcriptomes of a drought-tolerant banana cultivar (Saba, ABB genome) and a sensitive cultivar (Grand Naine, AAA genome) were monitored using mRNA-Seq under control and drought stress conditions. A total of 162.36 million reads from tolerant and 126.58 million reads from sensitive libraries were produced, mapped onto the Musa acuminata genome sequence, and assembled into 23,096 and 23,079 unigenes, respectively. Differential gene expression analysis between the two conditions (control and drought) identified at least 2268 and 2963 statistically significant, functionally known, non-redundant differentially expressed genes (DEGs) from the tolerant and sensitive libraries, respectively. Drought up-regulated 991 and 1378 DEGs and down-regulated 1104 and 1585 DEGs in the tolerant and sensitive libraries, respectively. Among DEGs, 15.9% code for transcription factors (TFs) comprising 46 families, and 9.5% are protein kinases from 82 families. The most enriched DEGs are mainly involved in protein modifications, lipid metabolism, alkaloid biosynthesis, carbohydrate degradation, glycan metabolism, and biosynthesis of amino acid, cofactor, nucleotide-sugar, hormone, terpenoids and other secondary metabolites. Several specific genotype-dependent gene expression patterns were observed under drought stress in both cultivars. A subset of 9 DEGs was confirmed using quantitative reverse transcription-PCR. These results will provide necessary information for developing drought-resilient banana plants. PMID:27867388
A Rapid Spin Column-Based Method to Enrich Pathogen Transcripts from Eukaryotic Host Cells Prior to Sequencing

DOE PAGES

Bent, Zachary W.; Poorey, Kunal; LaBauve, Annette E.; ...

2016-12-21

When analyzing pathogen transcriptomes during the infection of host cells, the signal-to-background (pathogen-to-host) ratio of nucleic acids (NA) in infected samples is very small. Despite the advancements in next-generation sequencing, the minute amount of pathogen NA makes standard RNA-seq library preps inadequate for effective gene-level analysis of the pathogen in cases with low bacterial loads. In order to provide a more complete picture of the pathogen transcriptome during an infection, we developed a novel pathogen enrichment technique, which can enrich for transcripts from any cultivable bacteria or virus, using common, readily available laboratory equipment and reagents. To evenly enrich for pathogen transcripts, we generate biotinylated pathogen-targeted capture probes in an enzymatic process using the entire genome of the pathogen as a template. The capture probes are hybridized to a strand-specific cDNA library generated from an RNA sample. The biotinylated probes are captured on a monomeric avidin resin in a miniature spin column, and enriched pathogen-specific cDNA is eluted following a series of washes. To test this method, we performed an in vitro time-course infection using Klebsiella pneumoniae to infect murine macrophage cells. K. pneumoniae transcript enrichment efficiency was evaluated using RNA-seq. Bacterial transcripts were enriched up to ~400-fold, and allowed the recovery of transcripts from ~2000–3600 genes not observed in untreated control samples. These additional transcripts revealed interesting aspects of K. pneumoniae biology including the expression of putative virulence factors and the expression of several genes responsible for antibiotic resistance even in the absence of drugs.
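Fold enrichment figures such as the one reported above are commonly computed as the on-target read fraction after capture divided by the fraction before capture. The sketch below illustrates that calculation. The read counts are hypothetical, and the abstract does not state the exact formula the authors used.

```python
def fold_enrichment(target_before, total_before, target_after, total_after):
    """Ratio of the on-target read fraction after capture to that before capture."""
    frac_before = target_before / total_before
    frac_after = target_after / total_after
    return frac_after / frac_before

# HYPOTHETICAL counts: 0.1% pathogen reads before capture, 40% after.
fold = fold_enrichment(1_000, 1_000_000, 400_000, 1_000_000)
print(f"{fold:.0f}-fold enrichment")
```

Because the starting fraction sits in the denominator, samples with very low initial pathogen content can show large fold values even when the final fraction is modest, so both fractions are worth reporting alongside the ratio.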
Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability.

PubMed

Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B

2015-01-01

Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81%, and 93.7% of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4%). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20% of the loci were polymorphic, with polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicating the usefulness of the developed markers. Markers at 40 loci (78.4%) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioica, Momordica cochinchinensis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance.
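The polymorphic information content (PIC) values quoted above summarize how informative a marker is, given its allele frequencies. The sketch below uses the widely cited Botstein et al. (1980) form, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not say which PIC formula the authors applied, so treating it as this one is an assumption.

```python
def pic(allele_freqs):
    """Polymorphic information content in the Botstein et al. (1980) form.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in allele_freqs]
    cross = sum(
        2 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - sum(squares) - cross

# Two equally frequent alleles give the maximum PIC for a biallelic locus.
two_equal = pic([0.5, 0.5])
print(two_equal)
```

A monomorphic locus (a single allele at frequency 1) gives a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.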
Development of Genic and Genomic SSR Markers of Robusta Coffee (Coffea canephora Pierre Ex A. Froehner)

PubMed Central

Hendre, Prasad S.; Aggarwal, Ramesh K.

2014-01-01

Coffee breeding and improvement efforts can be greatly facilitated by the availability of a large repository of simple sequence repeat (SSR) based microsatellite markers, which provide efficiency and high resolution in genetic analyses. This study aimed to improve SSR availability in coffee by developing new genic and genomic SSR markers using an in-silico bioinformatics approach and a streptavidin-biotin based enrichment approach, respectively. The expressed sequence tag (EST) based genic microsatellite markers (EST-SSRs) were developed using the publicly available dataset of 13,175 unigene ESTs, which showed a distribution of 1 SSR/3.4 kb of coffee transcriptome. Genomic SSRs, on the other hand, were developed from an SSR-enriched small-insert partial genomic library of robusta coffee. In total, 69 new SSRs (44 EST-SSRs and 25 genomic SSRs) were developed and validated as suitable genetic markers. Diversity analysis of selected coffee genotypes revealed these to be highly informative in terms of allelic diversity and PIC values, and eighteen of these markers (∼27%) could be mapped on a robusta linkage map. Notably, the markers described here also revealed a very high cross-species transferability. In addition to the validated markers, we have also designed primer pairs for 270 putative EST-SSRs, which are expected to provide another ca. 200 useful genetic markers considering the high success rate (88%) of marker conversion of similar pairs tested/validated in this study. PMID:25461752
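In-silico SSR mining of the kind mentioned above amounts to scanning sequences for short motifs repeated in tandem. The sketch below does this with a regular expression for perfect di- to tetranucleotide repeats. The motif lengths and the minimum repeat count are illustrative thresholds, not the criteria used in the study, and the function name `find_ssrs` is hypothetical.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are ILLUSTRATIVE defaults. The motif group is lazy, so the
    shortest repeating unit is reported (e.g. 'AC' rather than 'ACAC').
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence built from known pieces so the repeat counts are unambiguous.
seq = "TTG" + "AC" * 6 + "GGCT" + "AG" * 7 + "TT"
hits = find_ssrs(seq)
print(hits)
```

Dividing the total sequence length scanned by the number of hits gives a density figure comparable in form to the "1 SSR per 3.4 kb" statistic, although the value depends strongly on the thresholds chosen.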
Digital gene expression analysis of gene expression differences within Brassica diploids and allopolyploids.

PubMed

Jiang, Jinjin; Wang, Yue; Zhu, Bao; Fang, Tingting; Fang, Yujie; Wang, Youping

2015-01-27

Brassica includes many successfully cultivated crop species of polyploid origin, either by ancestral genome triplication or by hybridization between two diploid progenitors, displaying complex repetitive sequences and transposons. U's triangle, which consists of three diploids and three amphidiploids, is optimal for the analysis of complicated genomes after polyploidization. Next-generation sequencing enables the transcriptome profiling of polyploids on a global scale. We examined the gene expression patterns of three diploids (Brassica rapa, B. nigra, and B. oleracea) and three amphidiploids (B. napus, B. juncea, and B. carinata) via digital gene expression analysis. The libraries each generated between 5.7 and 6.1 million raw reads, and the clean tags of each library were mapped to 18,547 to 21,995 genes of the B. rapa genome. The unambiguous tag-mapped genes in the libraries were compared. Moreover, the majority of differentially expressed genes (DEGs) were explored among diploids as well as between diploids and amphidiploids. Gene ontological analysis was performed to functionally categorize these DEGs into different classes. The Kyoto Encyclopedia of Genes and Genomes analysis was performed to assign these DEGs into approximately 120 pathways, among which the metabolic pathway, biosynthesis of secondary metabolites, and peroxisomal pathway were enriched. The non-additive genes in Brassica amphidiploids were analyzed, and the results indicated that orthologous genes in polyploids are frequently expressed in a non-additive pattern. Methyltransferase genes showed differential expression patterns in Brassica species. Our results provided an understanding of the transcriptome complexity of natural Brassica species. The gene expression changes in diploids and allopolyploids may help elucidate the morphological and physiological differences among Brassica species.
Database Resources of the BIG Data Center in 2018

PubMed Central

Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan

2018-01-01

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides free open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542
Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing

PubMed Central

2012-01-01

Background Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions). Results We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent ‘read-backmapping’ to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per HiSeq2000 exome sample and detected ~5% more SNPs than the conventional whole-genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach.
Conclusions We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results. PMID:22913592

Microsatellite DNA library for Caiman latirostris.

PubMed

Zucoloto, Rodrigo Barban; Verdade, Luciano Martins; Coutinho, Luiz Lehmann

2002-12-15

New genetic markers were characterized for the broad-snouted caiman (Caiman latirostris) by constructing libraries enriched for microsatellite DNA. Construction and characterization of these libraries are described in the present study. One microsatellite marker was developed from an (ACC-TGG)n-enriched microsatellite DNA library, and 12 microsatellite markers were developed from an (AC-TG)n-enriched microsatellite DNA library. These markers were tested in wild-caught animals, and these tests resulted in ten new polymorphic microsatellites for C. latirostris. Copyright 2002 Wiley-Liss, Inc.
Serial analysis of gene expression in a rat lung model of asthma.

PubMed

Yin, Lei-Miao; Jiang, Gong-Hao; Wang, Yu; Wang, Yan; Liu, Yan-Yan; Jin, Wei-Rong; Zhang, Zen; Xu, Yu-Dong; Yang, Yong-Qing

2008-11-01

The pathogenesis and molecular mechanism underlying asthma remain undetermined. The purpose of this study was to identify genes and pathways involved in the early airway response (EAR) phase of asthma by using serial analysis of gene expression (SAGE). Two SAGE tag libraries of lung tissues derived from a rat model of asthma and controls were generated. Bioinformatic analyses were carried out using the Database for Annotation, Visualization and Integrated Discovery Functional Annotation Tool, Gene Ontology (GO) TreeMachine and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. A total of 26 552 SAGE tags of asthmatic rat lung were obtained, of which 12 221 were unique tags. Of the unique tags, 55.5% were matched with known genes. By comparison of the two libraries, 186 differentially expressed tags (P < 0.05) were identified, of which 103 were upregulated and 83 were downregulated. Using the bioinformatic tools these genes were classified into 23 functional groups, 15 KEGG pathways and 37 enriched GO categories. The bioinformatic analyses of gene distribution, enriched categories and the involvement of specific pathways in the SAGE libraries have provided information on regulatory networks of the EAR phase of asthma. Analyses of the regulated genes of interest may inform new hypotheses, increase our understanding of the disease and provide a foundation for future research.
Construction of BAC Libraries from Flow-Sorted Chromosomes.

PubMed

Šafář, Jan; Šimková, Hana; Doležel, Jaroslav

2016-01-01

Cloned DNA libraries in bacterial artificial chromosome (BAC) are the most widely used form of large-insert DNA libraries. BAC libraries are typically represented by ordered clones derived from genomic DNA of a particular organism. In the case of large eukaryotic genomes, whole-genome libraries consist of a hundred thousand to a million clones, which make their handling and screening a daunting task. The labor and cost of working with whole-genome libraries can be greatly reduced by constructing a library derived from a smaller part of the genome. Here we describe construction of BAC libraries from mitotic chromosomes purified by flow cytometric sorting. Chromosome-specific BAC libraries facilitate positional gene cloning, physical mapping, and sequencing in complex plant genomes.
Development and experimental validation of a 20K Atlantic cod (Gadus morhua) oligonucleotide microarray based on a collection of over 150,000 ESTs.

PubMed

Booman, Marije; Borza, Tudor; Feng, Charles Y; Hori, Tiago S; Higgins, Brent; Culf, Adrian; Léger, Daniel; Chute, Ian C; Belkaid, Anissa; Rise, Marlies; Gamperl, A Kurt; Hubert, Sophie; Kimball, Jennifer; Ouellette, Rodney J; Johnson, Stewart C; Bowman, Sharen; Rise, Matthew L

2011-08-01

The collapse of Atlantic cod (Gadus morhua) wild populations strongly impacted the Atlantic cod fishery and led to the development of cod aquaculture. In order to improve aquaculture and broodstock quality, we need to gain knowledge of genes and pathways involved in Atlantic cod responses to pathogens and other stressors. The Atlantic Cod Genomics and Broodstock Development Project has generated over 150,000 expressed sequence tags from 42 cDNA libraries representing various tissues, developmental stages, and stimuli. We used this resource to develop an Atlantic cod oligonucleotide microarray containing 20,000 unique probes. Selection of sequences from the full range of cDNA libraries enables application of the microarray for a broad spectrum of Atlantic cod functional genomics studies. We included sequences that were highly abundant in suppression subtractive hybridization (SSH) libraries, which were enriched for transcripts responsive to pathogens or other stressors. These sequences represent genes that potentially play an important role in stress and/or immune responses, making the microarray particularly useful for studies of Atlantic cod gene expression responses to immune stimuli and other stressors. To demonstrate its value, we used the microarray to analyze the Atlantic cod spleen response to stimulation with formalin-killed, atypical Aeromonas salmonicida, resulting in a gene expression profile that indicates a strong innate immune response. These results were further validated by quantitative PCR analysis and comparison to results from previous analysis of an SSH library. This study shows that the Atlantic cod 20K oligonucleotide microarray is a valuable new tool for Atlantic cod functional genomics research.
Environmental genomics of "Haloquadratum walsbyi" in a saltern crystallizer indicates a large pool of accessory genes in an otherwise coherent species

PubMed Central

Legault, Boris A; Lopez-Lopez, Arantxa; Alba-Casado, Jose Carlos; Doolittle, W Ford; Bolhuis, Henk; Rodriguez-Valera, Francisco; Papke, R Thane

2006-01-01

Background Mature saturated brine (crystallizers) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allow comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present. Results The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000 clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORFs that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. Particularly, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage related genes were found within this pool which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions. Conclusion These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities. PMID:16820057
Genome improvement of the acarbose producer Actinoplanes sp. SE50/110 and annotation refinement based on RNA-seq analysis.

PubMed

Wolf, Timo; Schneiker-Bekel, Susanne; Neshat, Armin; Ortseifen, Vera; Wibberg, Daniel; Zemke, Till; Pühler, Alfred; Kalinowski, Jörn

2017-06-10

Actinoplanes sp. SE50/110 is the natural producer of acarbose, which is used in the treatment of diabetes mellitus type II. However, until now the transcriptional organization and regulation of the acarbose biosynthesis are only understood rudimentarily. The genome sequence of Actinoplanes sp. SE50/110 was known before, but was resequenced in this study to remove assembly artifacts and incorrect base callings. The annotation of the genome was refined in a multi-step approach, including modern bioinformatic pipelines, transcriptome and proteome data. A whole transcriptome RNA-seq library as well as an RNA-seq library enriched for primary 5'-ends were used for the detection of transcription start sites, to correct tRNA predictions, to identify novel transcripts like small RNAs and to improve the annotation through the correction of falsely annotated translation start sites. The transcriptome data sets were also applied to identify 31 cis-regulatory RNA structures, such as riboswitches or RNA thermometers as well as three leaderless transcribed short peptides found in putative attenuators upstream of genes for amino acid biosynthesis. The transcriptional organization of the acarbose biosynthetic gene cluster was elucidated in detail and fourteen novel biosynthetic gene clusters were suggested. The accurate genome sequence and precise annotation of the Actinoplanes sp. SE50/110 genome will be the foundation for future genetic engineering and systems biology studies. Copyright © 2017 Elsevier B.V. All rights reserved.
Isolation and characterization of novel microsatellite markers from the sika deer (Cervus nippon) genome.

PubMed

Li, Y M; Bai, C Y; Niu, W P; Yu, H; Yang, R J; Yan, S Q; Zhang, J Y; Zhang, M J; Zhao, Z H

2015-09-28

Microsatellite markers are widely and evenly distributed, and are highly polymorphic. Rapid and convenient detection through automated analysis means that microsatellite markers are widely used in the construction of plant and animal genetic maps, in quantitative trait loci localization, marker-assisted selection, identification of genetic relationships, and genetic diversity and phylogenetic tree construction. However, few microsatellite markers remain to be isolated. We used streptavidin magnetic beads to affinity-capture and construct a (CA)n microsatellite DNA-enriched library from sika deer. We selected sequences containing more than six repeats to design primers. Clear bands were selected, which were amplified using non-specific primers following PCR amplification to screen polymorphisms in a group of 65 unrelated sika deer. The positive clone rate reached 82.9% by constructing the enriched library, and we then selected positive clones for sequencing. There were 395 sequences with CA repeats, and the CA repeat number was 4-105. We selected sequences containing more than six repeats to design primers, of which 297 pairs were designed. We next selected clear bands and used non-specific primers to amplify following PCR amplification. In total, 245 pairs of primers were screened. We then selected 50 pairs of primers to randomly screen for polymorphisms. We detected 47 polymorphic and 3 monomorphic loci in 65 unrelated sika deer. These newly isolated and characterized microsatellite loci can be used to construct genetic maps and for lineage testing in deer. In addition, they can be used for comparative genomics between Cervidae species.
Database Resources of the BIG Data Center in 2018.

PubMed

2018-01-04

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
SearchSmallRNA: a graphical interface tool for the assemblage of viral genomes using small RNA libraries data.

PubMed

de Andrade, Roberto R S; Vaslin, Maite F S

2014-03-07

Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. To bypass the need for command-line use and basic bioinformatics knowledge, we developed mapping software with a graphical interface for the assembly of viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. The program compares each sequenced read in a library with a chosen reference genome. Reads whose Hamming distance to the reference is smaller than or equal to an allowed number of mismatches are selected as positives and used to assemble a long nucleotide genome sequence. To validate the software, separate analyses of NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. SearchSmallRNA was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it will be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/.
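The read-selection rule described in this abstract (keep a read when its Hamming distance to the reference is at most an allowed number of mismatches) can be illustrated with a short Python sketch. This is not SearchSmallRNA's code, which is written in Java and is not reproduced here; the function names, the exhaustive ungapped scan, and the toy sequences are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def map_reads(reads, reference, max_mismatches=1):
    """Place each read at its best ungapped position on the reference.

    Returns (read, position, distance) tuples for reads whose best
    placement has a Hamming distance <= max_mismatches. Ties are
    resolved in favour of the leftmost position (an arbitrary choice).
    """
    hits = []
    for read in reads:
        best = None  # (position, distance) of the best window seen so far
        for pos in range(len(reference) - len(read) + 1):
            d = hamming(read, reference[pos:pos + len(read)])
            if best is None or d < best[1]:
                best = (pos, d)
        if best is not None and best[1] <= max_mismatches:
            hits.append((read, best[0], best[1]))
    return hits


# Hypothetical toy data: a 20-nt "reference" and three 8-nt "reads".
reference = "ACGTTGCAAGGCTTAACCGT"
reads = ["TTGCAAGG", "GGCTAAAC", "CCCCCCCC"]
hits = map_reads(reads, reference, max_mismatches=1)
```

With these toy inputs the first read matches exactly, the second is kept with one mismatch, and the third is rejected. A real mapper would index the reference rather than scan every window, and would also handle the reverse strand.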
SearchSmallRNA: a graphical interface tool for the assemblage of viral genomes using small RNA libraries data

PubMed Central

2014-01-01

Background Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. Methods To bypass the need for command-line use and basic bioinformatics knowledge, we developed mapping software with a graphical interface for the assembly of viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. Results The program compares each sequenced read in a library with a chosen reference genome. Reads whose Hamming distance to the reference is smaller than or equal to an allowed number of mismatches are selected as positives and used to assemble a long nucleotide genome sequence. To validate the software, separate analyses of NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. Conclusions SearchSmallRNA was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it will be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. Availability and implementation SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/. PMID:24607237
Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

PubMed Central

Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

2013-01-01

In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of various kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins and 3.8% to the cell cycle, and only 6.6% were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genome and transcriptome research on Bengal tigers. PMID:23708105
Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

PubMed

Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

2013-05-24

In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of various kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins and 3.8% to the cell cycle, and only 6.6% were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genome and transcriptome research on Bengal tigers.
Successful development of microsatellite markers in a challenging species: the horizontal borer Austroplatypus incompertus (Coleoptera: Curculionidae).

PubMed

Smith, S; Joss, T; Stow, A

2011-10-01

The analysis of microsatellite loci has allowed significant advances in evolutionary biology and pest management. However, until very recently, the potential benefits have been compromised by the high costs of developing these neutral markers. High-throughput sequencing provides a solution to this problem. We describe the development of 13 microsatellite markers for the eusocial ambrosia beetle, Austroplatypus incompertus, a significant pest of forests in southeast Australia. The frequency of microsatellite repeats in the genome of A. incompertus was determined to be low, and previous attempts at microsatellite isolation using a traditional genomic library were problematic. Here, we utilised two protocols, microsatellite-enriched genomic library construction and high-throughput 454 sequencing and characterised 13 loci which were polymorphic among 32 individuals. Numbers of alleles per locus ranged from 2 to 17, and observed and expected heterozygosities from 0.344 to 0.767 and from 0.507 to 0.860, respectively. These microsatellites have the resolution required to analyse fine-scale colony and population genetic structure. Our work demonstrates the utility of next-generation 454 sequencing as a method for rapid and cost-effective acquisition of microsatellites where other techniques have failed, or for taxa where marker development has historically been both complicated and expensive.
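Observed and expected heterozygosities such as those reported above are commonly computed per locus from genotype data: the observed value is the fraction of heterozygous individuals, and the expected value under Hardy-Weinberg proportions is one minus the sum of squared allele frequencies. The Python sketch below illustrates that textbook calculation on hypothetical genotypes; the abstract does not say which estimator or software the authors used, so the function and data are assumptions.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_1, allele_2) pairs, one per individual.
    Expected heterozygosity is 1 - sum(p_i^2) over allele frequencies
    p_i (the simple, biased form with no small-sample correction).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical genotypes (allele sizes in bp) for four individuals.
genotypes = [(150, 150), (150, 154), (154, 158), (150, 158)]
h_obs, h_exp = heterozygosity(genotypes)
```

Here three of four individuals are heterozygous (observed 0.75) and the allele frequencies 0.5, 0.25 and 0.25 give an expected value of 0.625. Studies with small samples often apply Nei's unbiased correction, multiplying the expected value by 2n/(2n - 1).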
Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

PubMed Central

Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

2014-01-01

Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that all of these markers were present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with the 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928
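The coverage figure in this abstract follows from simple arithmetic: genome equivalents equal the number of clones times the average insert size divided by the haploid genome size. The probability of recovering a given unique sequence is often estimated with the Clarke-Carbon formula, P = 1 - (1 - f)^N, where f is the insert size as a fraction of the genome. The Python sketch below shows both calculations. The genome size of about 2.77 Gb is an assumption back-calculated from the reported numbers, and the abstract does not state which formula or inputs produced the reported 98.86%, so the sketch is not expected to reproduce that value.

```python
def genome_equivalents(n_clones, insert_kb, genome_kb):
    """Library coverage in haploid genome equivalents."""
    return n_clones * insert_kb / genome_kb


def clarke_carbon_probability(n_clones, insert_kb, genome_kb):
    """Probability that a given unique sequence is in at least one clone.

    Uses P = 1 - (1 - f)^N with f = insert size / genome size.
    """
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones


# Clone count and insert size are from the abstract; the genome size
# (about 2.77 Gb) is a hypothetical value implied by 6.46-fold coverage.
N_CLONES = 153_600
INSERT_KB = 116.5
GENOME_KB = 2_770_000

coverage = genome_equivalents(N_CLONES, INSERT_KB, GENOME_KB)
p_found = clarke_carbon_probability(N_CLONES, INSERT_KB, GENOME_KB)
```

Under these assumptions the coverage comes out at roughly 6.46 genome equivalents and the Clarke-Carbon probability is above 99%, which illustrates how sensitive such estimates are to the formula and genome size chosen.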
Genomic Characterization of a Novel Phage Found in Black Abalone (Haliotis cracherodii) Infected with Withering Syndrome

NASA Astrophysics Data System (ADS)

Closek, C. J.; Langevin, S.; Burge, C. A.; Crosson, L.; White, S.; Friedman, C. S.

2016-02-01

Withering syndrome (WS), caused by the bacterium Candidatus Xenohaliotis californiensis, a Rickettsia-like organism (RLO), infects many species of abalone. Black abalone (Haliotis cracherodii), one of two endangered species of abalone, has experienced high population losses along the California coast due to WS. Recently, we observed reduced pathogenicity and mortality events in RLO-infected abalone when a novel bacteriophage (phage) was also present. To better understand phage-bacterium dynamics and develop more informative diagnostic tools, we sequenced the genome of the novel phage associated with the RLO responsible for WS. Metagenomic sequencing libraries were prepared with extracted genomic DNA from two experimentally infected H. cracherodii and phage sequences were enriched using hydroxyapatite chromatography normalization. Normalized libraries were individually barcoded and sequenced with Illumina MiSeq. Raw sequence reads were processed using VIrominer and de novo assembly produced one single phage-like contig (35.7 kb) from the experimentally infected abalone. This highly divergent genome had closest homology with a virus associated with abalone shriveling syndrome (SS). Of the 34 predicted ORFs, overlapping homology with the SS virus ranged from 20-72%, demonstrating that the phage sequenced is genetically distinct from any known phage. The phage-like sequences represented a significant portion of the total reads sequenced (approximately 2 million of the 12 million paired-end reads; 17%) and we obtained 94,000X coverage across the novel phage genome. Beyond characterization of this novel phage, which appears to reduce pathogenicity of the RLO, the genome enabled us to develop quantitative PCR and in situ hybridization assays as diagnostic tools. These tools allow us to detect and quantify this phage in the endangered H. cracherodii.
Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species

PubMed Central

Wu, Chengcang; Proestou, Dina; Carter, Dorothy; Nicholson, Erica; Santos, Filippe; Zhao, Shaying; Zhang, Hong-Bin; Goldsmith, Marian R

2009-01-01

Background Manduca sexta, Heliothis virescens, and Heliconius erato represent three widely-used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but not available to date. Results We report the construction and characterization of six large-insert BAC libraries for the three species and sampling sequence analysis of the genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, and each has an average clone insert size ranging from 152–175 kb. We estimated that the genome coverage of each library ranged from 6–9 ×, with the two combined libraries of each species being equivalent to 13.0–16.3 × haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6~8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements with a majority belonging to the category of low complexity repeats, and are more abundant in retro-elements than DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. The BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes is widely varied, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas both genomes of H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences. 
Conclusion The high-quality and large-insert BAC libraries of the insects, together with the identified BACs containing genes of interest, provide valuable information, resources and tools for comprehensive understanding and studies of the insect genomes and for addressing many fundamental questions in Lepidoptera. The sample of the genomic sequences provides the first insight into the constitution and evolution of the insect genomes. PMID:19558662
Characterization of polymorphic microsatellites for Tripterygium (Celastraceae), a monospecific genus of medicinal importance.

PubMed

Novy, Ari; Jones, Kenneth C

2011-10-01

Microsatellite markers were developed for the medicinal plant Tripterygium (Celastraceae) to assess its population structure and to facilitate source tracking of plant materials used for medicinal extracts. Ten microsatellite markers were isolated and characterized in T. wilfordii using an enriched genomic library. The number of alleles per locus ranged from five to 12. Observed and expected heterozygosity ranged from 0.166 to 0.630 and 0.392 to 0.562, respectively. These markers will be useful for a variety of applications including source tracking of plant materials, resolution of taxonomic issues, and population genetics studies.
COBRA-Seq: Sensitive and Quantitative Methylome Profiling

PubMed Central

Varinli, Hilal; Statham, Aaron L.; Clark, Susan J.; Molloy, Peter L.; Ross, Jason P.

2015-01-01

Combined Bisulfite Restriction Analysis (COBRA) quantifies DNA methylation at a specific locus. It does so via digestion of PCR amplicons produced from bisulfite-treated DNA, using a restriction enzyme that contains a cytosine within its recognition sequence, such as TaqI. Here, we introduce COBRA-seq, a genome wide reduced methylome method that requires minimal DNA input (0.1–1.0 μg) and can either use PCR or linear amplification to amplify the sequencing library. Variants of COBRA-seq can be used to explore CpG-depleted as well as CpG-rich regions in vertebrate DNA. The choice of enzyme influences enrichment for specific genomic features, such as CpG-rich promoters and CpG islands, or enrichment for less CpG dense regions such as enhancers. COBRA-seq coupled with linear amplification has the additional advantage of reduced PCR bias by producing full length fragments at high abundance. Unlike other reduced representative methylome methods, COBRA-seq has great flexibility in the choice of enzyme and can be multiplexed and tuned, to reduce sequencing costs and to interrogate different numbers of sites. Moreover, COBRA-seq is applicable to non-model organisms without the reference genome and compatible with the investigation of non-CpG methylation by using restriction enzymes containing CpA, CpT, and CpC in their recognition site. PMID:26512698
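The logic behind COBRA-style readouts can be illustrated in silico: bisulfite treatment converts unmethylated cytosines to thymine after PCR, so a TaqI site (TCGA) survives, or is newly created from CCGA, only where the CpG cytosine was methylated. The sketch below is an assumption-laden toy, not the COBRA-seq pipeline: it handles the top strand only, and the amplicon sequence, function names and methylation positions are hypothetical.

```python
import re

def bisulfite_convert(seq, methylated_positions):
    """Return the top strand after bisulfite conversion and PCR.

    Unmethylated C becomes T; C at a methylated position (0-based) stays C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )

def taqi_sites(seq):
    """Return 0-based start positions of TaqI recognition sites (TCGA)."""
    return [m.start() for m in re.finditer(r"(?=TCGA)", seq)]

# Hypothetical amplicon with one TCGA (C at index 3) and one CCGA (CpG C at index 11).
amplicon = "AATCGATTGGCCGATT"

fully_methylated = bisulfite_convert(amplicon, {3, 11})
unmethylated = bisulfite_convert(amplicon, set())

print(fully_methylated, taqi_sites(fully_methylated))
print(unmethylated, taqi_sites(unmethylated))
```

In the fully methylated case both sites are cuttable (the CCGA becomes TCGA because its first C is converted while the methylated CpG cytosine is retained); in the unmethylated case neither is, which is what makes the fraction of digested product a proxy for methylation.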
Molecular Cloning and Characterization of a Newly Isolated Pyrethroid-Degrading Esterase Gene from a Genomic Library of Ochrobactrum anthropi YZ-1

PubMed Central

Song, Jinlong; Shi, Yanhua; Li, Kang; Zhao, Bin; Yan, Yanchun

2013-01-01

A novel pyrethroid-degrading esterase gene, pytY, was isolated from the genomic library of Ochrobactrum anthropi YZ-1. It possesses an open reading frame (ORF) of 897 bp. A BLAST search showed that its deduced amino acid sequence shares moderate identities (30% to 46%) with most homologous esterases. Phylogenetic analysis revealed that PytY is a member of the esterase VI family. pytY showed very low sequence similarity to reported pyrethroid-degrading genes. PytY was expressed, purified, and characterized. Enzyme assays revealed that PytY is a broad-spectrum degrading enzyme that can degrade various pyrethroids. pytY is therefore a new pyrethroid-degrading gene that enriches the available genetic resources. The kinetic constants Km and Vmax were 2.34 mmol·L−1 and 56.33 nmol min−1, respectively, with lambda-cyhalothrin as substrate. PytY displayed good degrading ability and stability over a broad range of temperature and pH. The optimal temperature and pH were 35°C and 7.5, respectively. No cofactors were required for enzyme activity. The results highlight the potential use of PytY in the elimination of pyrethroid residues from contaminated environments. PMID:24155944
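To show what the reported kinetic constants imply, the sketch below evaluates the standard Michaelis-Menten rate law with Km = 2.34 mmol/L and Vmax = 56.33 nmol/min taken from the abstract. The function name and the substrate concentrations are hypothetical, and Vmax is used in the units reported, with no normalisation to enzyme amount.

```python
def michaelis_menten(substrate_mM, km_mM=2.34, vmax_nmol_min=56.33):
    """Initial rate v = Vmax * [S] / (Km + [S]), in nmol/min."""
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax_nmol_min * substrate_mM / (km_mM + substrate_mM)

# Hypothetical lambda-cyhalothrin concentrations (mmol/L).
for s in (0.5, 2.34, 10.0):
    print(f"[S] = {s:5.2f} mM -> v = {michaelis_menten(s):6.2f} nmol/min")
```

By definition the rate at [S] = Km is half of Vmax, which gives a quick sanity check on any fitted pair of constants.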
Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

PubMed Central

Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

2013-01-01

Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. 
These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870

Enzymatically Generated CRISPR Libraries for Genome Labeling and Screening

PubMed Central

Lane, Andrew B.; Strzelecka, Magdalena; Ettinger, Andreas; Grenfell, Andrew W.; Wittmann, Torsten; Heald, Rebecca

2015-01-01

Summary CRISPR-based technologies have emerged as powerful tools to alter genomes and mark chromosomal loci, but an inexpensive method for generating large numbers of RNA guides for whole genome screening and labeling is lacking. Using a method that permits library construction from any source of DNA, we generated guide libraries that label repetitive loci or a single chromosomal locus in Xenopus egg extracts and show that a complex library can target the E. coli genome at high frequency. PMID:26212133
Pigeonpea genomics initiative (PGI): an international effort to improve crop productivity of pigeonpea (Cajanus cajan L.)

PubMed Central

Penmetsa, R. V.; Dutta, S.; Kulwal, P. L.; Saxena, R. K.; Datta, S.; Sharma, T. R.; Rosen, B.; Carrasquilla-Garcia, N.; Farmer, A. D.; Dubey, A.; Saxena, K. B.; Gao, J.; Fakrudin, B.; Singh, M. N.; Singh, B. P.; Wanjari, K. B.; Yuan, M.; Srivastava, R. K.; Kilian, A.; Upadhyaya, H. D.; Mallikarjuna, N.; Town, C. D.; Bruening, G. E.; He, G.; May, G. D.; McCombie, R.; Jackson, S. A.; Singh, N. K.; Cook, D. R.

2009-01-01

Pigeonpea (Cajanus cajan), an important food legume crop in the semi-arid regions of the world and the second most important pulse crop in India, has an average crop productivity of 780 kg/ha. The relatively low crop yields may be attributed to non-availability of improved cultivars, poor crop husbandry and exposure to a number of biotic and abiotic stresses in pigeonpea growing regions. Narrow genetic diversity in cultivated germplasm has further hampered the effective utilization of conventional breeding as well as development and utilization of genomic tools, resulting in pigeonpea being often referred to as an ‘orphan crop legume’. To enable genomics-assisted breeding in this crop, the pigeonpea genomics initiative (PGI) was initiated in late 2006 with funding from the Indian Council of Agricultural Research under the umbrella of the Indo-US agricultural knowledge initiative, and was further expanded with financial support from the US National Science Foundation’s Plant Genome Research Program and the Generation Challenge Program. As a result of the PGI, the last 3 years have witnessed significant progress in development of both genetic and genomic resources in this crop through effective collaborations and coordination of genomics activities across several institutes and countries. For instance, 25 mapping populations segregating for a number of biotic and abiotic stresses have been developed or are under development. An 11X-genome coverage bacterial artificial chromosome (BAC) library comprising 69,120 clones has been developed, of which 50,000 clones were end sequenced to generate 87,590 BAC-end sequences (BESs). About 10,000 expressed sequence tags (ESTs) from Sanger sequencing and ca. 2 million short ESTs by 454/FLX sequencing have been generated. A variety of molecular markers have been developed from BESs, microsatellite or simple sequence repeat (SSR)-enriched libraries and mining of ESTs and genomic amplicon sequencing. 
Of about 21,000 SSRs identified, 6,698 SSRs are under analysis along with 670 orthologous genes using a GoldenGate SNP (single nucleotide polymorphism) genotyping platform, and large-scale SNP discovery using Solexa, a next-generation sequencing technology, is in progress. Similarly, a diversity array technology array comprising ca. 15,000 features has been developed. In addition, >600 unique nucleotide binding site (NBS) domain-containing members of the NBS-leucine rich repeat disease resistance homologs were cloned in pigeonpea; 960 BACs containing these sequences were identified by filter hybridization, and BES physical maps were developed using high information content fingerprinting. To enrich the genomic resources further, the sequenced soybean genome is being analyzed to establish the anchor points between the pigeonpea and soybean genomes. In addition, Solexa sequencing is being used to explore the feasibility of generating a whole genome sequence. In summary, the collaborative efforts of several research groups under the umbrella of PGI are making significant progress in improving molecular tools in pigeonpea and should significantly benefit pigeonpea genetics and breeding. As these efforts come to fruition, and are expanded (depending on funding), pigeonpea would move from an ‘orphan legume crop’ to one where genomics-assisted breeding approaches for sustainable crop improvement are routine. PMID:20976284
Diazotrophic bacterioplankton in a coral reef lagoon: phylogeny, diel nitrogenase expression and response to phosphate enrichment.

PubMed

Hewson, Ian; Moisander, Pia H; Morrison, Amanda E; Zehr, Jonathan P

2007-05-01

We investigated diazotrophic bacterioplankton assemblage composition in the Heron Reef lagoon (Great Barrier Reef, Australia) using culture-independent techniques targeting the nifH fragment of the nitrogenase gene. Seawater was collected at 3 h intervals over a period of 72 h (i.e. over diel as well as tidal cycles). An incubation experiment was also conducted to assess the impact of phosphate (PO₄³⁻) availability on nifH expression patterns. DNA-based nifH libraries contained primarily sequences that were most similar to nifH from sediment, microbial mat and surface-associated microorganisms, with a few sequences that clustered with typical open ocean phylotypes. In contrast to genomic DNA sequences, libraries prepared from gene transcripts (mRNA amplified by reverse transcription-polymerase chain reaction) were entirely cyanobacterial and contained phylotypes similar to those observed in open ocean plankton. The abundance of Trichodesmium and two uncultured cyanobacterial phylotypes from previous studies (group A and group B) were studied by quantitative-polymerase chain reaction in the lagoon samples. These were detected as transcripts, but were not detected in genomic DNA. The gene transcript abundance of these phylotypes demonstrated variability over several diel cycles. The PO₄³⁻ enrichment experiment had a clearer pattern of gene expression over diel cycles than the lagoon sampling; however, PO₄³⁻ additions did not result in enhanced transcript abundance relative to control incubations. The results suggest that a number of diazotrophs in bacterioplankton of the reef lagoon may originate from sediment, coral or beachrock surfaces, sloughing into plankton with the flooding tide. The presence of typical open ocean phylotype transcripts in lagoon bacterioplankton may indicate that they are an important component of the N cycle of the coral reef.
Preparation of Low-Input and Ligation-Free ChIP-seq Libraries Using Template-Switching Technology.

PubMed

Bolduc, Nathalie; Lehman, Alisa P; Farmer, Andrew

2016-10-10

Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) has become the gold standard for mapping of transcription factors and histone modifications throughout the genome. However, for ChIP experiments involving few cells or targeting low-abundance transcription factors, the small amount of DNA recovered makes ligation of adapters very challenging. In this unit, we describe a ChIP-seq workflow that can be applied to small cell numbers, including a robust single-tube and ligation-free method for preparation of sequencing libraries from sub-nanogram amounts of ChIP DNA. An example ChIP protocol is first presented, resulting in selective enrichment of DNA-binding proteins and cross-linked DNA fragments immobilized on beads via an antibody bridge. This is followed by a protocol for fast and easy cross-linking reversal and DNA recovery. Finally, we describe a fast, ligation-free library preparation protocol, featuring DNA SMART technology, resulting in samples ready for Illumina sequencing. © 2016 by John Wiley & Sons, Inc.
Uprobe: a genome-wide universal probe resource for comparative physical mapping in vertebrates.

PubMed

Kellner, Wendy A; Sullivan, Robert T; Carlson, Brian H; Thomas, James W

2005-01-01

Interspecies comparisons are important for deciphering the functional content and evolution of genomes. The expansive array of >70 public vertebrate genomic bacterial artificial chromosome (BAC) libraries can provide a means of comparative mapping, sequencing, and functional analysis of targeted chromosomal segments that is independent of and complementary to whole-genome sequencing. However, at the present time, no complementary resource exists for the efficient targeted physical mapping of the majority of these BAC libraries. Universal overgo-hybridization probes, designed from regions of sequenced genomes that are highly conserved between species, have been demonstrated to be an effective resource for the isolation of orthologous regions from multiple BAC libraries in parallel. Here we report the application of the universal probe design principle across entire genomes, and the subsequent creation of a complementary probe resource, Uprobe, for screening vertebrate BAC libraries. Uprobe currently consists of whole-genome sets of universal overgo-hybridization probes designed for screening mammalian or avian/reptilian libraries. Retrospective analysis, experimental validation of the probe design process on a panel of representative BAC libraries, and estimates of probe coverage across the genome indicate that the majority of all eutherian and avian/reptilian genes or regions of interest can be isolated using Uprobe. Future implementation of the universal probe design strategy will be used to create an expanded number of whole-genome probe sets that will encompass all vertebrate genomes.
Construction of a Llama Bacterial Artificial Chromosome Library with Approximately 9-Fold Genome Equivalent Coverage

PubMed Central

Airmet, K. W.; Hinckley, J. D.; Tree, L. T.; Moss, M.; Blumell, S.; Ulicny, K.; Gustafson, A. K.; Weed, M.; Theodosis, R.; Lehnardt, M.; Genho, J.; Stevens, M. R.; Kooyman, D. L.

2012-01-01

The llama is an important agricultural livestock species in much of South America. The llama is increasing in popularity in the United States as a companion animal. Little work has been done to improve llama production using modern technology. A paucity of information is available regarding the llama genome. We report the construction of a llama bacterial artificial chromosome (BAC) library of about 196,224 clones in the vector pECBAC1. Using flow cytometry and bovine, human, mouse, and chicken as controls, we determined the llama genome size to be 2.4 × 10⁹ bp. The average insert size of the library is 137.8 kb, corresponding to approximately 9-fold genome coverage. Further studies are needed to characterize the library and the llama genome. We anticipate that this new library will help facilitate future genomic studies in the llama. PMID:22811594
Enzymatically Generated CRISPR Libraries for Genome Labeling and Screening.

PubMed

Lane, Andrew B; Strzelecka, Magdalena; Ettinger, Andreas; Grenfell, Andrew W; Wittmann, Torsten; Heald, Rebecca

2015-08-10

CRISPR-based technologies have emerged as powerful tools to alter genomes and mark chromosomal loci, but an inexpensive method for generating large numbers of RNA guides for whole genome screening and labeling is lacking. Using a method that permits library construction from any source of DNA, we generated guide libraries that label repetitive loci or a single chromosomal locus in Xenopus egg extracts and show that a complex library can target the E. coli genome at high frequency. Copyright © 2015 Elsevier Inc. All rights reserved.
pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

PubMed

Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

2013-08-01

With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Hyb-Seq: combining target enrichment and genome skimming for plant phylogenomics

Treesearch

Kevin Weitemier; Shannon C.K. Straub; Richard C. Cronn; Mark Fishbein; Roswitha Schmickl; Angela McDonnell; Aaron Liston

2014-01-01

• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385...
Polymorphic microsatellite loci for the sand pocket mouse Chaetodipus arenarius, an endemic from the Baja California Peninsula

USGS Publications Warehouse

Munguia-Vega, A.; Rodriguez-Estrella, R.; Nachman, M.; Culver, M.

2009-01-01

Fifteen polymorphic microsatellite loci were isolated from an enriched genomic library of the sand pocket mouse Chaetodipus arenarius. The mean number of alleles per locus was 11.53 (range five to 19) and the average observed heterozygosity was 0.764 (range 0.121 to 1.0). The markers will be used for detecting the impact of human-induced habitat fragmentation on patterns of gene flow, genetic structure, and extinction risk. In addition, these markers will be useful across the genus because most of the loci cross-amplified and were polymorphic in three other species of Chaetodipus. © 2008 The Authors.
Isolation and characterization of microsatellite loci in the intertidal sponge Halichondria panicea

USGS Publications Warehouse

Knowlton, Anne L.; Pierson, Barbara J.; Talbot, S.L.; Highsmith, Ray C.

2003-01-01

GA- and CA-enriched genomic libraries were constructed for the intertidal sponge Halichondria panicea. Unique repeat motifs identified varied from the expected simple dinucleotide repeats to more complex repeat units. All sequences tended to be highly repetitive but did not necessarily contain the targeted motifs. Seven microsatellite loci were evaluated on sponges from the clone source population. All seven were polymorphic with 5.43 ± 0.92 mean number of alleles. Six of the seven loci that could be resolved had mean heterozygosities of 0.14–0.68. The loci identified here will be useful for population studies.
Simulated Screens of DNA Encoded Libraries: The Potential Influence of Chemical Synthesis Fidelity on Interpretation of Structure-Activity Relationships.

PubMed

Satz, Alexander L

2016-07-11

Simulated screening of DNA encoded libraries indicates that the presence of truncated byproducts complicates the relationship between library member enrichment and equilibrium association constant (these truncates result from incomplete chemical reactions during library synthesis). Further, simulations indicate that some patterns observed in reported experimental data may result from the presence of truncated byproducts in the library mixture and not structure-activity relationships. Potential experimental methods of minimizing the presence of truncates are assessed via simulation; the relationship between enrichment and equilibrium association constant for libraries of differing purities is investigated. Data aggregation techniques are demonstrated that allow for more accurate analysis of screening results, in particular when the screened library contains significant quantities of truncates.
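The effect described above can be illustrated with a deliberately simple toy model; it is not the simulation used in the cited work. It assumes single-site equilibrium binding with excess protein, and treats the recovery attributed to one DNA tag as a purity-weighted mix of the intended full-length compound and a single truncated byproduct. All names and parameter values are hypothetical.

```python
def fraction_bound(ka, protein_conc):
    """Single-site equilibrium occupancy: Ka*[P] / (1 + Ka*[P])."""
    x = ka * protein_conc
    return x / (1.0 + x)

def apparent_recovery(ka_full, ka_truncate, purity, protein_conc):
    """Recovery attributed to one DNA tag.

    purity is the fraction of tagged molecules carrying the full-length
    compound; the remainder carries a truncate with its own Ka.
    """
    return (purity * fraction_bound(ka_full, protein_conc)
            + (1.0 - purity) * fraction_bound(ka_truncate, protein_conc))

protein = 1e-6  # mol/L, hypothetical target concentration

# A pure, weak binder (Ka = 1e6 per M) is half bound at 1 uM protein.
pure_weak = apparent_recovery(1e6, 1e6, purity=1.0, protein_conc=protein)
# The same weak binder at 50% purity, contaminated by a potent truncate (Ka = 1e8 per M).
mixed = apparent_recovery(1e6, 1e8, purity=0.5, protein_conc=protein)

print(pure_weak, mixed)
```

In this toy setting the impure tag is recovered more efficiently than the pure one even though the intended compound is identical, which is one way enrichment can decouple from the association constant of the compound a tag nominally encodes.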
Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

PubMed

Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

2016-07-01

The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.
Genome sequencing of methanogenic Archaea Methanosarcina mazei TUC01 strain isolated from an Amazonian Flooded Area

NASA Astrophysics Data System (ADS)

Baraúna, R. A.; Graças, D. A.; Ramos, R. T.; Carneiro, A. R.; Lopes, T. S.; Lima, A. R.; Zahlouth, R. L.; Pellizari, V. H.; Silva, A.

2013-05-01

Methanosarcina mazei is a strictly anaerobic methanogen from the Methanosarcinales order. This species is known for its broad catabolic range among methanogens and is widespread throughout diverse environments. The draft genome of a strain cultivated from the sediment of the Tucuruí hydroelectric power station, the fourth largest hydroelectric dam in the world, is described here. Approximately 80% of methane is produced by biogenic sources, such as methanogenic archaea of the species M. mazei. Although the methanogenesis pathway is well known, some aspects of the core genome, genome evolution and shared genes are still unclear. A sediment sample from the Tucuruí hydropower station reservoir was inoculated in mineral medium supplemented with acetate and methanol. This medium was maintained in an H2:CO2 (80:20) atmosphere to enrich and cultivate M. mazei. The enrichment was conducted at 30°C under standard anaerobic conditions. After several molecular and cellular analyses, total DNA was extracted from a non-pure culture of M. mazei, amplified using phi29 DNA polymerase (BioLabs) and finally used as a source template for genome sequencing. The draft genome was obtained after two rounds of sequencing. First, the genome was sequenced using a SOLiD System V3 with a mate-paired library, which yielded 24,405,103 and 24,399,268 reads (50 bp) for the R3 and F3 tags, respectively. The second round of sequencing was performed using the SOLiD 5500 XL platform with a mate-paired library, resulting in a total of 113,588,848 reads (60 bp) for each tag (F3 and R3). All reads obtained by this procedure were filtered using Quality Assessment software, whereby reads with an average quality score below Phred 20 were removed. Velvet and Edena were used to assemble the reads, and Simplifier was used to remove the redundant sequences. After this, a total of 16,811 contigs were obtained. The M. mazei GO1 (AE008384) genome was used to map the contigs and generate the scaffolds. 
We used the Graphical Contig Analyzer for All Sequencing Platforms software (G4ALL; http://g4all.sourceforge.net/) to manually curate and generate the genome scaffold with gaps. The resultant gaps were manually closed using CLC Genomics Workbench software. M. mazei TUC01 genome contained 3,420,400 bp with a GC content of 42.47% distributed over 3 scaffolds that were annotated by RAST. A total of 2,959 coding DNA sequences (CDS) were predicted. The genome of M. mazei TUC01 (accession number: CP003077) will provide valuable information about the ecology of Methanosarcinales order and more accurate information about the methanogenesis pathway observed in the Neotropics. SPONSOR: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Agência Nacional de Energia Elétrica (ANEEL); Centrais Elétricas do Norte do Brasil (Eletronorte).
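Two of the computations mentioned in this record, discarding reads whose average quality falls below Phred 20 and reporting GC content, can be sketched as below. This is an illustration only, not the Quality Assessment software named in the abstract: it assumes FASTQ-style Phred+33 quality strings (the SOLiD data in the study may have used a different representation), and all function names and example reads are hypothetical.

```python
def mean_phred(quality_string, offset=33):
    """Average Phred score of a read from its ASCII-encoded qualities."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)

def filter_reads(records, min_mean_quality=20):
    """Keep (read_id, sequence, quality) records with mean quality >= threshold."""
    return [rec for rec in records
            if rec[2] and mean_phred(rec[2]) >= min_mean_quality]

def gc_content(sequence):
    """Percentage of G and C bases in a sequence."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)

# Hypothetical reads: 'I' encodes Phred 40, '5' Phred 20, '#' Phred 2.
reads = [
    ("read1", "ACGT", "IIII"),
    ("read2", "ACGT", "####"),
    ("read3", "GGCC", "5555"),
]
kept = filter_reads(reads)
print([rec[0] for rec in kept], gc_content("ATGC"))
```

Reads with a mean score of exactly 20 are retained here, matching the wording that only reads below Phred 20 were removed.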
The librarian's role in an enrichment program for high school students interested in the health professions.

PubMed

Rossini, Beverly; Burnham, Judy; Wright, Andrea

2013-01-01

Librarians from the University of South Alabama Biomedical Library partnered to participate in a program that targets minority students interested in health care with instruction in information literacy. Librarians participate in the summer enrichment programs designed to encourage minority students to enter health care professions by enhancing their preparation. The curriculum developed by the Biomedical Library librarians is focused on developing information searching skills. Students indicated that the library segment helped them in their library research efforts and helped them make more effective use of available resources. Librarians involved report a sense of self-satisfaction as the program allows them to contribute to promoting greater diversity in health care professions. Participating in the summer enrichment program has been beneficial to the students and librarians.
Construction of cDNA expression library of watermelon for isolation of ClWRKY1 transcription factors gene involved in resistance to Fusarium wilt.

PubMed

Yang, Bing-Yan; Huo, Xiu-Ai; Li, Peng-Fei; Wang, Cui-Xia; Duan, Hui-Jun

2014-08-01

Full-length cDNAs are very important for genome annotation and functional analysis of genes. The number of full-length cDNAs from watermelon remains limited. Here we report first the construction of a full-length enriched cDNA library from Fusarium wilt stressed watermelon (Citrullus lanatus Thunb.) cultivar PI296341 root tissues using the SMART method. The titer of primary cDNA library and amplified library was 2.21 × 10⁶ and 2.0 × 10¹⁰ pfu/ml, respectively and the rate of recombinant was above 85%. The size of insert fragment ranged from 0.3 to 2.0 kb. In this study, we first cloned a gene named ClWRKY1, which was 1981 bp long and encoded a protein consisting of 394 amino acids. It contained two characteristic WRKY domains and two zinc finger motifs. Quantitative real-time PCR showed that ClWRKY1 expression levels reached maximum level at 12 h after inoculation with Fusarium oxysporum f. sp. niveum. The full-length cDNA library of watermelon root tissues is not only essential for the cloning of genes which are known, but also an initial key for the screening and cloning of new genes that might be involved in resistance to Fusarium wilt.
Construction of a plant-transformation-competent BIBAC library and genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.)

PubMed Central

2013-01-01

Background Cotton, one of the world’s leading crops, is important to the world’s textile and energy industries, and is a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction of a plant-transformation-competent binary bacterial artificial chromosome (BIBAC) library and comparative genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.) with one of its diploid putative progenitor species, G. raimondii Ulbr. Results We constructed the cotton BIBAC library in a vector competent for high-molecular-weight DNA transformation in different plant species through either Agrobacterium or particle bombardment. The library contains 76,800 clones with an average insert size of 135 kb, providing an approximate 99% probability of obtaining at least one positive clone from the library using a single-copy probe. The quality and utility of the library were verified by identifying BIBACs containing genes important for fiber development, fiber cellulose biosynthesis, seed fatty acid metabolism, cotton-nematode interaction, and bacterial blight resistance. In order to gain an insight into the Upland cotton genome and its relationship with G. raimondii, we sequenced nearly 10,000 BIBAC ends (BESs) randomly selected from the library, generating approximately one BES for every 250 kb along the Upland cotton genome. The retroelement Gypsy/DIRS1 family predominates in the Upland cotton genome, accounting for over 77% of all transposable elements. From the BESs, we identified 1,269 simple sequence repeats (SSRs), of which 1,006 were new, thus providing additional markers for cotton genome research. Surprisingly, comparative sequence analysis showed that Upland cotton is much more diverged from G. raimondii at the genomic sequence level than expected. There seems to be no significant difference between the relationships of the Upland cotton D- and A-subgenomes with the G. 
raimondii genome, even though G. raimondii contains a D genome (D5). Conclusions The library represents the first BIBAC library in cotton and related species, thus providing tools useful for integrative physical mapping, large-scale genome sequencing and large-scale functional analysis of the Upland cotton genome. Comparative sequence analysis provides insights into the Upland cotton genome, and a possible mechanism underlying the divergence and evolution of polyploid Upland cotton from its diploid putative progenitor species, G. raimondii. PMID:23537070
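The relationship between clone number, insert size and the probability of recovering a given single-copy sequence can be sketched with the widely used Clarke-Carbon estimate, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size. The clone count (76,800) and mean insert (135 kb) come from the abstract; the genome size of about 2.5 Gb is an assumption back-calculated from the statement that nearly 10,000 BESs correspond to roughly one BES per 250 kb. Function names are hypothetical, and the abstract does not state which formula the authors used.

```python
def library_coverage(n_clones, mean_insert_bp, genome_bp):
    """Genome equivalents represented by the library."""
    return n_clones * mean_insert_bp / genome_bp

def prob_at_least_one_clone(n_clones, mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N with f = insert / genome."""
    f = mean_insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones

N_CLONES = 76_800          # from the abstract
MEAN_INSERT_BP = 135_000   # from the abstract
GENOME_BP = 2.5e9          # assumption: ~10,000 BESs x 250 kb spacing

coverage = library_coverage(N_CLONES, MEAN_INSERT_BP, GENOME_BP)
probability = prob_at_least_one_clone(N_CLONES, MEAN_INSERT_BP, GENOME_BP)
print(f"coverage ~ {coverage:.2f}x, P(at least one clone) ~ {probability:.3f}")
```

Under these assumptions the library represents a little over four genome equivalents and the probability comes out slightly above 0.98, in the neighbourhood of the approximately 99% quoted; the exact figure is sensitive to the genome size estimate.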
Genetic engineering and improvement of a Zymomonas mobilis for arabinose utilization and its performance on pretreated corn stover hydrolyzate

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chou, Yat-Chen; Linger, Jeffrey; Yang, Shihui

In this paper, a glucose, xylose and arabinose utilizing Zymomonas mobilis strain was constructed by incorporating arabinose catabolic pathway genes, araBAD encoding L-ribulokinase, L-arabinose isomerase and L-ribulose-5-phosphate 4-epimerase, in a glucose, xylose co-fermenting host, 8b, using a transposition integration approach. Further improvement on this arabinose-capable integrant, 33C, was achieved by applying a second transposition to create a genomic knockout (KO) mutant library. Using arabinose as the sole carbon source and selection pressure, the KO library was subjected to a growth-enrichment process involving continuous sub-culturing for over 120 generations. Strain 13-1-17, isolated from this process, demonstrated significant improvement in metabolizing arabinose in a dilute acid pretreated, saccharified corn stover slurry. Through Next Generation Sequencing (NGS) analysis, integration sites of the transposons were identified. Furthermore, multiple additional point mutations (SNPs: Single Nucleotide Polymorphisms) were discovered in 13-1-17, affecting genes araB and RpiB in the genome. Finally, we speculate that these mutations may have impacted the expression of the enzymes coded by these genes, ribulokinase and ribose 5-phosphate isomerase, thus contributing to the improvement of the arabinose utilization.
Genetic engineering and improvement of a Zymomonas mobilis for arabinose utilization and its performance on pretreated corn stover hydrolyzate

DOE PAGES

Chou, Yat-Chen; Linger, Jeffrey; Yang, Shihui; ...

2015-04-28

In this paper, a glucose, xylose and arabinose utilizing Zymomonas mobilis strain was constructed by incorporating arabinose catabolic pathway genes, araBAD encoding L-ribulokinase, L-arabinose isomerase and L-ribulose-5-phosphate 4-epimerase, in a glucose, xylose co-fermenting host, 8b, using a transposition integration approach. Further improvement on this arabinose-capable integrant, 33C, was achieved by applying a second transposition to create a genomic knockout (KO) mutant library. Using arabinose as the sole carbon source and selection pressure, the KO library was subjected to a growth-enrichment process involving continuous sub-culturing for over 120 generations. Strain 13-1-17, isolated from this process, demonstrated significant improvement in metabolizing arabinose in a dilute acid pretreated, saccharified corn stover slurry. Through Next Generation Sequencing (NGS) analysis, integration sites of the transposons were identified. Furthermore, multiple additional point mutations (SNPs: Single Nucleotide Polymorphisms) were discovered in 13-1-17, affecting genes araB and RpiB in the genome. Finally, we speculate that these mutations may have impacted the expression of the enzymes coded by these genes, ribulokinase and ribose 5-phosphate isomerase, thus contributing to the improvement of the arabinose utilization.
A new strategy for genome assembly using short sequence reads and reduced representation libraries.

PubMed

Young, Andrew L; Abaan, Hatice Ozel; Zerbino, Daniel; Mullikin, James C; Birney, Ewan; Margulies, Elliott H

2010-02-01

We have developed a novel approach for using massively parallel short-read sequencing to generate fast and inexpensive de novo genomic assemblies comparable to those generated by capillary-based methods. The ultrashort (<100 base) sequences generated by this technology pose specific biological and computational challenges for de novo assembly of large genomes. To account for this, we devised a method for experimentally partitioning the genome using reduced representation (RR) libraries prior to assembly. We use two restriction enzymes independently to create a series of overlapping fragment libraries, each containing a tractable subset of the genome. Together, these libraries allow us to reassemble the entire genome without the need for a reference sequence. As proof of concept, we applied this approach to sequence and assemble the majority of the 125-Mb Drosophila melanogaster genome. We subsequently demonstrate the accuracy of our assembly method with meaningful comparisons against the currently available D. melanogaster reference genome (dm3). The ease of assembly and accuracy for comparative genomics suggest that our approach will scale to future mammalian genome-sequencing efforts, saving both time and money without sacrificing quality.
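The idea of partitioning a genome with two independent restriction digests can be illustrated in silico. The following is a minimal sketch, not the authors' pipeline: the toy sequence, the choice of EcoRI and BamHI recognition sites, the cut offsets, and the size window are all assumptions made for illustration. It shows that the cut sites of one enzyme fall inside fragments of the other, which is what lets the two libraries overlap.

```python
def digest(sequence: str, site: str, cut_offset: int = 0) -> list[str]:
    """Cut `sequence` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        if cut > start:
            fragments.append(sequence[start:cut])
            start = cut
        pos = sequence.find(site, pos + 1)
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def size_select(fragments: list[str], min_len: int, max_len: int) -> list[str]:
    """Keep fragments whose length falls inside the selected window (inclusive)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence (hypothetical) containing two EcoRI sites and one BamHI site.
toy_genome = "AAGAATTCTTTTGGATCCAAAAGAATTCCC"

# Each enzyme is applied independently, giving two fragment libraries.
ecori_fragments = digest(toy_genome, "GAATTC", cut_offset=1)   # G^AATTC
bamhi_fragments = digest(toy_genome, "GGATCC", cut_offset=1)   # G^GATCC

# A reduced representation library keeps only one size fraction of a digest.
ecori_library = size_select(ecori_fragments, min_len=5, max_len=25)

print([len(f) for f in ecori_fragments])
print([len(f) for f in bamhi_fragments])
print([len(f) for f in ecori_library])
```

In a real experiment the size window and enzymes would be chosen so that each library is small enough to assemble on its own; the values here only demonstrate the mechanics.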

Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli.

PubMed

Glebes, Tirzah Y; Sandoval, Nicholas R; Gillis, Jacob H; Gill, Ryan T

2015-01-01

Engineering both feedstock and product tolerance is important for transitioning towards next-generation biofuels derived from renewable sources. Tolerance to chemical inhibitors typically results in complex phenotypes, for which multiple genetic changes must often be made to confer tolerance. Here, we performed a genome-wide search for furfural-tolerant alleles using the TRackable Multiplex Recombineering (TRMR) method (Warner et al. (2010), Nature Biotechnology), which uses chromosomally integrated mutations directed towards increased or decreased expression of virtually every gene in Escherichia coli. We employed various growth selection strategies to assess the role of selection design towards growth enrichments. We also compared genes with increased fitness from our TRMR selection to those from a previously reported genome-wide identification study of furfural tolerance genes using a plasmid-based genomic library approach (Glebes et al. (2014) PLOS ONE). In several cases, growth improvements were observed for the chromosomally integrated promoter/RBS mutations but not for the plasmid-based overexpression constructs. Through this assessment, four novel tolerance genes, ahpC, yhjH, rna, and dicA, were identified and confirmed for their effect on improving growth in the presence of furfural. © 2014 Wiley Periodicals, Inc.
The genome of the Antarctic-endemic copepod, Tigriopus kingsejongensis.

PubMed

Kang, Seunghyun; Ahn, Do-Hwan; Lee, Jun Hyuck; Lee, Sung Gu; Shin, Seung Chul; Lee, Jungeun; Min, Gi-Sik; Lee, Hyoungseok; Kim, Hyun-Woo; Kim, Sanghee; Park, Hyun

2017-01-01

The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore interesting for research into evolutionary adaptation to extreme environments and the effects of climate change. We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensis. The 37 Gb raw DNA sequence was generated using the Illumina MiSeq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 scaffold length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis-specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures. The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota. © The Author 2017. Published by Oxford University Press.
The genome of the Antarctic-endemic copepod, Tigriopus kingsejongensis

PubMed Central

Kang, Seunghyun; Ahn, Do-Hwan; Lee, Jun Hyuck; Lee, Sung Gu; Shin, Seung Chul; Lee, Jungeun; Min, Gi-Sik; Lee, Hyoungseok

2017-01-01

Abstract Background: The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore interesting for research into evolutionary adaptation to extreme environments and the effects of climate change. Findings: We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensis. The 37 Gb raw DNA sequence was generated using the Illumina MiSeq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 scaffold length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis-specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures. Conclusions: The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota. PMID:28369352
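The N50 values reported for the assembly follow a standard definition: the length L such that sequences of length L or longer contain at least half of the total assembled bases. A minimal sketch of that calculation is given below; the function name and the toy lengths are illustrative and are not taken from the T. kingsejongensis assembly.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy contig lengths (hypothetical), in kb.
toy_contigs = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_contigs)
print(f"N50 of toy contigs: {toy_n50} kb")
```

Because scaffolds join contigs across gaps, the same function applied to scaffold lengths normally yields a larger value than for contigs, as in the figures quoted in the abstract.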
Enriching peptide libraries for binding affinity and specificity through computationally directed library design

PubMed Central

Foight, Glenna Wink; Chen, T. Scott; Richman, Daniel; Keating, Amy E.

2017-01-01

Peptide reagents with high affinity or specificity for their target protein interaction partner are of utility for many important applications. Optimization of peptide binding by screening large libraries is a proven and powerful approach. Libraries designed to be enriched in peptide sequences that are predicted to have desired affinity or specificity characteristics are more likely to yield success than random mutagenesis. We present a library optimization method in which the choice of amino acids to encode at each peptide position can be guided by available experimental data or structure-based predictions. We discuss how to use analysis of predicted library performance to inform rounds of library design. Finally, we include protocols for more complex library design procedures that consider the chemical diversity of the amino acids at each peptide position and optimize a library score based on a user-specified input model. PMID:28236241
Enriching Peptide Libraries for Binding Affinity and Specificity Through Computationally Directed Library Design.

PubMed

Foight, Glenna Wink; Chen, T Scott; Richman, Daniel; Keating, Amy E

2017-01-01

Peptide reagents with high affinity or specificity for their target protein interaction partner are of utility for many important applications. Optimization of peptide binding by screening large libraries is a proven and powerful approach. Libraries designed to be enriched in peptide sequences that are predicted to have desired affinity or specificity characteristics are more likely to yield success than random mutagenesis. We present a library optimization method in which the choice of amino acids to encode at each peptide position can be guided by available experimental data or structure-based predictions. We discuss how to use analysis of predicted library performance to inform rounds of library design. Finally, we include protocols for more complex library design procedures that consider the chemical diversity of the amino acids at each peptide position and optimize a library score based on a user-specified input model.
LOLAweb: a containerized web server for interactive genomic locus overlap enrichment analysis.

PubMed

Nagraj, V P; Magee, Neal E; Sheffield, Nathan C

2018-06-06

The past few years have seen an explosion of interest in understanding the role of regulatory DNA. This interest has driven large-scale production of functional genomics data and analytical methods. One popular analysis is to test for enrichment of overlaps between a query set of genomic regions and a database of region sets. In this way, new genomic data can be easily connected to annotations from external data sources. Here, we present an interactive interface for enrichment analysis of genomic locus overlaps using a web server called LOLAweb. LOLAweb accepts a set of genomic ranges from the user and tests it for enrichment against a database of region sets. LOLAweb renders results in an R Shiny application to provide interactive visualization features, enabling users to filter, sort, and explore enrichment results dynamically. LOLAweb is built and deployed in a Linux container, making it scalable to many concurrent users on our servers and also enabling users to download and run LOLAweb locally.
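The kind of test described above, enrichment of overlaps between a query region set and a database region set, can be sketched with a one-sided hypergeometric (Fisher's exact) tail probability computed against a background "universe" of regions. This is an illustrative sketch under stated assumptions, not LOLAweb's code: the abstract does not name the statistic used, and the region tuples, function names, and toy coordinates below are hypothetical.

```python
from math import comb

Region = tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlaps(a: Region, b: Region) -> bool:
    """True when two regions share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_hits(regions: list[Region], region_set: list[Region]) -> int:
    """Number of `regions` that overlap any member of `region_set`."""
    return sum(any(overlaps(r, s) for s in region_set) for r in regions)


def enrichment_p_value(query: list[Region], universe: list[Region],
                       region_set: list[Region]) -> float:
    """P(at least the observed number of query hits), query drawn from universe."""
    n_universe, n_query = len(universe), len(query)
    k_universe = count_hits(universe, region_set)
    k_query = count_hits(query, region_set)
    upper = min(n_query, k_universe)
    tail = sum(
        comb(k_universe, i) * comb(n_universe - k_universe, n_query - i)
        for i in range(k_query, upper + 1)
    )
    return tail / comb(n_universe, n_query)


# Toy data (hypothetical): ten background regions, three of which form the query.
universe = [("chr1", i * 100, i * 100 + 50) for i in range(10)]
query = universe[:3]
database_set = [("chr1", 0, 260)]  # overlaps exactly the first three regions

p_value = enrichment_p_value(query, universe, database_set)
print(f"one-sided enrichment p-value: {p_value:.5f}")
```

The pairwise overlap scan is quadratic and only suitable for toy inputs; production tools typically use interval trees or sorted sweeps for the overlap step.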
Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool

PubMed Central

2013-01-01

Background System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. Results Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. Conclusions Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr. PMID:23586463
Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool.

PubMed

Chen, Edward Y; Tan, Christopher M; Kou, Yan; Duan, Qiaonan; Wang, Zichen; Meirelles, Gabriela Vaz; Clark, Neil R; Ma'ayan, Avi

2013-04-15

System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.
Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey.

PubMed

Luo, Meizhong; Kim, Hyeran; Kudrna, Dave; Sisneros, Nicholas B; Lee, So-Jeong; Mueller, Christopher; Collura, Kristi; Zuccolo, Andrea; Buckingham, E Bryan; Grim, Suzanne M; Yanagiya, Kazuyo; Inoko, Hidetoshi; Shiina, Takashi; Flajnik, Martin F; Wing, Rod A; Ohta, Yuko

2006-05-03

Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10^10 bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2, that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6-28 primary positive clones per probe, of which 50-90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. We have constructed a deep-coverage, high-quality, large-insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.
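The coverage figures in this record can be checked with simple arithmetic: clones multiplied by mean insert size gives the total cloned DNA, and dividing by the stated fold coverage gives the haploid genome size that the figures imply. The sketch below uses only numbers from the abstract; the function names are hypothetical, and the implied genome size is a back-calculation rather than a value reported in the text.

```python
def library_bp(n_clones: int, mean_insert_bp: int) -> int:
    """Total DNA held in the library, in base pairs."""
    return n_clones * mean_insert_bp


def implied_genome_bp(total_bp: int, fold_coverage: float) -> float:
    """Haploid genome size implied by a stated fold coverage."""
    return total_bp / fold_coverage


# Values taken from the abstract.
total_bp = library_bp(313_344, 144_000)
genome_bp = implied_genome_bp(total_bp, 11)

print(f"total cloned DNA: {total_bp:.3e} bp")
print(f"implied haploid genome size: {genome_bp / 1e9:.2f} Gb")
```

At 11-fold coverage a single-copy probe would be expected to hit on the order of 11 clones, which sits inside the 6-28 primary positives per probe reported in the screening.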
Toward Understanding the Genetic Basis of Yak Ovary Reproduction: A Characterization and Comparative Analyses of Estrus Ovary Transcriptome in Yak and Cattle.

PubMed

Lan, Daoliang; Xiong, Xianrong; Huang, Cai; Mipam, Tserang Donko; Li, Jian

2016-01-01

Yaks (Bos grunniens) are an endemic species that adapts well to thin air, cold temperatures, and high altitude. They can survive in harsh plateau environments and are a major source of animal production for local residents, being an important breed in the Qinghai-Tibet Plateau. However, compared with ordinary cattle that live in the plains, yaks generally have lower fertility. Investigating the basic physiological molecular features of yak ovary and identifying the biological events underlying the differences between the ovaries of yak and plain cattle is necessary to understand the specificity of yak reproduction. Therefore, RNA-seq technology was applied to analyze transcriptome data comparatively between the yak and plain cattle estrous ovaries. After deep sequencing, 3,653,032 clean reads with a total of 4,828,772,880 base pairs were obtained from the yak ovary library. Alignment analysis showed that 16,992 yak genes mapped to the yak genome, among which 12,731 and 14,631 genes were assigned to Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. Furthermore, comparison of yak and cattle ovary transcriptome data revealed that 1,307 genes were significantly differentially expressed between the two libraries, of which 661 genes were upregulated and 646 genes were downregulated in yak ovary. Functional analysis showed that the differentially expressed genes were involved in various GO categories and KEGG pathways. GO annotations indicated that the genes related to "cell adhesion," "hormonal" biological processes, and "calcium ion binding," "cation transmembrane transport" molecular events were significantly active. KEGG pathway analysis showed that the "complement and coagulation cascade" pathway was the most enriched in yak ovary transcriptome data, followed by the "cytochrome P450" related and "ECM-receptor interaction" pathways. 
Moreover, several novel pathways, such as "circadian rhythm," were significantly enriched despite having no evident associations with the reproductive function. Our findings provide a molecular resource for further investigation of the general molecular mechanism of yak ovary and offer new insights to understand comprehensively the specificity of yak reproduction.
Utilizing a library of synthetic affinity ligands for the enrichment, depletion and one-step purification of leech proteins.

PubMed

Dong, Dexian; Gui, Yanli; Chen, Dezhao; Li, Rongxiu

2008-01-01

Although the concept of affinity purification using synthetic ligands has been utilized for many years, there are few articles related to this research area, and they focus only on the affinity purification of specific proteins by a defined library of synthetic ligands. This study presents the design and construction of a 700-member library of synthetic ligands in detail. We selected 297 ligand columns from the 700-member library to screen leech protein extract. Of the 297, 154 columns had an enrichment effect, 83 columns had a depletion effect, 36 columns had a one-step purification effect, and 58 columns had a one-step purification via flowthrough effect. The experimental results achieved by this large library of affinity ligands provide solid, convincing data for the theory that affinity chromatography could be used for the enrichment of proteins that are present in low abundance, the depletion of high-abundance proteins, and one-step purification of special proteins. Copyright 2008 John Wiley & Sons, Ltd.
Analysis of BAC end sequences in oak, a keystone forest tree species, providing insight into the composition of its genome

PubMed Central

2011-01-01

Background One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements, while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes, suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. 
These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
Identification of the maize gravitropism gene lazy plant1 by a transposon-tagging genome resequencing strategy.

PubMed

Howard, Thomas P; Hayward, Andrew P; Tordillos, Anthony; Fragoso, Christopher; Moreno, Maria A; Tohme, Joe; Kausch, Albert P; Mottinger, John P; Dellaporta, Stephen L

2014-01-01

Since their initial discovery, transposons have been widely used as mutagens for forward and reverse genetic screens in a range of organisms. The problems of high copy number and sequence divergence among related transposons have often limited the efficiency with which tagged genes can be identified. A method was developed to identify the locations of Mutator (Mu) transposons in the Zea mays genome using a simple enrichment method combined with genome resequencing to identify transposon junction fragments. The sequencing library was prepared from genomic DNA by digesting with a restriction enzyme that cuts within a perfectly conserved motif of the Mu terminal inverted repeats (TIR). Paired-end reads containing Mu TIR sequences were computationally identified and chromosomal sequences flanking the transposon were mapped to the maize reference genome. This method has been used to identify Mu insertions in a number of alleles and to isolate the previously unidentified lazy plant1 (la1) gene. The la1 gene is required for the negatively gravitropic response of shoots, and mutant plants lack the ability to sense gravity. Using bioinformatic and fluorescence microscopy approaches, we show that the la1 gene encodes a cell membrane and nuclear localized protein. Our Mu-Taq method is readily adaptable to identify the genomic locations of any insertion of a known sequence in any organism using any sequencing platform.
Identification of the Maize Gravitropism Gene lazy plant1 by a Transposon-Tagging Genome Resequencing Strategy

PubMed Central

Howard, Thomas P.; Hayward, Andrew P.; Tordillos, Anthony; Fragoso, Christopher; Moreno, Maria A.; Tohme, Joe; Kausch, Albert P.; Mottinger, John P.; Dellaporta, Stephen L.

2014-01-01

Since their initial discovery, transposons have been widely used as mutagens for forward and reverse genetic screens in a range of organisms. The problems of high copy number and sequence divergence among related transposons have often limited the efficiency with which tagged genes can be identified. A method was developed to identify the locations of Mutator (Mu) transposons in the Zea mays genome using a simple enrichment method combined with genome resequencing to identify transposon junction fragments. The sequencing library was prepared from genomic DNA by digesting with a restriction enzyme that cuts within a perfectly conserved motif of the Mu terminal inverted repeats (TIR). Paired-end reads containing Mu TIR sequences were computationally identified and chromosomal sequences flanking the transposon were mapped to the maize reference genome. This method has been used to identify Mu insertions in a number of alleles and to isolate the previously unidentified lazy plant1 (la1) gene. The la1 gene is required for the negatively gravitropic response of shoots, and mutant plants lack the ability to sense gravity. Using bioinformatic and fluorescence microscopy approaches, we show that the la1 gene encodes a cell membrane and nuclear localized protein. Our Mu-Taq method is readily adaptable to identify the genomic locations of any insertion of a known sequence in any organism using any sequencing platform. PMID:24498020
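The computational step described above, finding reads that contain the transposon end and keeping the adjacent chromosomal sequence for mapping, can be sketched as a simple motif search. This is a minimal illustration and not the authors' Mu-Taq pipeline: the motif `TIR_END` is a made-up placeholder rather than the real Mu TIR sequence, the reads are toy data, and the minimum flank length is an arbitrary assumption.

```python
from typing import Optional

# ASSUMPTION: placeholder motif standing in for the conserved end of the Mu TIR.
TIR_END = "ACGTTGCA"


def flank_after_motif(read: str, motif: str, min_flank: int = 5) -> Optional[str]:
    """Return the sequence following `motif` in `read`, or None if unusable."""
    idx = read.find(motif)
    if idx == -1:
        return None  # read does not contain the transposon end
    flank = read[idx + len(motif):]
    # Very short flanks cannot be placed uniquely on a reference genome.
    return flank if len(flank) >= min_flank else None


# Toy reads (hypothetical): one junction read, one unrelated read, one short flank.
toy_reads = [
    "ACGTTGCATTTGGGCCCAAA",
    "GGGGGGGGGGGG",
    "CCACGTTGCAAT",
]

junction_flanks = [
    flank for flank in (flank_after_motif(r, TIR_END) for r in toy_reads)
    if flank is not None
]
print(junction_flanks)
```

In practice the retained flanks would be passed to a read aligner, and an exact-match search would likely be relaxed to tolerate sequencing errors in the motif.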
Creation of BAC genomic resources for cocoa (Theobroma cacao L.) for physical mapping of RGA containing BAC clones.

PubMed

Clément, D; Lanaud, C; Sabau, X; Fouet, O; Le Cunff, L; Ruiz, E; Risterucci, A M; Glaszmann, J C; Piffanelli, P

2004-05-01

We have constructed and validated the first cocoa (Theobroma cacao L.) BAC library, with the aim of developing molecular resources to study the structure and evolution of the genome of this perennial crop. This library contains 36,864 clones with an average insert size of 120 kb, representing approximately ten haploid genome equivalents. It was constructed from the genotype Scavina-6 (Sca-6), a Forastero clone highly resistant to cocoa pathogens and a parent of existing mapping populations. Validation of the BAC library was carried out with a set of 13 genetically anchored single-copy markers and one duplicated marker. An average of nine BAC clones per probe was identified, giving an initial experimental estimation of the genome coverage represented in the library. Screening of the library with a set of resistance gene analogues (RGAs), previously mapped in cocoa and co-localizing with QTL for resistance to Phytophthora traits, confirmed at the physical level the tight clustering of RGAs in the cocoa genome and provided the first insights into the relationships between genetic and physical distances in the cocoa genome. This library represents an available BAC resource for structural genomic studies or map-based cloning of genes corresponding to important QTLs for agronomic traits such as resistance genes to major cocoa pathogens like Phytophthora spp. (palmivora and megakarya), Crinipellis perniciosa and Moniliophthora roreri.
Creating a RAW264.7 CRISPR-Cas9 Genome Wide Library

PubMed Central

Napier, Brooke A; Monack, Denise M

2017-01-01

The bacterial clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 genome editing tools are used in mammalian cells to knock-out specific genes of interest to elucidate gene function. The CRISPR-Cas9 system requires that the mammalian cell expresses Cas9 endonuclease, guide RNA (gRNA) to lead the endonuclease to the gene of interest, and the PAM sequence that links the Cas9 to the gRNA. CRISPR-Cas9 genome wide libraries are used to screen the effect of each gene in the genome on the cellular phenotype of interest, in an unbiased high-throughput manner. In this protocol, we describe our method of creating a CRISPR-Cas9 genome wide library in a transformed murine macrophage cell-line (RAW264.7). We have employed this library to identify novel mediators in the caspase-11 cell death pathway (Napier et al., 2016); however, this library can then be used to screen the importance of specific genes in multiple murine macrophage cellular pathways. PMID:28868328
Transcription start site associated RNAs (TSSaRNAs) are ubiquitous in all domains of life.

PubMed

Zaramela, Livia S; Vêncio, Ricardo Z N; ten-Caten, Felipe; Baliga, Nitin S; Koide, Tie

2014-01-01

A plethora of non-coding RNAs has been discovered using high-resolution transcriptomics tools, indicating that transcriptional and post-transcriptional regulation is much more complex than previously appreciated. Small RNAs associated with transcription start sites of annotated coding regions (TSSaRNAs) are pervasive in both eukaryotes and bacteria. Here, we provide evidence for existence of TSSaRNAs in several archaeal transcriptomes including: Halobacterium salinarum, Pyrococcus furiosus, Methanococcus maripaludis, and Sulfolobus solfataricus. We validated TSSaRNAs from the model archaeon Halobacterium salinarum NRC-1 by deep sequencing two independent small-RNA enriched (RNA-seq) and a primary-transcript enriched (dRNA-seq) strand-specific libraries. We identified 652 transcripts, of which 179 were shown to be primary transcripts (∼7% of the annotated genome). Distinct growth-associated expression patterns between TSSaRNAs and their cognate genes were observed, indicating a possible role in environmental responses that may result from RNA polymerase with varying pausing rhythms. This work shows that TSSaRNAs are ubiquitous across all domains of life.
Identification of Isopentenol Biosynthetic Genes from Bacillus subtilis by a Screening Method Based on Isoprenoid Precursor Toxicity▿

PubMed Central

Withers, Sydnor T.; Gottlieb, Shayin S.; Lieu, Bonny; Newman, Jack D.; Keasling, Jay D.

2007-01-01

We have developed a novel method to clone terpene synthase genes. This method relies on the inherent toxicity of the prenyl diphosphate precursors to terpenes, which resulted in a reduced-growth phenotype. When these precursors were consumed by a terpene synthase, normal growth was restored. We have demonstrated that this method is capable of enriching a population of engineered Escherichia coli for those clones that express the sesquiterpene-producing amorphadiene synthase. In addition, we enriched a library of genomic DNA from the isoprene-producing bacterium Bacillus subtilis strain 6051 in E. coli engineered to produce elevated levels of isopentenyl diphosphate and dimethylallyl diphosphate. The selection resulted in the discovery of two genes (yhfR and nudF) whose protein products acted directly on the prenyl diphosphate precursors and produced isopentenol. Expression of nudF in E. coli engineered with the mevalonate-based isopentenyl pyrophosphate biosynthetic pathway resulted in the production of isopentenol. PMID:17693564
Construction of a BAC library and mapping BAC clones to the linkage map of Barramundi, Lates calcarifer.

PubMed

Wang, Chun Ming; Lo, Loong Chueng; Feng, Felicia; Gong, Ping; Li, Jian; Zhu, Ze Yuan; Lin, Grace; Yue, Gen Hua

2008-03-25

Barramundi (Lates calcarifer) is an important farmed marine food fish species. Its first generation linkage map has been applied to map QTL for growth traits. To identify genes located in QTL responsible for specific traits, genomic large insert libraries are of crucial importance. We report herein a bacterial artificial chromosome (BAC) library and the mapping of BAC clones to the linkage map. This BAC library consisted of 49,152 clones with an average insert size of 98 kb, representing 6.9-fold haploid genome coverage. Screening the library with 24 microsatellites and 15 ESTs/genes demonstrated that the library had good genome coverage. In addition, 62 novel microsatellites, each isolated from one of 62 BAC clones, were mapped onto the first generation linkage map. A total of 86 BAC clones were anchored on the linkage map with at least one BAC clone on each linkage group. We have constructed the first BAC library for L. calcarifer and mapped 86 BAC clones to the first generation linkage map. This BAC library and the improved linkage map with 302 DNA markers not only supply an indispensable tool for the integration of physical and linkage maps, the fine mapping of QTL and map-based cloning of genes located in QTL of commercial importance, but also contribute to comparative genomic studies and eventually whole genome sequencing.
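The coverage figure in this record follows from simple arithmetic: fold coverage is the total cloned sequence divided by the haploid genome size. The sketch below works backwards from the reported numbers (49,152 clones, 98 kb mean insert, 6.9-fold coverage) to the genome size they imply, and then applies the standard Clarke-Carbon estimate of the chance that any given locus is present in the library. The genome size is not stated in the abstract; the value computed here is an inference, and the function names are hypothetical.

```python
def implied_genome_size_kb(n_clones, mean_insert_kb, fold_coverage):
    """Genome size implied by: coverage = n_clones * insert / genome_size."""
    return n_clones * mean_insert_kb / fold_coverage

def prob_locus_present(n_clones, mean_insert_kb, genome_size_kb):
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N, with f = insert / genome."""
    f = mean_insert_kb / genome_size_kb
    return 1.0 - (1.0 - f) ** n_clones

N_CLONES = 49_152      # from the abstract
INSERT_KB = 98.0       # from the abstract
COVERAGE = 6.9         # from the abstract

genome_kb = implied_genome_size_kb(N_CLONES, INSERT_KB, COVERAGE)
p_present = prob_locus_present(N_CLONES, INSERT_KB, genome_kb)

print(f"implied haploid genome size: {genome_kb / 1000:.0f} Mb")
print(f"probability a given locus is in the library: {p_present:.4f}")
```

Under these assumptions the implied genome size is roughly 700 Mb and the probability of recovering a given locus exceeds 99%, which is consistent with the abstract's statement that screening showed good genome coverage.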
Characterization of Mauritius parakeet (Psittacula eques) microsatellite loci and their cross-utility in other parrots (Psittacidae, Aves).

PubMed

Raisin, Claire; Dawson, Deborah A; Greenwood, Andrew G; Jones, Carl G; Groombridge, Jim J

2009-07-01

We characterized 21 polymorphic microsatellite loci in the endangered Mauritius parakeet (Psittacula eques). Loci were isolated from a Mauritius parakeet genomic library that had been enriched separately for eight different repeat motifs. Loci were characterized in up to 43 putatively unrelated Mauritius parakeets from a single population inhabiting the Black River Gorges National Park, Mauritius. Each locus displayed between three and nine alleles, with the observed heterozygosity ranging between 0.39 and 0.96. All loci were tested in 10 other parrot species. Despite testing few individuals, between seven and 21 loci were polymorphic in each of seven species tested. © 2009 Blackwell Publishing Ltd.

Nonclinical and Clinical Enterococcus faecium Strains, but Not Enterococcus faecalis Strains, Have Distinct Structural and Functional Genomic Features

PubMed Central

Kim, Eun Bae

2014-01-01

Certain strains of Enterococcus faecium and Enterococcus faecalis contribute beneficially to animal health and food production, while others are associated with nosocomial infections. To determine whether there are structural and functional genomic features that are distinct between nonclinical (NC) and clinical (CL) strains of those species, we analyzed the genomes of 31 E. faecium and 38 E. faecalis strains. Hierarchical clustering of 7,017 orthologs found in the E. faecium pangenome revealed that NC strains clustered into two clades and are distinct from CL strains. NC E. faecium genomes are significantly smaller than CL genomes, and this difference was partly explained by significantly fewer mobile genetic elements (ME), virulence factors (VF), and antibiotic resistance (AR) genes. E. faecium ortholog comparisons identified 68 and 153 genes that are enriched for NC and CL strains, respectively. Proximity analysis showed that CL-enriched loci, and not NC-enriched loci, are more frequently colocalized on the genome with ME. In CL genomes, AR genes are also colocalized with ME, and VF are more frequently associated with CL-enriched loci. Genes in 23 functional groups are also differentially enriched between NC and CL E. faecium genomes. In contrast, differences were not observed between NC and CL E. faecalis genomes despite their having larger genomes than E. faecium. Our findings show that unlike E. faecalis, NC and CL E. faecium strains are equipped with distinct structural and functional genomic features indicative of adaptation to different environments. PMID:24141120
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika

2010-01-27

Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.
Construction of a genomic DNA library with a TA vector and its application in cloning of the phytoene synthase gene from the cyanobacterium Spirulina platensis M-135

NASA Astrophysics Data System (ADS)

Yoshikazu, Kawata; Shin-Ichi, Yano; Hiroyuki, Kojima

1998-03-01

An efficient and simple method for constructing a genomic DNA library using a TA cloning vector is presented. It is based on the sonicative cleavage of genomic DNA and modification of fragment ends with Taq DNA polymerase, followed by ligation using a TA vector. This method was applied for cloning of the phytoene synthase gene crtB from Spirulina platensis. This method is useful when genomic DNA cannot be efficiently digested with restriction enzymes, a problem often encountered during the construction of a genomic DNA library of cyanobacteria.
MTTE: an innovative strategy for the evaluation of targeted/exome enrichment efficiency

PubMed Central

Klonowska, Katarzyna; Handschuh, Luiza; Swiercz, Aleksandra; Figlerowicz, Marek; Kozlowski, Piotr

2016-01-01

Although currently available strategies for the preparation of exome-enriched libraries are well established, a final validation of the libraries in terms of exome enrichment efficiency prior to the sequencing step is of considerable importance. Here, we present a strategy for the evaluation of exome enrichment, i.e., the Multipoint Test for Targeted-enrichment Efficiency (MTTE), a PCR-based approach utilizing multiplex ligation-dependent probe amplification with capillary electrophoresis separation. We used MTTE for the analysis of subsequent steps of the Illumina TruSeq Exome Enrichment procedure. The calculated values of enrichment-associated parameters (i.e., relative enrichment, relative clearance, overall clearance, and fold enrichment) and the comparison of MTTE results with the actual enrichment revealed the high reliability of our assay. Additionally, the MTTE assay enabled the determination of the sequence-associated features that may confer bias in the enrichment of different targets. Importantly, the MTTE is a low-cost method that can be easily adapted to the region of interest important for a particular project. Thus, the MTTE strategy is attractive for post-capture validation in a variety of targeted/exome enrichment NGS projects. PMID:27572310
MTTE: an innovative strategy for the evaluation of targeted/exome enrichment efficiency.

PubMed

Klonowska, Katarzyna; Handschuh, Luiza; Swiercz, Aleksandra; Figlerowicz, Marek; Kozlowski, Piotr

2016-10-11

Although currently available strategies for the preparation of exome-enriched libraries are well established, a final validation of the libraries in terms of exome enrichment efficiency prior to the sequencing step is of considerable importance. Here, we present a strategy for the evaluation of exome enrichment, i.e., the Multipoint Test for Targeted-enrichment Efficiency (MTTE), a PCR-based approach utilizing multiplex ligation-dependent probe amplification with capillary electrophoresis separation. We used MTTE for the analysis of subsequent steps of the Illumina TruSeq Exome Enrichment procedure. The calculated values of enrichment-associated parameters (i.e., relative enrichment, relative clearance, overall clearance, and fold enrichment) and the comparison of MTTE results with the actual enrichment revealed the high reliability of our assay. Additionally, the MTTE assay enabled the determination of the sequence-associated features that may confer bias in the enrichment of different targets. Importantly, the MTTE is a low-cost method that can be easily adapted to the region of interest important for a particular project. Thus, the MTTE strategy is attractive for post-capture validation in a variety of targeted/exome enrichment NGS projects.
Primer-Free Aptamer Selection Using A Random DNA Library

PubMed Central

Pan, Weihua; Xin, Ping; Patrick, Susan; Dean, Stacey; Keating, Christine; Clawson, Gary

2010-01-01

Aptamers are highly structured oligonucleotides (DNA or RNA) that can bind to targets with affinities comparable to antibodies 1. They are identified through an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX) to recognize a wide variety of targets, from small molecules to proteins and other macromolecules 2-4. Aptamers have properties that are well suited for in vivo diagnostic and/or therapeutic applications: Besides good specificity and affinity, they are easily synthesized, survive more rigorous processing conditions, they are poorly immunogenic, and their relatively small size can result in facile penetration of tissues. Aptamers that are identified through the standard SELEX process usually comprise ~80 nucleotides (nt), since they are typically selected from nucleic acid libraries with ~40 nt long randomized regions plus fixed primer sites of ~20 nt on each side. The fixed primer sequences thus can comprise nearly ~50% of the library sequences, and therefore may positively or negatively compromise identification of aptamers in the selection process 3, although bioinformatics approaches suggest that the fixed sequences do not contribute significantly to aptamer structure after selection 5. To address these potential problems, primer sequences have been blocked by complementary oligonucleotides or switched to different sequences midway during the rounds of SELEX 6, or they have been trimmed to 6-9 nt 7, 8. Wen and Gray 9 designed a primer-free genomic SELEX method, in which the primer sequences were completely removed from the library before selection and were then regenerated to allow amplification of the selected genomic fragments. However, to employ the technique, a unique genomic library has to be constructed, which possesses limited diversity, and regeneration after rounds of selection relies on a linear reamplification step. 
Alternatively, efforts to circumvent problems caused by fixed primer sequences using high efficiency partitioning are met with problems regarding PCR amplification 10. We have developed a primer-free (PF) selection method that significantly simplifies SELEX procedures and effectively eliminates primer-interference problems 11, 12. The protocols work in a straightforward manner. The central random region of the library is purified without extraneous flanking sequences and is bound to a suitable target (for example to a purified protein or complex mixtures such as cell lines). Then the bound sequences are obtained, reunited with flanking sequences, and re-amplified to generate selected sub-libraries. As an example, here we selected aptamers to S100B, a protein marker for melanoma. Binding assays showed Kd values in the 10⁻⁷ to 10⁻⁸ M range after a few rounds of selection, and we demonstrate that the aptamers function effectively in a sandwich binding format. PMID:20689511
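The library dimensions quoted in this record can be checked with a few lines of arithmetic. The sketch below uses only the approximate figures from the abstract (a ~40 nt randomized region flanked by ~20 nt primer sites on each side); the variable names are illustrative.

```python
RANDOM_NT = 40   # randomized region length, from the abstract (approximate)
PRIMER_NT = 20   # fixed primer site on each side, from the abstract (approximate)

full_length = RANDOM_NT + 2 * PRIMER_NT          # total oligonucleotide length
primer_fraction = 2 * PRIMER_NT / full_length    # share of each molecule that is fixed
sequence_space = 4 ** RANDOM_NT                  # distinct possible random regions

print(f"full length: {full_length} nt")
print(f"fixed primer fraction: {primer_fraction:.0%}")
print(f"theoretical sequence space: {sequence_space:.2e}")
```

The fixed primer sites account for half of each 80 nt molecule, which is the basis for the concern that they could interfere with selection. The theoretical sequence space of 4^40 (about 1.2 × 10^24) far exceeds what any physical library can sample, so a real library covers only a small fraction of it.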
Selective recognition of N4-methylcytosine in DNA by engineered transcription-activator-like effectors.

PubMed

Rathi, Preeti; Maurer, Sara; Summerer, Daniel

2018-06-05

The epigenetic DNA nucleobases 5-methylcytosine (5mC) and N4-methylcytosine (4mC) coexist in bacterial genomes and have important functions in host defence and transcription regulation. To better understand the individual biological roles of both methylated nucleobases, analytical strategies for distinguishing unmodified cytosine (C) from 4mC and 5mC are required. Transcription-activator-like effectors (TALEs) are programmable DNA-binding repeat proteins, which can be re-engineered for the direct detection of epigenetic nucleobases in user-defined DNA sequences. We report here that the natural, cytosine-binding TALE repeat does not strongly differentiate between 5mC and 4mC. To engineer repeats with selectivity in the context of C, 5mC and 4mC, we developed a homogeneous fluorescence assay and screened a library of size-reduced TALE repeats for binding to all three nucleobases. This provided insights into the requirements of size-reduced TALE repeats for 4mC binding and revealed a single mutant repeat as a selective binder of 4mC. Employment of a TALE with this repeat in affinity enrichment enabled the isolation of a user-defined DNA sequence containing a single 4mC but not C or 5mC from the background of a bacterial genome. Comparative enrichments with TALEs bearing this or the natural C-binding repeat provide an approach for the complete, programmable decoding of all cytosine nucleobases found in bacterial genomes. This article is part of a discussion meeting issue 'Frontiers in epigenetic chemical biology'. © 2018 The Author(s).
SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption.

PubMed

Ho, Michelle L; Adler, Benjamin A; Torre, Michael L; Silberg, Jonathan J; Suh, Junghae

2013-12-20

Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions.
SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption

PubMed Central

Ho, Michelle L.; Adler, Benjamin A.; Torre, Michael L.; Silberg, Jonathan J.; Suh, Junghae

2013-01-01

Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effectively transduce cells. Such improvements to the virus design process may help advance not only gene therapy applications, but also other bionanotechnologies dependent upon the development of viruses with new sequences and functions. PMID:23899192
Adaptation of a commercial robot for genome library replication

DOE Office of Scientific and Technical Information (OSTI.GOV)

Uber, D.C.; Searles, W.L.

1994-01-01

This report describes tools and fixtures developed at the Human Genome Center at Lawrence Berkeley Laboratory for the Hewlett-Packard ORCA™ (Optimized Robot for Chemical Analysis) to replicate large genome libraries. Photographs and engineering drawings of the various custom-designed components are included.
Recently evolved human-specific methylated regions are enriched in schizophrenia signals.

PubMed

Banerjee, Niladri; Polushina, Tatiana; Bettella, Francesco; Giddaluru, Sudheer; Steen, Vidar M; Andreassen, Ole A; Le Hellard, Stephanie

2018-05-11

One explanation for the persistence of schizophrenia despite the reduced fertility of patients is that it is a by-product of recent human evolution. This hypothesis is supported by evidence suggesting that recently-evolved genomic regions in humans are involved in the genetic risk for schizophrenia. Using summary statistics from genome-wide association studies (GWAS) of schizophrenia and 11 other phenotypes, we tested for enrichment of association with GWAS traits in regions that have undergone methylation changes in the human lineage compared to Neanderthals and Denisovans, i.e. human-specific differentially methylated regions (DMRs). We used analytical tools that evaluate polygenic enrichment of a subset of genomic variants against all variants. Schizophrenia was the only trait in which DMR SNPs showed clear enrichment of association that passed the genome-wide significance threshold. The enrichment was not observed for Neanderthal or Denisovan DMRs. The enrichment seen in human DMRs is comparable to that for genomic regions tagged by Neanderthal Selective Sweep markers, and stronger than that for Human Accelerated Regions. The enrichment survives multiple testing performed through permutation (n = 10,000) and bootstrapping (n = 5000) in INRICH (p < 0.01). Some enrichment of association with height was observed at the gene level. Regions where DNA methylation modifications have changed during recent human evolution show enrichment of association with schizophrenia and possibly with height. Our study further supports the hypothesis that genetic variants conferring risk of schizophrenia co-occur in genomic regions that have changed as the human species evolved. Since methylation is an epigenetic mark, potentially mediated by environmental changes, our results also suggest that interaction with the environment might have contributed to that association.
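The idea of testing a subset of variants against all variants by permutation can be sketched generically. The code below is not INRICH's algorithm, which permutes genomic intervals while matching on their properties; it is a minimal illustration that draws random variant subsets of the same size and asks how often their mean association score reaches the observed one. All names and the toy data are hypothetical.

```python
import random

def permutation_p_value(scores, subset_idx, n_perm=999, seed=0):
    """Empirical p-value for the mean score of a subset versus random subsets.

    scores: per-variant association scores (higher = stronger association).
    subset_idx: indices of the variants in the region set being tested.
    Uses the (hits + 1) / (n_perm + 1) estimator so p is never exactly zero.
    """
    rng = random.Random(seed)
    k = len(subset_idx)
    observed = sum(scores[i] for i in subset_idx) / k
    hits = 0
    for _ in range(n_perm):
        drawn = rng.sample(range(len(scores)), k)
        if sum(scores[i] for i in drawn) / k >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: 1000 variants with scores 0..999; the tested subset is the top 50.
toy_scores = list(range(1000))
top_subset = list(range(950, 1000))
print(permutation_p_value(toy_scores, top_subset))
```

With 999 permutations the smallest attainable p-value is 0.001, which is why studies that report thresholds such as p < 0.01 typically use several thousand permutations.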
Flow Sorting of Marine Bacterioplankton after Fluorescence In Situ Hybridization

PubMed Central

Sekar, Raju; Fuchs, Bernhard M.; Amann, Rudolf; Pernthaler, Jakob

2004-01-01

We describe an approach to sort cells from coastal North Sea bacterioplankton by flow cytometry after in situ hybridization with rRNA-targeted horseradish peroxidase-labeled oligonucleotide probes and catalyzed fluorescent reporter deposition (CARD-FISH). In a sample from spring 2003 >90% of the cells were detected by CARD-FISH with a bacterial probe (EUB338). Approximately 30% of the microbial assemblage was affiliated with the Cytophaga-Flavobacterium lineage of the Bacteroidetes (CFB group) (probe CF319a), and almost 10% was targeted by a probe for the β-proteobacteria (probe BET42a). A protocol was optimized to detach cells hybridized with EUB338, BET42a, and CF319a from membrane filters (recovery rate, 70%) and to sort the cells by flow cytometry. The purity of sorted cells was >95%. 16S rRNA gene clone libraries were constructed from hybridized and sorted cells (S-EUB, S-BET, and S-CF libraries) and from unhybridized and unsorted cells (UNHYB library). Sequences related to the CFB group were significantly more frequent in the S-CF library (66%) than in the UNHYB library (13%). No enrichment of β-proteobacterial sequence types was found in the S-BET library, but novel sequences related to Nitrosospira were found exclusively in this library. These bacteria, together with members of marine clade OM43, represented >90% of the β-proteobacteria in the water sample, as determined by CARD-FISH with specific probes. This illustrates that a combination of CARD-FISH and flow sorting might be a powerful approach to study the diversity and potentially the activity and the genomes of different bacterial populations in aquatic habitats. PMID:15466568
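The enrichment reported for the sorted library can be expressed as a simple fold change between the fraction of target sequences after and before sorting. The sketch below applies that ratio to the two percentages given in the abstract (66% in the S-CF library, 13% in the UNHYB library); the function name is hypothetical, and the abstract itself does not report a fold value.

```python
def fold_enrichment(fraction_after, fraction_before):
    """Ratio of target fraction after enrichment to the fraction before it."""
    if fraction_before <= 0:
        raise ValueError("fraction_before must be positive")
    return fraction_after / fraction_before

# CFB-related sequences: 66% of the S-CF library versus 13% of the UNHYB library.
cfb_fold = fold_enrichment(0.66, 0.13)
print(f"CFB enrichment after sorting: {cfb_fold:.1f}-fold")
```

A fold change alone does not show significance; that depends on the number of clones sequenced in each library, which the abstract does not give.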
GEAR: genomic enrichment analysis of regional DNA copy number changes.

PubMed

Kim, Tae-Min; Jung, Yu-Chae; Rhyu, Mun-Gan; Jung, Myeong Ho; Chung, Yeun-Jun

2008-02-01

We developed an algorithm named GEAR (genomic enrichment analysis of regional DNA copy number changes) for functional interpretation of genome-wide DNA copy number changes identified by array-based comparative genomic hybridization. GEAR selects two types of chromosomal alterations with potential biological relevance, i.e. recurrent and phenotype-specific alterations. Then it performs functional enrichment analysis using a priori selected functional gene sets to identify primary and clinical genomic signatures. The genomic signatures identified by GEAR represent functionally coordinated genomic changes, which can provide clues on the underlying molecular mechanisms related to the phenotypes of interest. GEAR can help identify key molecular functions that are activated or repressed in tumor genomes, leading to an improved understanding of tumor biology. GEAR software is available with an online manual at http://www.systemsbiology.co.kr/GEAR/.
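Functional enrichment of a gene set within a list of altered genes is often scored with a one-sided hypergeometric test. The abstract does not say which statistic GEAR uses, so the sketch below is a common choice shown for illustration rather than GEAR's method; the function name and toy numbers are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, set_size, n_altered, n_genes):
    """P(X >= k): chance of seeing at least k gene-set members among
    n_altered genes drawn without replacement from n_genes, of which
    set_size belong to the functional gene set."""
    total = comb(n_genes, n_altered)
    upper = min(set_size, n_altered)
    favourable = sum(
        comb(set_size, i) * comb(n_genes - set_size, n_altered - i)
        for i in range(k, upper + 1)
    )
    return favourable / total

# Toy example: 20 genes on the array, 5 in the gene set,
# 5 genes altered, 3 of which are in the gene set.
p = hypergeom_upper_tail(k=3, set_size=5, n_altered=5, n_genes=20)
print(f"enrichment p-value: {p:.4f}")
```

When many gene sets are tested, the resulting p-values need a multiple-testing correction before any set is called enriched.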
Microfluidic droplet enrichment for targeted sequencing

PubMed Central

Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.

2015-01-01

Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629
Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes.

PubMed

Oyola, Samuel O; Otto, Thomas D; Gu, Yong; Maslen, Gareth; Manske, Magnus; Campino, Susana; Turner, Daniel J; Macinnis, Bronwyn; Kwiatkowski, Dominic P; Swerdlow, Harold P; Quail, Michael A

2012-01-03

Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences. We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions, which involve a PCR additive (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC neutral templates. We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extreme of base composition. 
This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material.
The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads.

PubMed

Wang, Zhiwen; Hobson, Neil; Galindo, Leonardo; Zhu, Shilin; Shi, Daihu; McDill, Joshua; Yang, Linfeng; Hawkins, Simon; Neutelings, Godfrey; Datla, Raju; Lambert, Georgina; Galbraith, David W; Grassa, Christopher J; Geraldes, Armando; Cronk, Quentin C; Cullis, Christopher; Dash, Prasanta K; Kumar, Polumetla A; Cloutier, Sylvie; Sharpe, Andrew G; Wong, Gane K-S; Wang, Jun; Deyholos, Michael K

2012-11-01

Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43,384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
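The N50 statistic quoted for the scaffolds and contigs has a short, standard definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. The sketch below implements that definition on a toy list of lengths and also derives the genome size implied by the abstract's figures (302 Mb representing 81% coverage). The function name and toy lengths are illustrative, and the implied genome size is an inference not stated in the abstract.

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least
    half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")

# Toy contig lengths (kb); total is 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print("toy N50:", n50(toy_contigs), "kb")

# Genome size implied by 302 Mb covering an estimated 81% of the genome.
implied_genome_mb = 302 / 0.81
print(f"implied flax genome size: {implied_genome_mb:.0f} Mb")
```

N50 describes contiguity only; it says nothing about correctness, which is why the abstract separately reports comparisons against independently sequenced BACs and fosmids.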
Mining the metagenome of activated biomass of an industrial wastewater treatment plant by a novel method.

PubMed

Sharma, Nandita; Tanksale, Himgouri; Kapley, Atya; Purohit, Hemant J

2012-12-01

Metagenomic libraries herald the era of magnifying the microbial world, tapping into the vast metabolic potential of uncultivated microbes, and enhancing the rate of discovery of novel genes and pathways. In this paper, we describe a method that facilitates the extraction of metagenomic DNA from activated sludge of an industrial wastewater treatment plant and its use in mining the metagenome via library construction. The efficiency of this method was demonstrated by the large representation of the bacterial genome in the constructed metagenomic libraries and by the functional clones obtained. The BAC library represented 95.6 times the bacterial genome, while the pUC library represented 41.7 times the bacterial genome. Twelve clones in the BAC library demonstrated lipolytic activity, while four clones demonstrated dioxygenase activity. Four clones in the pUC library tested positive for cellulase activity. This method, using FTA cards, not only can be used for library construction, but can also store the metagenome at room temperature.
GenomeD3Plot: a library for rich, interactive visualizations of genomic data in web applications.

PubMed

Laird, Matthew R; Langille, Morgan G I; Brinkman, Fiona S L

2015-10-15

A simple static image of genomes and associated metadata is very limiting, as researchers expect rich, interactive tools similar to the web applications found in the post-Web 2.0 world. GenomeD3Plot is a lightweight visualization library written in JavaScript using the D3 library. GenomeD3Plot provides a rich API to allow the rapid visualization of complex genomic data using a convenient standards-based JSON configuration file. When integrated into existing web services, GenomeD3Plot allows researchers to interact with data, dynamically alter the view, or even resize or reposition the visualization in their browser window. In addition, GenomeD3Plot has built-in functionality to export any resulting genome visualization in PNG or SVG format for easy inclusion in manuscripts or presentations. GenomeD3Plot is being utilized in the recently released IslandViewer 3 (www.pathogenomics.sfu.ca/islandviewer/) to visualize predicted genomic islands with other genome annotation data. However, its features enable it to be more widely applicable for dynamic visualization of genomic data in general. GenomeD3Plot is licensed under the GNU-GPL v3 at https://github.com/brinkmanlab/GenomeD3Plot/. Contact: brinkman@sfu.ca. © The Author 2015. Published by Oxford University Press.
PinAPL-Py: A comprehensive web-application for the analysis of CRISPR/Cas9 screens.

PubMed

Spahn, Philipp N; Bath, Tyler; Weiss, Ryan J; Kim, Jihoon; Esko, Jeffrey D; Lewis, Nathan E; Harismendy, Olivier

2017-11-20

Large-scale genetic screens using CRISPR/Cas9 technology have emerged as a major tool for functional genomics. With its increased popularity, experimental biologists frequently acquire large sequencing datasets for which they often do not have an easy analysis option. While a few bioinformatic tools have been developed for this purpose, their utility is still hindered either due to limited functionality or the requirement of bioinformatic expertise. To make sequencing data analysis of CRISPR/Cas9 screens more accessible to a wide range of scientists, we developed a Platform-independent Analysis of Pooled Screens using Python (PinAPL-Py), which is operated as an intuitive web-service. PinAPL-Py implements state-of-the-art tools and statistical models, assembled in a comprehensive workflow covering sequence quality control, automated sgRNA sequence extraction, alignment, sgRNA enrichment/depletion analysis and gene ranking. The workflow is set up to use a variety of popular sgRNA libraries as well as custom libraries that can be easily uploaded. Various analysis options are offered, suitable to analyze a large variety of CRISPR/Cas9 screening experiments. Analysis output includes ranked lists of sgRNAs and genes, and publication-ready plots. PinAPL-Py helps to advance genome-wide screening efforts by combining comprehensive functionality with user-friendly implementation. PinAPL-Py is freely accessible at http://pinapl-py.ucsd.edu with instructions and test datasets.
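The sgRNA enrichment/depletion step mentioned in this abstract can be illustrated with a simple normalization and log fold-change calculation. This is only a sketch under stated assumptions: the counts-per-million scaling, the pseudocount and the toy sgRNA names are illustrative choices, and this is not the statistical model that PinAPL-Py itself implements.

```python
import math


def cpm(counts):
    """Scale raw read counts to counts per million reads."""
    total = sum(counts.values())
    return {guide: c * 1e6 / total for guide, c in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per sgRNA after CPM normalization.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for guides that drop out of one sample.
    """
    c_norm, t_norm = cpm(control), cpm(treated)
    return {
        guide: math.log2(
            (t_norm.get(guide, 0.0) + pseudocount) / (c_norm[guide] + pseudocount)
        )
        for guide in c_norm
    }


# Hypothetical read counts for four guides targeting two genes.
control = {"geneA_sg1": 500, "geneA_sg2": 480, "geneB_sg1": 510, "geneB_sg2": 510}
treated = {"geneA_sg1": 900, "geneA_sg2": 850, "geneB_sg1": 120, "geneB_sg2": 130}

for guide, lfc in sorted(log2_fold_change(control, treated).items()):
    print(f"{guide}\t{lfc:+.2f}")
```

Positive values indicate enriched guides and negative values depleted ones; a real analysis would add replicate handling and a significance test before ranking genes.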
A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

PubMed

Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

2016-11-21

Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop widely grown in the world. However, little is known about the genome of this species because it is a highly heterozygous hexaploid. Gaining a more in-depth knowledge of the sweetpotato genome is therefore necessary. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had a 7.93-10.82× coverage of the genome, and the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with an accumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome was known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% non-LTR retrotransposons and 1.42% Class II DNA transposons; 18.31% of the genome was identified as sweetpotato-unique repetitive DNA, and 10.00% of the genome was predicted to be coding regions. In total, 3,846 simple sequence repeats (SSRs) were identified, with a density of one SSR per 1.93 kb, from which 288 SSR primers were designed and tested for length polymorphism using 20 sweetpotato accessions; 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed. The high-quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum. These resources provide a robust platform for high-resolution mapping, gene cloning, assembly of genome sequences, comparative genomics and evolutionary studies in sweetpotato.
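The library statistics in this abstract can be checked with the standard Clarke-Carbon relations: coverage is N × I / G, and the probability of recovering a given single-copy sequence is P = 1 − (1 − I/G)^N, where N is the number of clones, I the average insert size and G the genome size. The abstract does not state the genome sizes it assumed; the two values below are back-calculated from the reported 7.93-10.82× range and should be read as assumptions of this sketch.

```python
def library_coverage(n_clones, insert_kb, genome_kb):
    """Genome equivalents represented by the library: N * I / G."""
    return n_clones * insert_kb / genome_kb


def prob_recover(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon probability that a given single-copy sequence
    is present in at least one clone: P = 1 - (1 - I/G)^N."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


N_CLONES = 240_384   # from the abstract
INSERT_KB = 101      # from the abstract

# Assumed genome sizes in kb (about 2.24 Gb and 3.06 Gb), inferred from
# the reported coverage range; they are not given in the abstract.
for genome_kb in (2.24e6, 3.06e6):
    cov = library_coverage(N_CLONES, INSERT_KB, genome_kb)
    p = prob_recover(N_CLONES, INSERT_KB, genome_kb)
    print(f"G = {genome_kb / 1e6:.2f} Gb: coverage {cov:.2f}x, P = {p:.4f}")
```

Even at the lower coverage estimate the probability stays well above 0.99, which is consistent with the "more than 99%" figure in the abstract.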

Profound Tissue Specificity in Proliferation Control Underlies Cancer Drivers and Aneuploidy Patterns.

PubMed

Sack, Laura Magill; Davoli, Teresa; Li, Mamie Z; Li, Yuyang; Xu, Qikai; Naxerova, Kamila; Wooten, Eric C; Bernardi, Ronald J; Martin, Timothy D; Chen, Ting; Leng, Yumei; Liang, Anthony C; Scorsone, Kathleen A; Westbrook, Thomas F; Wong, Kwok-Kin; Elledge, Stephen J

2018-04-05

Genomics has provided a detailed structural description of the cancer genome. Identifying oncogenic drivers that work primarily through dosage changes is a current challenge. Unrestrained proliferation is a critical hallmark of cancer. We constructed modular, barcoded libraries of human open reading frames (ORFs) and performed screens for proliferation regulators in multiple cell types. Approximately 10% of genes regulate proliferation, with most performing in an unexpectedly highly tissue-specific manner. Proliferation drivers in a given cell type showed specific enrichment in somatic copy number changes (SCNAs) from cognate tumors and helped predict aneuploidy patterns in those tumors, implying that tissue-type-specific genetic network architectures underlie SCNA and driver selection in different cancers. In vivo screening confirmed these results. We report a substantial contribution to the catalog of SCNA-associated cancer drivers, identifying 147 amplified and 107 deleted genes as potential drivers, and derive insights about the genetic network architecture of aneuploidy in tumors. Copyright © 2018 Elsevier Inc. All rights reserved.
Rapid and continuous magnetic separation in droplet microfluidic devices.

PubMed

Brouzes, Eric; Kruse, Travis; Kimmerling, Robert; Strey, Helmut H

2015-02-07

We present a droplet microfluidic method to extract molecules of interest from a droplet in a rapid and continuous fashion. We accomplish this by first marginalizing functionalized super-paramagnetic beads within the droplet using a magnetic field, and then splitting the droplet into one droplet containing the majority of magnetic beads and one droplet containing the minority fraction. We quantitatively analysed the factors which affect the efficiency of marginalization and droplet splitting to optimize the enrichment of magnetic beads. We first characterized the interplay between the droplet velocity and the strength of the magnetic field and its effect on marginalization. We found that marginalization is optimal at the midline of the magnet and that marginalization is a good predictor of bead enrichment through splitting at low to moderate droplet velocities. Finally, we focused our efforts on manipulating the splitting profile to improve the enrichment provided by asymmetric splitting. We designed asymmetric splitting forks that employ capillary effects to preferentially extract the bead-rich regions of the droplets. Our strategy represents a framework to optimize magnetic bead enrichment methods tailored to the requirements of specific droplet-based applications. We anticipate that our separation technology is well suited for applications in single-cell genomics and proteomics. In particular, our method could be used to separate mRNA bound to poly-dT functionalized magnetic microparticles from single cell lysates to prepare single-cell cDNA libraries.
Rapid and continuous magnetic separation in droplet microfluidic devices

PubMed Central

Brouzes, Eric; Kruse, Travis; Kimmerling, Robert; Strey, Helmut H.

2015-01-01

We present a droplet microfluidic method to extract molecules of interest from a droplet in a rapid and continuous fashion. We accomplish this by first marginalizing functionalized super-paramagnetic beads within the droplet using a magnetic field, and then splitting the droplet into one droplet containing the majority of magnetic beads and one droplet containing the minority fraction. We quantitatively analysed the factors which affect the efficiency of marginalization and droplet splitting to optimize the enrichment of magnetic beads. We first characterized the interplay between the droplet velocity and the strength of the magnetic field and its effect on marginalization. We found that marginalization is optimal at the midline of the magnet and that marginalization is a good predictor of bead enrichment through splitting at low to moderate droplet velocities. Finally, we focused our efforts on manipulating the splitting profile to improve the enrichment provided by asymmetric splitting. We designed asymmetric splitting forks that employ capillary effects to preferentially extract the bead-rich regions of the droplets. Our strategy represents a framework to optimize magnetic bead enrichment methods tailored to the requirements of specific droplet-based applications. We anticipate that our separation technology is well suited for applications in single-cell genomics and proteomics. In particular, our method could be used to separate mRNA bound to poly-dT functionalized magnetic microparticles from single cell lysates to prepare single-cell cDNA libraries. PMID:25501881
Comparative Transcriptomic Analysis Reveals Candidate Genes and Pathways Involved in Larval Settlement of the Barnacle Megabalanus volcano.

PubMed

Yan, Guoyong; Zhang, Gen; Huang, Jiaomei; Lan, Yi; Sun, Jin; Zeng, Cong; Wang, Yong; Qian, Pei-Yuan; He, Lisheng

2017-10-27

Megabalanus barnacle is one of the model organisms for marine biofouling research. However, further elucidation of molecular mechanisms underlying larval settlement has been hindered due to the lack of genomic information thus far. In the present study, cDNA libraries were constructed for cyprids, the key stage for larval settlement, and adults of Megabalanus volcano. After high-throughput sequencing and de novo assembly, 42,620 unigenes were obtained with an N50 value of 1532 bp. These unigenes were annotated by blasting against the NCBI non-redundant (nr), Swiss-Prot, Cluster of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Finally, 19,522, 15,691, 14,459, and 10,914 unigenes were identified correspondingly. There were 22,158 differentially expressed genes (DEGs) identified between two stages. Compared with the cyprid stage, 8241 unigenes were down-regulated and 13,917 unigenes were up-regulated at the adult stage. The neuroactive ligand-receptor interaction pathway (ko04080) was significantly enriched by KEGG enrichment analysis of the DEGs, suggesting that it is possibly involved in larval settlement. Potential functions of three conserved allatostatin neuropeptide-receptor pairs and two light-sensitive opsin proteins were further characterized, indicating that they might regulate attachment and metamorphosis at cyprid stage. These results provided a deeper insight into the molecular mechanisms underlying larval settlement of barnacles.
Comparative Transcriptomic Analysis Reveals Candidate Genes and Pathways Involved in Larval Settlement of the Barnacle Megabalanus volcano

PubMed Central

Yan, Guoyong; Huang, Jiaomei; Lan, Yi; Zeng, Cong; Wang, Yong; Qian, Pei-Yuan; He, Lisheng

2017-01-01

Megabalanus barnacle is one of the model organisms for marine biofouling research. However, further elucidation of molecular mechanisms underlying larval settlement has been hindered due to the lack of genomic information thus far. In the present study, cDNA libraries were constructed for cyprids, the key stage for larval settlement, and adults of Megabalanus volcano. After high-throughput sequencing and de novo assembly, 42,620 unigenes were obtained with an N50 value of 1532 bp. These unigenes were annotated by blasting against the NCBI non-redundant (nr), Swiss-Prot, Cluster of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Finally, 19,522, 15,691, 14,459, and 10,914 unigenes were identified correspondingly. There were 22,158 differentially expressed genes (DEGs) identified between two stages. Compared with the cyprid stage, 8241 unigenes were down-regulated and 13,917 unigenes were up-regulated at the adult stage. The neuroactive ligand-receptor interaction pathway (ko04080) was significantly enriched by KEGG enrichment analysis of the DEGs, suggesting that it is possibly involved in larval settlement. Potential functions of three conserved allatostatin neuropeptide-receptor pairs and two light-sensitive opsin proteins were further characterized, indicating that they might regulate attachment and metamorphosis at cyprid stage. These results provided a deeper insight into the molecular mechanisms underlying larval settlement of barnacles. PMID:29077039
Efficient engineering of chromosomal ribosome binding site libraries in mismatch repair proficient Escherichia coli.

PubMed

Oesterle, Sabine; Gerngross, Daniel; Schmitt, Steven; Roberts, Tania Michelle; Panke, Sven

2017-09-26

Multiplexed gene expression optimization via modulation of gene translation efficiency through ribosome binding site (RBS) engineering is a valuable approach for optimizing artificial properties in bacteria, ranging from genetic circuits to production pathways. Established algorithms design smart RBS-libraries based on a single partially-degenerate sequence that efficiently samples the entire space of translation initiation rates. However, the sequence space that is accessible when integrating the library by CRISPR/Cas9-based genome editing is severely restricted by DNA mismatch repair (MMR) systems. MMR efficiency depends on the type and length of the mismatch and thus effectively removes potential library members from the pool. Rather than working in MMR-deficient strains, which accumulate off-target mutations, or depending on temporary MMR inactivation, which requires additional steps, we eliminate this limitation by developing a pre-selection rule of genome-library-optimized-sequences (GLOS) that enables introducing large functional diversity into MMR-proficient strains with sequences that are no longer subject to MMR-processing. We implement several GLOS-libraries in Escherichia coli and show that GLOS-libraries indeed retain diversity during genome editing and that such libraries can be used in complex genome editing operations such as concomitant deletions. We argue that this approach allows for stable and efficient fine tuning of chromosomal functions with minimal effort.
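To make the idea of a "single partially-degenerate sequence" concrete, the sketch below expands an IUPAC-coded sequence into its library members and counts how many bases each member differs from the wild-type site. The degenerate sequence and the wild-type sequence are hypothetical, and the GLOS pre-selection rule itself is not described in the abstract, so it is not implemented here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(degenerate):
    """List every concrete sequence encoded by a degenerate sequence."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in degenerate))]


def mismatches(candidate, wild_type):
    """Number of positions at which two equal-length sequences differ."""
    if len(candidate) != len(wild_type):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(candidate, wild_type))


DEGENERATE_RBS = "AGGRSN"   # hypothetical library design
WILD_TYPE_RBS = "AGGAGG"    # hypothetical chromosomal sequence

members = expand(DEGENERATE_RBS)
print(len(members))  # 16 members: 2 * 2 * 4 choices at the last three positions
for seq in members:
    print(seq, mismatches(seq, WILD_TYPE_RBS))
```

Since the abstract notes that mismatch repair efficiency depends on mismatch type and length, a per-member mismatch profile like this is the kind of input a pre-selection rule would need.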
Retrotransposon Capture Sequencing (RC-Seq): A Targeted, High-Throughput Approach to Resolve Somatic L1 Retrotransposition in Humans.

PubMed

Sanchez-Luque, Francisco J; Richardson, Sandra R; Faulkner, Geoffrey J

2016-01-01

Mobile genetic elements (MGEs) are of critical importance in genomics and developmental biology. Polymorphic and somatic MGE insertions have the potential to impact the phenotype of an individual, depending on their genomic locations and functional consequences. However, the identification of polymorphic and somatic insertions among the plethora of copies residing in the genome presents a formidable technical challenge. Whole genome sequencing has the potential to address this problem; however, its efficacy depends on the abundance of cells carrying the new insertion. Robust detection of somatic insertions present in only a subset of cells within a given sample can also be prohibitively expensive due to a requirement for high sequencing depth. Here, we describe retrotransposon capture sequencing (RC-seq), a sequence capture approach in which Illumina libraries are enriched for fragments containing the 5' and 3' termini of specific MGEs. RC-seq allows the detection of known polymorphic insertions present in an individual, as well as the identification of rare or private germline insertions not previously described. Furthermore, RC-seq can be used to detect and characterize somatic insertions, providing a valuable tool to elucidate the extent and characteristics of MGE activity in healthy tissues and in various disease states.
Calibrating genomic and allelic coverage bias in single-cell sequencing.

PubMed

Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

2015-04-16

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing

PubMed Central

Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

2016-01-01

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
The Whole-Genome and Transcriptome of the Manila Clam (Ruditapes philippinarum).

PubMed

Mun, Seyoung; Kim, Yun-Ji; Markkandan, Kesavan; Shin, Wonseok; Oh, Sumin; Woo, Jiyoung; Yoo, Jongsu; An, Hyesuck; Han, Kyudong

2017-06-01

The Manila clam, Ruditapes philippinarum, is an important bivalve species in worldwide aquaculture, including in Korea. The aquaculture production of R. philippinarum is under threat from diverse environmental factors including viruses, microorganisms, parasites, and water conditions, with subsequently declining production. In spite of its importance as a marine resource, the reference genome of R. philippinarum for comprehensive genetic studies is largely unexplored. Here, we report the de novo whole-genome and transcriptome assembly of R. philippinarum across three different tissues (foot, gill, and adductor muscle), and provide the basic data for advanced studies in selective breeding and disease control in order to obtain successful aquaculture systems. An approximately 2.56 Gb high-quality whole genome was assembled with various library construction methods. A total of 108,034 protein-coding gene models were predicted, and repetitive elements including simple sequence repeats and noncoding RNAs were identified to further the understanding of the genetic background of R. philippinarum for genomics-assisted breeding. Comparative analysis with other bivalve marine invertebrates revealed that the gene family related to complement C1q was enriched. Furthermore, we performed transcriptome analysis with three different tissues in order to support genome annotation and then identified 41,275 transcripts, which were annotated. The R. philippinarum genome resource will markedly advance a wide range of potential genetic studies, serving as a reference genome for comparative analysis of bivalve species and for unraveling mechanisms of biological processes in molluscs. We believe that the R. philippinarum genome will serve as an initial platform for breeding better-quality clams using a genomic approach. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Construction of the BAC Library of Small Abalone (Haliotis diversicolor) for Gene Screening and Genome Characterization.

PubMed

Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng

2016-02-01

The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding of this species, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× coverage of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening using a PCR method. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with the next generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of the small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated in the 8 BACs, which proved the feasibility of the PCR screening approach with three-dimensional pools in the small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for marker development in the future. The BAC library and clone pools provide valuable resources and tools for genetic breeding and conservation of H. diversicolor.
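The logic of three-dimensional pooling can be sketched as follows. Each clone is placed in exactly one plate pool, one row pool and one column pool, so a single positive clone is located by intersecting the positive pools from each dimension. The plate layout is an assumption of this sketch: 18,432 clones happens to equal 48 plates of 384 wells (16 rows × 24 columns), but the abstract does not state the format, and the super pools it mentions are not modelled here.

```python
from itertools import product

# Assumed layout: 48 x 384-well plates, each 16 rows by 24 columns.
N_PLATES, N_ROWS, N_COLS = 48, 16, 24


def pools_for(plate, row, col):
    """The three pools (one per dimension) that contain a given clone."""
    return ("plate", plate), ("row", row), ("col", col)


def candidates(positive_plates, positive_rows, positive_cols):
    """All clone addresses consistent with a set of PCR-positive pools."""
    return sorted(product(positive_plates, positive_rows, positive_cols))


print(N_PLATES * N_ROWS * N_COLS)   # 18432 clones addressed
print(N_PLATES + N_ROWS + N_COLS)   # 88 pool PCRs instead of 18432 single-clone PCRs

# One positive clone: the intersection is unambiguous.
print(candidates({7}, {3}, {20}))   # [(7, 3, 20)]

# Two positive clones give 2 x 2 x 2 = 8 candidate addresses, so the
# true positives must be confirmed by PCR on the individual clones.
print(len(candidates({7, 31}, {3, 9}, {5, 20})))  # 8
```

This ambiguity with multiple positives is one practical reason to subdivide a library into smaller super pools before three-dimensional decoding.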
Bacterial Artificial Chromosome Libraries for Mouse Sequencing and Functional Analysis

PubMed Central

Osoegawa, Kazutoyo; Tateno, Minako; Woon, Peng Yeong; Frengen, Eirik; Mammoser, Aaron G.; Catanese, Joseph J.; Hayashizaki, Yoshihide; de Jong, Pieter J.

2000-01-01

Bacterial artificial chromosome (BAC) and P1-derived artificial chromosome (PAC) libraries providing a combined 33-fold representation of the murine genome have been constructed using two different restriction enzymes for genomic digestion. A large-insert PAC library was prepared from the 129S6/SvEvTac strain in a bacterial/mammalian shuttle vector to facilitate functional gene studies. For genome mapping and sequencing, we prepared BAC libraries from the 129S6/SvEvTac and the C57BL/6J strains. The average insert sizes for the three libraries range between 130 kb and 200 kb. Based on the numbers of clones and the observed average insert sizes, we estimate each library to have slightly in excess of 10-fold genome representation. The average number of clones found after hybridization screening with 28 probes was in the range of 9–14 clones per marker. To explore the fidelity of the genomic representation in the three libraries, we analyzed three contigs, each established after screening with a single unique marker. New markers were established from the end sequences and screened against all the contig members to determine if any of the BACs and PACs are chimeric or rearranged. Only one chimeric clone and six potential deletions have been observed after extensive analysis of 113 PAC and BAC clones. Seventy-one of the 113 clones were conclusively nonchimeric because both end markers or sequences were mapped to the other confirmed contig members. We could not exclude chimerism for the remaining 41 clones because one or both of the insert termini did not contain unique sequence to design markers. The low rate of chimerism, ∼1%, and the low level of detected rearrangements support the anticipated usefulness of the BAC libraries for genome research. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AQ797173–AQ797398.] PMID:10645956
Comprehensive viral enrichment enables sensitive respiratory virus genomic identification and analysis by next generation sequencing.

PubMed

O'Flaherty, Brigid M; Li, Yan; Tao, Ying; Paden, Clinton R; Queen, Krista; Zhang, Jing; Dinwiddie, Darrell L; Gross, Stephen M; Schroth, Gary P; Tong, Suxiang

2018-06-01

Next generation sequencing (NGS) technologies have revolutionized the genomics field and are becoming more commonplace for identification of human infectious diseases. However, due to the low abundance of viral nucleic acids (NAs) in relation to host, viral identification using direct NGS technologies often lacks sufficient sensitivity. Here, we describe an approach based on two complementary enrichment strategies that significantly improves the sensitivity of NGS-based virus identification. To start, we developed two sets of DNA probes to enrich virus NAs associated with respiratory diseases. The first set of probes spans the genomes, allowing for identification of known viruses and full genome sequencing, while the second set targets regions conserved among viral families or genera, providing the ability to detect both known and potentially novel members of those virus groups. Efficiency of enrichment was assessed by NGS testing of reference viruses and clinical samples with known infection. We show significant improvement in viral identification using enriched NGS compared to unenriched NGS. Without enrichment, we observed an average of 0.3% targeted viral reads per sample. However, after enrichment, 50%-99% of the reads per sample were the targeted viral reads for both the reference isolates and clinical specimens using both probe sets. Importantly, dramatic improvements in genome coverage were also observed following virus-specific probe enrichment. The methods described here provide improved sensitivity for virus identification by NGS, allowing for a more comprehensive analysis of disease etiology. © 2018 O'Flaherty et al.; Published by Cold Spring Harbor Laboratory Press.
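The read fractions in this abstract translate into a fold enrichment if enrichment is defined as the ratio of on-target read fractions after and before capture. That definition is an assumption of this sketch; the abstract reports only the percentages, and the authors may have computed enrichment per sample rather than from the averages used here.

```python
def fold_enrichment(pre_percent, post_percent):
    """Ratio of on-target read percentage after vs. before enrichment."""
    if pre_percent <= 0:
        raise ValueError("pre-enrichment percentage must be positive")
    return post_percent / pre_percent


PRE = 0.3  # average % targeted viral reads without enrichment (from the abstract)

for post in (50.0, 99.0):  # reported post-enrichment range, in %
    print(f"{post:.0f}% after capture: about {fold_enrichment(PRE, post):.0f}-fold")
```

On these figures the enrichment is roughly 170- to 330-fold. Because the on-target fraction cannot exceed 100%, the achievable fold change is capped by the starting fraction, so samples with very little viral material can show the largest apparent enrichment.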
DNA capture and next-generation sequencing can recover whole mitochondrial genomes from highly degraded samples for human identification

PubMed Central

2013-01-01

Background: Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods: We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results: We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions: This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular, the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
Analysis of expressed sequence tags from the four main developmental stages of Trypanosoma congolense

PubMed Central

Helm, Jared R.; Hertz-Fowler, Christiane; Aslett, Martin; Berriman, Matthew; Sanders, Mandy; Quail, Michael A.; Soares, Marcelo B.; Bonaldo, Maria F.; Sakurai, Tatsuya; Inoue, Noboru; Donelson, John E.

2009-01-01

Trypanosoma congolense is one of the most economically important pathogens of livestock in Africa. Culture-derived parasites of each of the three main insect stages of the T. congolense life cycle, i.e., the procyclic, epimastigote and metacyclic stages, and bloodstream stage parasites isolated from infected mice, were used to construct stage-specific cDNA libraries and expressed sequence tags (ESTs or cDNA clones) in each library were sequenced. Thirteen EST clusters encoding different variant surface glycoproteins (VSGs) were detected in the metacyclic library and twenty-six VSG EST clusters were found in the bloodstream library, six of which are shared by the metacyclic library. Rare VSG ESTs are present in the epimastigote library, and none were detected in the procyclic library. ESTs encoding enzymes that catalyze oxidative phosphorylation and amino acid metabolism are about twice as abundant in the procyclic and epimastigote stages as in the metacyclic and bloodstream stages. In contrast, ESTs encoding enzymes involved in glycolysis, the citric acid cycle and nucleotide metabolism are about the same in all four developmental stages. Cysteine proteases, kinases and phosphatases are the most abundant enzyme groups represented by the ESTs. All four libraries contain T. congolense-specific expressed sequences not present in the T. brucei and T. cruzi genomes. Normalized cDNA libraries were constructed from the metacyclic and bloodstream stages, and found to be further enriched for T. congolense-specific ESTs. Given that cultured T. congolense offers an experimental advantage over other African trypanosome species, these ESTs provide a basis for further investigation of the molecular properties of these four developmental stages, especially the epimastigote and metacyclic stages for which it is difficult to obtain large quantities of organisms. The T. congolense EST databases are available at: http://www.sanger.ac.uk/Projects/T_congolense/EST_index.shtml. 
PMID:19559733
Development and characterization of microsatellite markers for Hibiscus glaber Matsum. ex Nakai, an endemic tree species of the oceanic Bonin Islands, Japan.

PubMed

Ohtani, Masato; Tani, Naoki; Yoshimaru, Hiroshi

2008-11-01

Polymorphic microsatellite markers were developed for Hibiscus glaber, an endemic tree of the Bonin Islands. Eighty-seven of the 208 sequences from an enriched library were unique and contained microsatellites. Ten loci proved to be highly polymorphic among 78 individuals from Nishi-jima Island. Total exclusionary powers for the first and the second parents were 99.989% and 99.999%, respectively. Nine loci also amplified a single fragment from genomic DNA of H. tiliaceus, a related and widespread congener. Our markers can be reliably used for the estimation of current gene flow within/among populations of the two woody Hibiscus species. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.
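A total exclusionary power such as the one reported in this record is commonly obtained by combining per-locus exclusion probabilities under the assumption that loci are independent. The record does not give per-locus values or name the software used, so the function name and the numbers below are hypothetical; this is a minimal sketch of the combination step only.

```python
from math import prod


def combined_exclusion(per_locus):
    """Combine per-locus exclusion probabilities into a total exclusion power.

    Assumes loci are independent: a false parent escapes exclusion only if
    it escapes at every locus, so total = 1 - product(1 - p_i).
    """
    return 1.0 - prod(1.0 - p for p in per_locus)


# Hypothetical per-locus exclusion probabilities (not taken from the study).
example_loci = [0.60, 0.55, 0.70, 0.65]
print(f"Total exclusion power: {combined_exclusion(example_loci):.4f}")
```

Adding a locus can only raise the combined value, which is why a panel of ten highly polymorphic loci can approach 100%.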
Development of microsatellite markers from loquat, Eriobotrya japonica (Thunb.) Lindl.

PubMed

Gisbert, A D; Lopez-Capuz, I; Soriano, J M; Llacer, G; Romero, C; Badenes, M L

2009-05-01

Loquat (Eriobotrya japonica) is a minor fruit crop that has become an interesting alternative for the European fruit industry. This interest resulted in a loquat germplasm collection established at the Instituto Valenciano de Investigaciones Agrarias, Valencia, Spain. Currently, it is the main reservoir of this species outside Asia. We developed and characterized the first 21 polymorphic microsatellite loci from a CT/AG-enriched loquat genomic library. The observed heterozygosity ranged between 0.20 and 1.00 and the expected heterozygosity between 0.17 and 0.81; three markers were multilocus, and eight loci departed significantly from Hardy-Weinberg equilibrium. These markers will facilitate diversity and genetic studies of the species. © 2009 The Authors. Journal compilation © 2009 Blackwell Publishing Ltd.
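Observed and expected heterozygosity, reported in this and several neighbouring records, can be computed per locus from diploid genotypes. The sketch below assumes genotypes are given as allele pairs and uses the simple estimator 1 − Σp², without a small-sample correction; the function names and the genotype data are hypothetical and are not taken from any of the studies listed.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


def expected_heterozygosity(genotypes):
    """Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared allele frequencies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical genotypes at one microsatellite locus (allele sizes in bp).
locus = [(152, 152), (152, 156), (156, 156), (152, 156)]
print(observed_heterozygosity(locus), expected_heterozygosity(locus))
```

A large gap between the two values at a locus is one of the signals that prompts a formal test for departure from Hardy-Weinberg equilibrium.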
Development of microsatellites from Fothergilla ×intermedia (Hamamelidaceae) and cross transfer to four other genera within Hamamelidaceae

PubMed Central

Hatmaker, E. Anne; Wadl, Phillip A.; Mantooth, Kristie; Scheffler, Brian E.; Ownley, Bonnie H.; Trigiano, Robert N.

2015-01-01

Premise of the study: We developed microsatellites from Fothergilla ×intermedia to establish loci capable of distinguishing species and cultivars, and to assess genetic diversity for use by ornamental breeders and to transfer within Hamamelidaceae. Methods and Results: We sequenced a small insert genomic library enriched for microsatellites to develop 12 polymorphic microsatellite loci. The number of alleles detected ranged from four to 15 across five genera within Hamamelidaceae. Shannon’s information index ranged from 0.07 to 0.14. Conclusions: These microsatellite loci provide a set of markers to evaluate genetic diversity of natural and cultivated collections and assist ornamental plant breeders for genetic studies of five popular genera of woody ornamental plants. PMID:25909044
Isolation and characterization of microsatellite markers in Acca sellowiana (Berg) Burret.

PubMed

Santos, K L; Santos, M O; Laborda, P R; Souza, A P; Peroni, N; Nodari, R O

2008-09-01

Acca sellowiana has commercial potential due to the quality and the unique flavor of its fruit. Conservation of natural populations and management of breeding programmes would benefit from the availability of molecular markers that could be used to characterize levels and distribution of genetic variability. Thus, 13 microsatellite markers were developed from an enriched genomic library of A. sellowiana. They were characterized using 40 samples. The expected and observed heterozygosities ranged from 0.513 to 0.913 and from 0.200 to 0.889, respectively. These are the first microsatellite loci characterized from A. sellowiana and will help improve research on its genetic conservation, characterization and breeding. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.
Isolation and characterization of microsatellite markers in Acca sellowiana (Berg) Burret.

PubMed

Santos, K L; Santos, M O; Laborda, P R; Souza, A P; Peroni, N; Nodari, R O

2008-11-01

Acca sellowiana has commercial potential because of the quality and the unique flavor of its fruit. Conservation of natural populations and management of breeding programmes would benefit from the availability of molecular markers that could be used to characterize levels and distribution of genetic variability. Thus, 13 microsatellite markers were developed from an enriched genomic library of A. sellowiana. They were characterized using 40 samples. The expected and observed heterozygosities ranged from 0.513 to 0.913 and from 0.200 to 0.889, respectively. These are the first microsatellite loci characterized from A. sellowiana and will help improve research on its genetic conservation, characterization and breeding. Journal compilation © 2008 Blackwell Publishing Ltd. No claim to original US government works.

Characterization of 10 new nuclear microsatellite markers in Acca sellowiana (Myrtaceae).

PubMed

Klabunde, Gustavo H F; Olkoski, Denise; Vilperte, Vinicius; Zucchi, Maria I; Nodari, Rubens O

2014-06-01

Microsatellite primers were identified and characterized in Acca sellowiana in order to expand the limited number of pre-existing polymorphic markers for use in population genetic studies for conservation, phylogeography, breeding, and domestication. • A total of 10 polymorphic microsatellite primers were designed from clones obtained from a simple sequence repeat (SSR)-enriched genomic library. The primers amplified di- and trinucleotide repeats with four to 27 alleles per locus. In all tested populations, the observed heterozygosity ranged from 0.269 to 1.0. • These new polymorphic SSR markers will allow future genetic studies to be denser, either for genetic structure characterization of natural populations or for studies involving genetic breeding and domestication process in A. sellowiana.
Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

PubMed Central

Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

2016-01-01

ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. 
The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including streptothricins, borrelidin, two novel lipopeptides, and one unknown antibiotic from Streptomyces rochei Sal35. The transfer, expression, and screening of the library were all performed in a high-throughput way, so that this approach is scalable and adaptable to industrial automation for next-generation antibiotic discovery. PMID:27451447
Characterization and comparative profiling of the small RNA transcriptomes in the Hemipteran insect Nilaparvata lugens.

PubMed

Zha, Wenjun; Zhou, Lei; Li, Sanhe; Liu, Kai; Yang, Guocai; Chen, Zhijun; Liu, Kai; Xu, Huashan; Li, Peide; Hussain, Saddam; You, Aiqing

2016-12-20

MicroRNAs (miRNAs) are a group of small RNAs involved in various biological processes through negative regulation of mRNAs at the post-transcriptional level. The brown planthopper (BPH), Nilaparvata lugens (Stål), is one of the most serious and destructive insect pests of rice. In the present study, two small RNA libraries of virulent N. lugens populations (Biotype I survives on susceptible rice variety TN1 and Biotype Y survives on moderately resistant rice variety YHY15) were constructed and sequenced using high-throughput sequencing technology in order to identify the relationship between miRNAs of N. lugens and adaptation of BPH pests to rice resistance. In total 15,758,632 and 11,442,592 reads, corresponding to 3,144,026 and 2,550,049 unique sequences, were obtained in the two libraries (BPH-TN1 and BPH-YHY15 libraries), respectively. A total of 41 potential novel miRNAs were predicted in the two libraries, and 26 miRNAs showed significantly differential expression between the two libraries. All miRNAs were significantly up-regulated in the BPH-TN1 library. Target genes likely regulated by these differentially expressed miRNAs were predicted computationally. The functional annotation of target genes performed by Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that a majority of differential miRNAs were involved in the "Metabolism" pathway. These results provide an understanding of the role of miRNAs in the adaptation of BPH to rice resistance, and will be useful in developing new control strategies for host defense against BPH. Copyright © 2016 Elsevier B.V. All rights reserved.
Single haplotype assembly of the human genome from a hydatidiform mole.

PubMed

Steinberg, Karyn Meltz; Schneider, Valerie A; Graves-Lindsay, Tina A; Fulton, Robert S; Agarwala, Richa; Huddleston, John; Shiryev, Sergey A; Morgulis, Aleksandr; Surti, Urvashi; Warren, Wesley C; Church, Deanna M; Eichler, Evan E; Wilson, Richard K

2014-12-01

A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content show this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicate misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly. © 2014 Steinberg et al.; Published by Cold Spring Harbor Laboratory Press.
Single haplotype assembly of the human genome from a hydatidiform mole

PubMed Central

Steinberg, Karyn Meltz; Schneider, Valerie A.; Graves-Lindsay, Tina A.; Fulton, Robert S.; Agarwala, Richa; Huddleston, John; Shiryev, Sergey A.; Morgulis, Aleksandr; Surti, Urvashi; Warren, Wesley C.; Church, Deanna M.; Eichler, Evan E.; Wilson, Richard K.

2014-01-01

A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content show this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicate misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly. PMID:25373144
36 CFR 701.2 - Acquisition of Library material by non-purchase means.

Code of Federal Regulations, 2010 CFR

2010-07-01

... policy of the Library of Congress to foster the enrichment of its collections through gifts of materials within the terms of the Library's acquisitions policies. In implementing this policy, division chiefs and... accompanied by a signed Agreement of Deposit. (2) It is the policy of the Library of Congress to accept...
3G vector-primer plasmid for constructing full-length-enriched cDNA libraries.

PubMed

Zheng, Dong; Zhou, Yanna; Zhang, Zidong; Li, Zaiyu; Liu, Xuedong

2008-09-01

We designed a 3G vector-primer plasmid for the generation of full-length-enriched complementary DNA (cDNA) libraries. By employing the terminal transferase activity of reverse transcriptase and the modified strand replacement method, this plasmid (assembled with a polydT end and a deoxyguanosine [dG] end) combines priming full-length cDNA strand synthesis and directional cDNA cloning. As a result, the number of steps involved in cDNA library preparation is decreased while simplifying downstream gene manipulation, sequencing, and subcloning. The 3G vector-primer plasmid method yields fully represented plasmid primed libraries that are equivalent to those made by the SMART (switching mechanism at 5' end of RNA transcript) approach.
SMRT sequencing of the Vitis vinifera cv. ‘Flame seedless’ genome using a SMRTbell-free library preparation from Swift Biosciences

USDA-ARS's Scientific Manuscript database

Single Molecule Real-Time (SMRT) sequencing provides advantages to the sequencing of complex genomes. The long reads generated are superior for resolving complex genomic regions and provide highly contiguous de novo assemblies. Current SMRTbell libraries generate average read lengths of 10-15kb. How...
Epigenetic Segregation of Microbial Genomes from Complex Samples Using Restriction Endonucleases HpaII and McrB.

PubMed

Liu, Guohong; Weston, Christopher Q; Pham, Long K; Waltz, Shannon; Barnes, Helen; King, Paula; Sphar, Dan; Yamamoto, Robert T; Forsyth, R Allyn

2016-01-01

We describe continuing work to develop restriction endonucleases as tools to enrich targeted genomes of interest from diverse populations. Two approaches were developed in parallel to segregate genomic DNA based on cytosine methylation. First, the methyl-sensitive endonuclease HpaII was used to bind non-CG methylated DNA. Second, a truncated fragment of McrB was used to bind CpG methylated DNA. Enrichment levels of microbial genomes can exceed 100-fold with HpaII allowing improved genomic detection and coverage of otherwise trace microbial genomes from sputum. Additionally, we observe interesting enrichment results that correlate with the methylation states not only of bacteria, but of fungi, viruses, a protist and plants. The methods presented here offer promise for testing biological samples for pathogens and global analysis of population methylomes.
Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

PubMed

Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

2013-11-01

Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.
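The genome-equivalent coverage quoted for BAC/BIBAC libraries follows from simple arithmetic: the number of usable clones times the mean insert size, divided by the genome size. The sketch below assumes that empty-vector and organellar clones are discounted before computing coverage; the function name, the clone count and the genome size are hypothetical, since the record does not report them.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_mb,
                       empty_fraction=0.0, organellar_fraction=0.0):
    """Estimate how many genome equivalents a large-insert library covers.

    Clones that carry no insert or carry organellar DNA are discounted
    (an assumption of this sketch) before dividing total cloned nuclear
    DNA by the genome size.
    """
    usable_clones = n_clones * (1.0 - empty_fraction) * (1.0 - organellar_fraction)
    total_insert_kb = usable_clones * mean_insert_kb
    return total_insert_kb / (genome_size_mb * 1000.0)


# Hypothetical library: 110,000 clones, 120 kb mean insert, 2,300 Mb genome.
coverage = genome_equivalents(110_000, 120, 2_300,
                              empty_fraction=0.02, organellar_fraction=0.01)
print(f"{coverage:.2f} genome equivalents")
```

Under this model, a higher empty-vector or organellar rate lowers the effective coverage of the nuclear genome for the same number of picked clones.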
Toward Understanding the Genetic Basis of Yak Ovary Reproduction: A Characterization and Comparative Analyses of Estrus Ovary Transcriptiome in Yak and Cattle

PubMed Central

Huang, Cai; Mipam, Tserang Donko; Li, Jian

2016-01-01

Background Yaks (Bos grunniens) are endemic species that can adapt well to thin air, cold temperatures, and high altitude. These species can survive in harsh plateau environments and are a major source of animal production for local residents, being an important breed in the Qinghai–Tibet Plateau. However, compared with ordinary cattle that live in the plains, yaks generally have lower fertility. Investigating the basic physiological molecular features of yak ovary and identifying the biological events underlying the differences between the ovaries of yak and plain cattle is necessary to understand the specificity of yak reproduction. Therefore, RNA-seq technology was applied to analyze transcriptome data comparatively between the yak and plain cattle estrous ovaries. Results After deep sequencing, 3,653,032 clean reads with a total of 4,828,772,880 base pairs were obtained from the yak ovary library. Alignment analysis showed that 16,992 yak genes mapped to the yak genome, among which, 12,731 and 14,631 genes were assigned to Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Furthermore, comparison of yak and cattle ovary transcriptome data revealed that 1,307 genes were significantly and differentially expressed between the two libraries, wherein 661 genes were upregulated and 646 genes were downregulated in yak ovary. Functional analysis showed that the differentially expressed genes were involved in various Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. GO annotations indicated that the genes related to “cell adhesion,” “hormonal” biological processes, and “calcium ion binding,” “cation transmembrane transport” molecular events were significantly active. KEGG pathway analysis showed that the “complement and coagulation cascade” pathway was the most enriched in yak ovary transcriptome data, followed by the “cytochrome P450” related and “ECM–receptor interaction” pathways. 
Moreover, several novel pathways, such as “circadian rhythm,” were significantly enriched despite having no evident associations with the reproductive function. Conclusion Our findings provide a molecular resource for further investigation of the general molecular mechanism of yak ovary and offer new insights to understand comprehensively the specificity of yak reproduction. PMID:27044040
Comparative metagenomics of microbial communities inhabiting deep-sea hydrothermal vent chimneys with contrasting chemistries

PubMed Central

Xie, Wei; Wang, Fengping; Guo, Lei; Chen, Zeling; Sievert, Stefan M; Meng, Jun; Huang, Guangrui; Li, Yuxin; Yan, Qingyu; Wu, Shan; Wang, Xin; Chen, Shangwu; He, Guangyuan; Xiao, Xiang; Xu, Anlong

2011-01-01

Deep-sea hydrothermal vent chimneys harbor a high diversity of largely unknown microorganisms. Although the phylogenetic diversity of these microorganisms has been described previously, the adaptation and metabolic potential of the microbial communities is only beginning to be revealed. A pyrosequencing approach was used to directly obtain sequences from a fosmid library constructed from a black smoker chimney 4143-1 in the Mothra hydrothermal vent field at the Juan de Fuca Ridge. A total of 308 034 reads with an average sequence length of 227 bp were generated. Comparative genomic analyses of metagenomes from a variety of environments by two-way clustering of samples and functional gene categories demonstrated that the 4143-1 metagenome clustered most closely with that from a carbonate chimney from Lost City. Both are highly enriched in genes for mismatch repair and homologous recombination, suggesting that the microbial communities have evolved extensive DNA repair systems to cope with the extreme conditions that have potential deleterious effects on the genomes. As previously reported for the Lost City microbiome, the metagenome of chimney 4143-1 exhibited a high proportion of transposases, implying that horizontal gene transfer may be a common occurrence in the deep-sea vent chimney biosphere. In addition, genes for chemotaxis and flagellar assembly were highly enriched in the chimney metagenomes, reflecting the adaptation of the organisms to the highly dynamic conditions present within the chimney walls. Reconstruction of the metabolic pathways revealed that the microbial community in the wall of chimney 4143-1 was mainly fueled by sulfur oxidation, putatively coupled to nitrate reduction to perform inorganic carbon fixation through the Calvin–Benson–Bassham cycle. 
On the basis of the genomic organization of the key genes of the carbon fixation and sulfur oxidation pathways contained in the large genomic fragments, both obligate and facultative autotrophs appear to be present and contribute to biomass production. PMID:20927138
RNA-seq reveals distinctive RNA profiles of small extracellular vesicles from different human liver cancer cell lines.

PubMed

Berardocco, Martina; Radeghieri, Annalisa; Busatto, Sara; Gallorini, Marialucia; Raggi, Chiara; Gissi, Clarissa; D'Agnano, Igea; Bergese, Paolo; Felsani, Armando; Berardi, Anna C

2017-10-10

Liver cancer (LC) is one of the most common cancers and represents the third highest cause of cancer-related deaths worldwide. Extracellular vesicle (EVs) cargoes, which are selectively enriched in RNA, offer great promise for the diagnosis, prognosis and treatment of LC. Our study analyzed the RNA cargoes of EVs derived from 4 liver-cancer cell lines: HuH7, Hep3B, HepG2 (hepato-cellular carcinoma) and HuH6 (hepatoblastoma), generating two different sets of sequencing libraries for each. One library was size-selected for small RNAs and the other targeted the whole transcriptome. Here are reported genome wide data of the expression level of coding and non-coding transcripts, microRNAs, isomiRs and snoRNAs providing the first comprehensive overview of the extracellular-vesicle RNA cargo released from LC cell lines. The EV-RNA expression profiles of the four liver cancer cell lines share a similar background, but cell-specific features clearly emerge showing the marked heterogeneity of the EV-cargo among the individual cell lines, evident both for the coding and non-coding RNA species.
An in vivo library-versus-library selection of optimized protein-protein interactions.

PubMed

Pelletier, J N; Arndt, K M; Plückthun, A; Michnick, S W

1999-07-01

We describe a rapid and efficient in vivo library-versus-library screening strategy for identifying optimally interacting pairs of heterodimerizing polypeptides. Two leucine zipper libraries, semi-randomized at the positions adjacent to the hydrophobic core, were genetically fused to either one of two designed fragments of the enzyme murine dihydrofolate reductase (mDHFR), and cotransformed into Escherichia coli. Interaction between the library polypeptides reconstituted enzymatic activity of mDHFR, allowing bacterial growth. Analysis of the resulting colonies revealed important biases in the zipper sequences relative to the original libraries, which are consistent with selection for stable, heterodimerizing pairs. Using more weakly associating mDHFR fragments, we increased the stringency of selection. We enriched the best-performing leucine zipper pairs by multiple passaging of the pooled, selected colonies in liquid culture, as the best pairs allowed for better bacterial propagation. This competitive growth allowed small differences among the pairs to be amplified, and different sequence positions were enriched at different rates. We applied these selection processes to a library-versus-library sample of 2.0 × 10^6 combinations and selected a novel leucine zipper pair that may be appropriate for use in further in vivo heterodimerization strategies.
Improving draft genome contiguity with reference-derived in silico mate-pair libraries.

PubMed

Grau, José Horacio; Hackl, Thomas; Koepfli, Klaus-Peter; Hofreiter, Michael

2018-05-01

Contiguous genome assemblies are a highly valued biological resource because of the higher number of completely annotated genes and genomic elements that are usable compared to fragmented draft genomes. Nonetheless, contiguity is difficult to obtain if only low coverage data and/or only distantly related reference genome assemblies are available. In order to improve genome contiguity, we have developed Cross-Species Scaffolding, a new pipeline that imports long-range distance information directly into the de novo assembly process by constructing mate-pair libraries in silico. We show how genome assembly metrics and gene prediction dramatically improve with our pipeline by assembling two primate genomes solely based on ∼30x coverage of shotgun sequencing data.
Analysis of an RNA-seq Strand-Specific Library from an East Timorese Cucumber Sample Reveals a Complete Cucurbit aphid-borne yellows virus Genome

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2017-01-01

ABSTRACT Analysis of an RNA-seq library from cucumber leaf RNA extracted from a fast technology for analysis of nucleic acids (FTA) card revealed the first complete genome of Cucurbit aphid-borne yellows virus (CABYV) from East Timor. We compare it with 35 complete CABYV genomes from other world regions. It most resembled the genome of the South Korean isolate HD118. PMID:28495776
Methods for Selecting Phage Display Antibody Libraries.

PubMed

Jara-Acevedo, Ricardo; Diez, Paula; Gonzalez-Gonzalez, Maria; Degano, Rosa Maria; Ibarrola, Nieves; Gongora, Rafael; Orfao, Alberto; Fuentes, Manuel

2016-01-01

The selection process aims at the sequential enrichment of a phage antibody display library in clones that recognize the target of interest, or antigen, as the library undergoes successive rounds of selection. In this review, the selection methods most commonly used for phage display antibody libraries are comprehensively described. Copyright © Bentham Science Publishers.
CRISPR library designer (CLD): software for multispecies design of single guide RNA libraries.

PubMed

Heigwer, Florian; Zhan, Tianzuo; Breinig, Marco; Winter, Jan; Brügemann, Dirk; Leible, Svenja; Boutros, Michael

2016-03-24

Genetic screens using CRISPR/Cas9 are a powerful method for the functional analysis of genomes. Here we describe CRISPR library designer (CLD), an integrated bioinformatics application for the design of custom single guide RNA (sgRNA) libraries for all organisms with annotated genomes. CLD is suitable for the design of libraries using modified CRISPR enzymes and targeting non-coding regions. To demonstrate its utility, we perform a pooled screen for modulators of the TNF-related apoptosis inducing ligand (TRAIL) pathway using a custom library of 12,471 sgRNAs. CLD predicts a high fraction of functional sgRNAs and is publicly available at https://github.com/boutroslab/cld.
Rapid and continuous magnetic separation in droplet microfluidic devices

DOE PAGES

Brouzes, Eric; Kruse, Travis; Kimmerling, Robert; ...

2014-12-03

Here, we present a droplet microfluidic method to extract molecules of interest from a droplet in a rapid and continuous fashion. We accomplish this by first marginalizing functionalized super-paramagnetic beads within the droplet using a magnetic field, and then splitting the droplet into one droplet containing the majority of magnetic beads and one droplet containing the minority fraction. We quantitatively analysed the factors which affect the efficiency of marginalization and droplet splitting to optimize the enrichment of magnetic beads. We first characterized the interplay between the droplet velocity and the strength of the magnetic field and its effect on marginalization. We found that marginalization is optimal at the midline of the magnet and that marginalization is a good predictor of bead enrichment through splitting at low to moderate droplet velocities. Finally, we focused our efforts on manipulating the splitting profile to improve the enrichment provided by asymmetric splitting. We designed asymmetric splitting forks that employ capillary effects to preferentially extract the bead-rich regions of the droplets. Our strategy represents a framework to optimize magnetic bead enrichment methods tailored to the requirements of specific droplet-based applications. We anticipate that our separation technology is well suited for applications in single-cell genomics and proteomics. In particular, our method could be used to separate mRNA bound to poly-dT functionalized magnetic microparticles from single cell lysates to prepare single-cell cDNA libraries.
Phylogenetic and physiological diversity of microorganisms isolated from a deep Greenland glacier ice core

NASA Technical Reports Server (NTRS)

Miteva, V. I.; Sheridan, P. P.; Brenchley, J. E.

2004-01-01

We studied a sample from the GISP 2 (Greenland Ice Sheet Project) ice core to determine the diversity and survival of microorganisms trapped in the ice at least 120,000 years ago. Previously, we examined the phylogenetic relationships among 16S ribosomal DNA (rDNA) sequences in a clone library obtained by PCR amplification from genomic DNA extracted from anaerobic enrichments. Here we report the isolation of nearly 800 aerobic organisms that were grouped by morphology and amplified rDNA restriction analysis patterns to select isolates for further study. The phylogenetic analyses of 56 representative rDNA sequences showed that the isolates belonged to four major phylogenetic groups: the high-G+C gram-positives, low-G+C gram-positives, Proteobacteria, and the Cytophaga-Flavobacterium-Bacteroides group. The most abundant and diverse isolates were within the high-G+C gram-positive cluster that had not been represented in the clone library. The Jukes-Cantor evolutionary distance matrix results suggested that at least 7 isolates represent new species within characterized genera and that 49 are different strains of known species. The isolates were further categorized based on the isolation conditions, temperature range for growth, enzyme activity, antibiotic resistance, presence of plasmids, and strain-specific genomic variations. A significant observation with implications for the development of novel and more effective cultivation methods was that preliminary incubation in anaerobic and aerobic liquid prior to plating on agar media greatly increased the recovery of CFU from the ice core sample.

ChIPnorm: a statistical method for normalizing and identifying differential regions in histone modification ChIP-seq libraries.

PubMed

Nair, Nishanth Ulhas; Sahu, Avinash Das; Bucher, Philipp; Moret, Bernard M E

2012-01-01

The advent of high-throughput technologies such as ChIP-seq has made possible the study of histone modifications. A problem of particular interest is the identification of regions of the genome where different cell types from the same organism exhibit different patterns of histone enrichment. This problem turns out to be surprisingly difficult, even in simple pairwise comparisons, because of the significant level of noise in ChIP-seq data. In this paper we propose a two-stage statistical method, called ChIPnorm, to normalize ChIP-seq data, and to find differential regions in the genome, given two libraries of histone modifications of different cell types. We show that the ChIPnorm method removes most of the noise and bias in the data and outperforms other normalization methods. We correlate the histone marks with gene expression data and confirm that histone modifications H3K27me3 and H3K4me3 act as respectively a repressor and an activator of genes. Compared to what was previously reported in the literature, we find that a substantially higher fraction of bivalent marks in ES cells for H3K27me3 and H3K4me3 move into a K27-only state. We find that most of the promoter regions in protein-coding genes have differential histone-modification sites. The software for this work can be downloaded from http://lcbb.epfl.ch/software.html.
Combining Phage and Yeast Cell Surface Antibody Display to Identify Novel Cell Type-Selective Internalizing Human Monoclonal Antibodies.

PubMed

Bidlingmaier, Scott; Su, Yang; Liu, Bin

2015-01-01

Using phage antibody display, large libraries can be generated and screened to identify monoclonal antibodies with affinity for target antigens. However, while library size and diversity is an advantage of the phage display method, there is limited ability to quantitatively enrich for specific binding properties such as affinity. One way of overcoming this limitation is to combine the scale of phage display selections with the flexibility and quantitativeness of FACS-based yeast surface display selections. In this chapter we describe protocols for generating yeast surface antibody display libraries using phage antibody display selection outputs as starting material and FACS-based enrichment of target antigen-binding clones from these libraries. These methods should be widely applicable for the identification of monoclonal antibodies with specific binding properties.
DNAism: exploring genomic datasets on the web with Horizon Charts.

PubMed

Rio Deiros, David; Gibbs, Richard A; Rogers, Jeffrey

2016-01-27

Computational biologists daily face the need to explore massive amounts of genomic data. New visualization techniques can help researchers navigate and understand these big data. Horizon Charts are a relatively new visualization method that, under the right circumstances, maximizes data density without losing graphical perception. Horizon Charts have been successfully applied to understand multi-metric time series data. We have adapted an existing JavaScript library (Cubism) that implements Horizon Charts for the time series domain so that it works effectively with genomic datasets. We call this new library DNAism. Horizon Charts can be an effective visual tool to explore complex and large genomic datasets. Researchers can use our library to leverage these techniques to extract additional insights from their own datasets.
Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

PubMed

Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

2016-08-01

Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.
Transcriptome analysis reveals the complexity of alternative splicing regulation in the fungus Verticillium dahliae.

PubMed

Jin, Lirong; Li, Guanglin; Yu, Dazhao; Huang, Wei; Cheng, Chao; Liao, Shengjie; Wu, Qijia; Zhang, Yi

2017-02-06

Alternative splicing (AS) regulation is extensive and shapes the functional complexity of higher organisms. However, the contribution of alternative splicing to fungal biology is not well studied. This study provides sequences of the transcriptomes of the plant wilt pathogen Verticillium dahliae, using two different strains and multiple methods for cDNA library preparations. We identified alternatively spliced mRNA isoforms in over half of the multi-exonic fungal genes. Over one thousand isoforms involve TopHat novel splice junctions; multiple types of combinatory alternative splicing patterns were identified. We showed that one Verticillium gene could use four different 5' splice sites and two different 3' donor sites to produce up to five mature mRNAs, representing one of the most sophisticated alternative splicing models in eukaryotes other than animals. Hundreds of novel intron types involving a pair of new splice sites were identified in the V. dahliae genome. All the types of AS events were validated by using RT-PCR. Functional enrichment analysis showed that AS genes are involved in most known biological functions and enriched in ATP biosynthesis, sexual/asexual reproduction, morphogenesis, signal transduction etc., predicting that AS regulation modulates mRNA isoform output and shapes the proteome plasticity of V. dahliae in response to environmental and developmental changes. These findings demonstrate the comprehensive alternative splicing mechanisms in a fungal plant pathogen, which argues for the importance of this fungus in developing complicated genome regulation strategies in eukaryotes.
Enriching User-Oriented Class Associations for Library Classification Schemes.

ERIC Educational Resources Information Center

Pu, Hsiao-Tieh; Yang, Chyan

2003-01-01

Explores the possibility of adding user-oriented class associations to hierarchical library classification schemes. Analyses a log of book circulation records from a university library in Taiwan and shows that classification schemes can be made more adaptable by analyzing circulation patterns of similar users. (Author/LRW)
Corruption of phage display libraries by target-unrelated clones: diagnosis and countermeasures.

PubMed

Thomas, William D; Golomb, Miriam; Smith, George P

2010-12-15

Phage display is used to discover peptides or proteins with a desired target property, most often affinity for a target selector molecule. Libraries of phage clones displaying diverse surface peptides are subject to a selection process designed to enrich for the target behavior and subsequently propagated to restore phage numbers. A recurrent problem is enrichment of clones, called target-unrelated phages or peptides (TUPs), that lack the target behavior. Many TUPs are propagation related; they have mutations conferring a growth advantage and are enriched during the propagations accompanying selection. Unlike other filamentous phage libraries, fd-tet-based libraries are relatively resistant to propagation-related TUP corruption. Their minus-strand origin is disrupted by a large cassette that simultaneously confers resistance to tetracycline and imposes a rate-limiting growth defect that cannot be bypassed with simple mutations. Nonetheless, a new type of propagation-related TUP emerged in the output of in vivo selections from an fd-tet library. The founding clone had a complex rearrangement that restored the minus-strand origin while retaining tetracycline resistance. The rearrangement involved two recombination events, one with a contaminant having a wild-type minus-strand origin. The founder's infectivity advantage spread by simple recombination to clones displaying different peptides. We propose measures for minimizing TUP corruption. Copyright © 2010 Elsevier Inc. All rights reserved.
Corruption of phage-display libraries by target-unrelated clones: Diagnosis and countermeasures

PubMed Central

Thomas, William D.; Golomb, Miriam; Smith, George P.

2010-01-01

Phage display is used to discover peptides or proteins with a desired target property—most often, affinity for a target selector molecule. Libraries of phage clones displaying diverse surface peptides are subject to a selection process designed to enrich for the target behavior, and subsequently propagated to restore phage numbers. A recurrent problem is enrichment of clones, called target-unrelated phage (TUPs), that lack the target behavior. Many TUPs are propagation-related; they have mutations conferring a growth advantage, and are enriched during the propagations accompanying selection. Unlike other filamentous phage libraries, fd-tet-based libraries are relatively resistant to propagation-related TUP corruption. Their minus strand origin is disrupted by a large cassette that simultaneously confers resistance to tetracycline and imposes a rate-limiting growth defect that cannot be bypassed with simple mutations. Nonetheless, a new type of propagation-related TUP emerged in the output of in vivo selections from an fd-tet library. The founding clone had a complex rearrangement that restored the minus strand origin while retaining tetracycline resistance. The rearrangement involved two recombination events, one with a contaminant having a wild-type minus strand origin. The founder’s infectivity advantage spread by simple recombination to clones displaying different peptides. We propose measures for minimizing TUP corruption. PMID:20692225
Analysis of an RNA-seq Strand-Specific Library from an East Timorese Cucumber Sample Reveals a Complete Cucurbit aphid-borne yellows virus Genome.

PubMed

Maina, Solomon; Edwards, Owain R; de Almeida, Luis; Ximenes, Abel; Jones, Roger A C

2017-05-11

Analysis of an RNA-seq library from cucumber leaf RNA extracted from a fast technology for analysis of nucleic acids (FTA) card revealed the first complete genome of Cucurbit aphid-borne yellows virus (CABYV) from East Timor. We compare it with 35 complete CABYV genomes from other world regions. It most resembled the genome of the South Korean isolate HD118. Copyright © 2017 Maina et al.
Physical Analysis of the Complex Rye (Secale cereale L.) Alt4 Aluminium (Aluminum) Tolerance Locus Using a Whole-Genome BAC Library of Rye cv. Blanco

USDA-ARS's Scientific Manuscript database

Rye is a diploid crop species with many outstanding qualities, and is also important as a source of new traits for wheat and triticale improvement. Here we describe a BAC library of rye cv. Blanco, representing a valuable resource for rye molecular genetic studies. The library provides a 6 × genome ...
Toward functional genomics in bacteria: Analysis of gene expression in Escherichia coli from a bacterial artificial chromosome library of Bacillus cereus

PubMed Central

Rondon, Michelle R.; Raffel, Sandra J.; Goodman, Robert M.; Handelsman, Jo

1999-01-01

As the study of microbes moves into the era of functional genomics, there is an increasing need for molecular tools for analysis of a wide diversity of microorganisms. Currently, biological study of many prokaryotes of agricultural, medical, and fundamental scientific interest is limited by the lack of adequate genetic tools. We report the application of the bacterial artificial chromosome (BAC) vector to prokaryotic biology as a powerful approach to address this need. We constructed a BAC library in Escherichia coli from genomic DNA of the Gram-positive bacterium Bacillus cereus. This library provides 5.75-fold coverage of the B. cereus genome, with an average insert size of 98 kb. To determine the extent of heterologous expression of B. cereus genes in the library, we screened it for expression of several B. cereus activities in the E. coli host. Clones expressing 6 of 10 activities tested were identified in the library, namely, ampicillin resistance, zwittermicin A resistance, esculin hydrolysis, hemolysis, orange pigment production, and lecithinase activity. We analyzed selected BAC clones genetically to rapidly identify specific B. cereus loci. These results suggest that BAC libraries will provide a powerful approach for studying gene expression from diverse prokaryotes. PMID:10339608
Construction of random sheared fosmid library from Chinese cabbage and its use for Brassica rapa genome sequencing project.

PubMed

Park, Tae-Ho; Park, Beom-Seok; Kim, Jin-A; Hong, Joon Ki; Jin, Mina; Seol, Young-Joo; Mun, Jeong-Hwan

2011-01-01

As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage group R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expected to cover approximately 90% euchromatins of both chromosomes. As the project progresses, BAC selection for sequence extension becomes more limited because BAC libraries are restriction enzyme-specific. To support the project, a random sheared fosmid library was constructed. The library consists of 97536 clones with average insert size of approximately 40 kb corresponding to seven genome equivalents, assuming a Chinese cabbage genome size of 550 Mb. The library was screened with primers designed at the end of sequences of nine points of scaffold gaps where BAC clones cannot be selected to extend the physical contigs. The selected positive clones were end-sequenced to check the overlap between the fosmid clones and the adjacent BAC clones. Nine fosmid clones were selected and fully sequenced. The sequences revealed two completed gap filling and seven sequence extensions, which can be used for further selection of BAC clones confirming that the fosmid library will facilitate the sequence completion of B. rapa. Copyright © 2011. Published by Elsevier Ltd.
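The coverage figure quoted in the fosmid library abstract above can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the second function uses the standard Clarke-Carbon estimate, which the abstract itself does not mention. The inputs (97536 clones, about 40 kb mean insert, an assumed 550 Mb genome) are the values stated in the abstract.

```python
def genome_equivalents(n_clones: int, mean_insert_bp: float, genome_size_bp: float) -> float:
    """Library coverage in genome equivalents: (clones x mean insert) / genome size."""
    return n_clones * mean_insert_bp / genome_size_bp


def prob_locus_present(n_clones: int, mean_insert_bp: float, genome_size_bp: float) -> float:
    """Clarke-Carbon estimate (an assumption here, not from the abstract) of the
    probability that a given locus is in at least one clone: P = 1 - (1 - I/G)^N."""
    return 1.0 - (1.0 - mean_insert_bp / genome_size_bp) ** n_clones


# Values stated in the abstract: 97536 clones, ~40 kb inserts, 550 Mb genome.
coverage = genome_equivalents(97536, 40_000, 550_000_000)
p_present = prob_locus_present(97536, 40_000, 550_000_000)

print(f"coverage: {coverage:.2f} genome equivalents")
print(f"probability a given locus is represented: {p_present:.4f}")
```

With these inputs the coverage comes out at roughly seven genome equivalents, consistent with the abstract. The same helper can be applied to the BAC libraries described elsewhere in this list, provided a genome size is assumed.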
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots.

PubMed

Wu, Min; Kwoh, Chee-Keong; Przytycka, Teresa M; Li, Jing; Zheng, Jie

2012-06-21

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots is identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, thus is a trans-acting regulator of recombination hotspots. However, this pair of cis and trans-regulators covers only a fraction of hotspots, thus other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach on newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots

PubMed Central

2012-01-01

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots is identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, thus is a trans-acting regulator of recombination hotspots. However, this pair of cis and trans-regulators covers only a fraction of hotspots, thus other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach on newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots. PMID:22759569
Development and characterization of 32 microsatellite loci in Genipa americana (Rubiaceae)

PubMed Central

Manoel, Ricardo O.; Freitas, Miguel L. M.; Barreto, Mariana A.; Moraes, Mário L. T.; Souza, Anete P.; Sebbenn, Alexandre M.

2014-01-01

• Premise of the study: Microsatellite primers were developed for the tree species Genipa americana (Rubiaceae) for further population genetic studies. • Methods and Results: We identified 144 clones containing 65 repeat motifs from a genomic library enriched for (CT)8 and (GT)8 motifs. Primer pairs were developed for 32 microsatellite loci and validated in 40 individuals of two natural G. americana populations. Seventeen loci were polymorphic, revealing from three to seven alleles per locus. The observed and expected heterozygosities ranged from 0.24 to 1.00 and from 0.22 to 0.78, respectively. • Conclusions: The 17 primers identified as polymorphic loci are suitable to study the genetic diversity and structure, mating system, and gene flow in G. americana. PMID:25202610
Characterization of 10 new nuclear microsatellite markers in Acca sellowiana (Myrtaceae)

PubMed Central

Klabunde, Gustavo H. F.; Olkoski, Denise; Vilperte, Vinicius; Zucchi, Maria I.; Nodari, Rubens O.

2014-01-01

• Premise of the study: Microsatellite primers were identified and characterized in Acca sellowiana in order to expand the limited number of pre-existing polymorphic markers for use in population genetic studies for conservation, phylogeography, breeding, and domestication. • Methods and Results: A total of 10 polymorphic microsatellite primers were designed from clones obtained from a simple sequence repeat (SSR)–enriched genomic library. The primers amplified di- and trinucleotide repeats with four to 27 alleles per locus. In all tested populations, the observed heterozygosity ranged from 0.269 to 1.0. • Conclusions: These new polymorphic SSR markers will allow future genetic studies to be denser, either for genetic structure characterization of natural populations or for studies involving genetic breeding and domestication process in A. sellowiana. PMID:25202632
Enriching Critical Thinking and Language Learning with Educational Digital Libraries

ERIC Educational Resources Information Center

Lu, Hsin-lin

2012-01-01

As the amount of information available in online digital libraries increases exponentially, questions arise concerning the most productive way to use that information to advance learning. Applying the earlier information seeking theories advocated by Kelly (1963), Taylor (1968), and Belkin (1980) to the digital libraries experience, Carol Kuhlthau…
Enrichment of short interspersed transposable elements to embryonic stem cell-specific hypomethylated gene regions.

PubMed

Muramoto, Hiroki; Yagi, Shintaro; Hirabayashi, Keiji; Sato, Shinya; Ohgane, Jun; Tanaka, Satoshi; Shiota, Kunio

2010-08-01

Embryonic stem cells (ESCs) have a distinctive epigenome, which includes their genome-wide DNA methylation modification status, as represented by the ESC-specific hypomethylation of tissue-dependent and differentially methylated regions (T-DMRs) of Pou5f1 and Nanog. Here, we conducted a genome-wide investigation of sequence characteristics associated with T-DMRs that were differentially methylated between ESCs and somatic cells, by focusing on transposable elements including short interspersed elements (SINEs), long interspersed elements (LINEs) and long terminal repeats (LTRs). We found that hypomethylated T-DMRs were predominantly present in SINE-rich/LINE-poor genomic loci. The enrichment for SINEs spread over 300 kb in cis and there existed SINE-rich genomic domains spreading continuously over 1 Mb, which contained multiple hypomethylated T-DMRs. The characterization of sequence information showed that the enriched SINEs were relatively CpG rich and belonged to specific subfamilies. A subset of the enriched SINEs were hypomethylated T-DMRs in ESCs at Dppa3 gene locus, although SINEs are overall methylated in both ESCs and the liver. In conclusion, we propose that SINE enrichment is the genomic property of regions harboring hypomethylated T-DMRs in ESCs, which is a novel aspect of the ESC-specific epigenomic information.
Development of genomic resources for the narrow-leafed lupin (Lupinus angustifolius): construction of a bacterial artificial chromosome (BAC) library and BAC-end sequencing

PubMed Central

2011-01-01

Background Lupinus angustifolius L, also known as narrow-leafed lupin (NLL), is becoming an important grain legume crop that is valuable for sustainable farming and is becoming recognised as a potential human health food. Recent interest is being directed at NLL to improve grain production, disease and pest management and health benefits of the grain. However, studies have been hindered by a lack of extensive genomic resources for the species. Results A NLL BAC library was constructed consisting of 111,360 clones with an average insert size of 99.7 Kbp from cv Tanjil. The library has approximately 12 × genome coverage. Both ends of 9600 randomly selected BAC clones were sequenced to generate 13985 BAC end-sequences (BESs), covering approximately 1% of the NLL genome. These BESs permitted a preliminary characterisation of the NLL genome such as organisation and composition, with the BESs having approximately 39% G:C content, 16.6% repetitive DNA and 5.4% putative gene-encoding regions. From the BESs 9966 simple sequence repeat (SSR) motifs were identified and some of these are shown to be potential markers. Conclusions The NLL BAC library and BAC-end sequences are powerful resources for genetic and genomic research on lupin. These resources will provide a robust platform for future high-resolution mapping, map-based cloning, comparative genomics and assembly of whole-genome sequencing data for the species. PMID:22014081
A Deep-Coverage Tomato BAC Library and Prospects Toward Development of an STC Framework for Genome Sequencing

PubMed Central

Budiman, Muhammad A.; Mao, Long; Wood, Todd C.; Wing, Rod A.

2000-01-01

Recently a new strategy using BAC end sequences as sequence-tagged connectors (STCs) was proposed for whole-genome sequencing projects. In this study, we present the construction and detailed characterization of a 15.0 haploid genome equivalent BAC library for the cultivated tomato, Lycopersicon esculentum cv. Heinz 1706. The library contains 129,024 clones with an average insert size of 117.5 kb and a chloroplast content of 1.11%. BAC end sequences from 1490 ends were generated and analyzed as a preliminary evaluation for using this library to develop an STC framework to sequence the tomato genome. A total of 1205 BAC end sequences (80.9%) were obtained, with an average length of 360 high-quality bases, and were searched against the GenBank database. Using a cutoff expectation value of <10^-6, and combining the results from BLASTN, BLASTX, and TBLASTX searches, 24.3% of the BAC end sequences were similar to known sequences, of which almost half (48.7%) share sequence similarities to retrotransposons and 7% to known genes. Some of the transposable element sequences were the first reported in tomato, such as sequences similar to maize transposon Activator (Ac) ORF and tobacco pararetrovirus-like sequences. Interestingly, there were no BAC end sequences similar to the highly repeated TGRI and TGRII elements. However, the majority (70.3%) of STCs did not share significant sequence similarities to any sequences in GenBank at either the DNA or predicted protein levels, indicating that a large portion of the tomato genome is still unknown. Our data demonstrate that this BAC library is suitable for developing an STC database to sequence the tomato genome. The advantages of developing an STC framework for whole-genome sequencing of tomato are discussed. [The BAC end sequences described in this paper have been deposited in the GenBank data library under accession nos. AQ367111–AQ368361.] PMID:10645957

Development of genomic SSR markers for fingerprinting lettuce (Lactuca sativa L.) cultivars and mapping genes.

PubMed

Rauscher, Gilda; Simko, Ivan

2013-01-22

Lettuce (Lactuca sativa L.) is the major crop from the group of leafy vegetables. Several types of molecular markers were developed that are effectively used in lettuce breeding and genetic studies. However, only a very limited number of microsatellite-based markers are publicly available. We have employed the method of enriched microsatellite libraries to develop 97 genomic SSR markers. Testing of newly developed markers on a set of 36 Lactuca accessions (33 L. sativa, and one of each L. serriola L., L. saligna L., and L. virosa L.) revealed that both the genetic heterozygosity (UHe = 0.56) and the number of loci per SSR (Na = 5.50) are significantly higher for genomic SSR markers than for previously developed EST-based SSR markers (UHe = 0.32, Na = 3.56). Fifty-four genomic SSR markers were placed on the molecular linkage map of lettuce. Distribution of markers in the genome appeared to be random, with the exception of a possible cluster on linkage group 6. Any combination of 32 genomic SSRs was able to distinguish genotypes of all 36 accessions. Fourteen of the newly developed SSR markers originate from fragments with high sequence similarity to resistance gene candidates (RGCs) and RGC pseudogenes. Analysis of molecular variance (AMOVA) of L. sativa accessions showed that approximately 3% of genetic diversity was within accessions, 79% among accessions, and 18% among horticultural types. The newly developed genomic SSR markers were added to the pool of previously developed EST-SSR markers. These two types of SSR-based markers provide useful tools for lettuce cultivar fingerprinting, development of integrated molecular linkage maps, and mapping of genes.
Development of genomic SSR markers for fingerprinting lettuce (Lactuca sativa L.) cultivars and mapping genes

PubMed Central

2013-01-01

Background Lettuce (Lactuca sativa L.) is the major crop from the group of leafy vegetables. Several types of molecular markers have been developed and are used effectively in lettuce breeding and genetic studies. However, only a very limited number of microsatellite-based markers are publicly available. We have employed the method of enriched microsatellite libraries to develop 97 genomic SSR markers. Results Testing of the newly developed markers on a set of 36 Lactuca accessions (33 L. sativa, and one each of L. serriola L., L. saligna L., and L. virosa L.) revealed that both the genetic heterozygosity (UHe = 0.56) and the number of loci per SSR (Na = 5.50) are significantly higher for genomic SSR markers than for previously developed EST-based SSR markers (UHe = 0.32, Na = 3.56). Fifty-four genomic SSR markers were placed on the molecular linkage map of lettuce. Distribution of markers in the genome appeared to be random, with the exception of a possible cluster on linkage group 6. Any combination of 32 genomic SSRs was able to distinguish genotypes of all 36 accessions. Fourteen of the newly developed SSR markers originate from fragments with high sequence similarity to resistance gene candidates (RGCs) and RGC pseudogenes. Analysis of molecular variance (AMOVA) of L. sativa accessions showed that approximately 3% of genetic diversity was within accessions, 79% among accessions, and 18% among horticultural types. Conclusions The newly developed genomic SSR markers were added to the pool of previously developed EST-SSR markers. These two types of SSR-based markers provide useful tools for lettuce cultivar fingerprinting, development of integrated molecular linkage maps, and mapping of genes. PMID:23339733
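The two summary statistics quoted above (UHe and Na) can be illustrated for a single locus. This is a minimal sketch, assuming that UHe denotes Nei's unbiased expected heterozygosity, n/(n-1) × (1 − Σp²) with n the number of gene copies sampled, and that Na is the count of distinct alleles observed. The abstract does not define either symbol, the function names are hypothetical, and the allele sizes below are made up.

```python
from collections import Counter


def unbiased_expected_heterozygosity(alleles):
    """Nei's unbiased expected heterozygosity for one locus.

    `alleles` lists one allele call per sampled gene copy
    (two entries per diploid individual).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    counts = Counter(alleles)
    expected_het = 1.0 - sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * expected_het


def number_of_alleles(alleles):
    """Number of distinct alleles observed at the locus."""
    return len(set(alleles))


# Hypothetical SSR allele sizes (bp) for four diploid accessions.
calls = [152, 152, 156, 156, 152, 152, 160, 160]
print(unbiased_expected_heterozygosity(calls), number_of_alleles(calls))
```

Averaging these per-locus values over all markers would give figures comparable in kind to the ones reported, under the stated assumptions.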
Evaluation and Design of Genome-Wide CRISPR/SpCas9 Knockout Screens

PubMed Central

Hart, Traver; Tong, Amy Hin Yan; Chan, Katie; Van Leeuwen, Jolanda; Seetharaman, Ashwin; Aregger, Michael; Chandrashekhar, Megha; Hustedt, Nicole; Seth, Sahil; Noonan, Avery; Habsid, Andrea; Sizova, Olga; Nedyalkova, Lyudmila; Climie, Ryan; Tworzyanski, Leanne; Lawson, Keith; Sartori, Maria Augusta; Alibeh, Sabriyeh; Tieu, David; Masud, Sanna; Mero, Patricia; Weiss, Alexander; Brown, Kevin R.; Usaj, Matej; Billmann, Maximilian; Rahman, Mahfuzur; Costanzo, Michael; Myers, Chad L.; Andrews, Brenda J.; Boone, Charles; Durocher, Daniel; Moffat, Jason

2017-01-01

The adaptation of CRISPR/SpCas9 technology to mammalian cell lines is transforming the study of human functional genomics. Pooled libraries of CRISPR guide RNAs (gRNAs) targeting human protein-coding genes and encoded in viral vectors have been used to systematically create gene knockouts in a variety of human cancer and immortalized cell lines, in an effort to identify whether these knockouts cause cellular fitness defects. Previous work has shown that CRISPR screens are more sensitive and specific than pooled-library shRNA screens in similar assays, but currently there exists significant variability across CRISPR library designs and experimental protocols. In this study, we reanalyze 17 genome-scale knockout screens in human cell lines from three research groups, using three different genome-scale gRNA libraries. Using the Bayesian Analysis of Gene Essentiality algorithm to identify essential genes, we refine and expand our previously defined set of human core essential genes from 360 to 684 genes. We use this expanded set of reference core essential genes, CEG2, plus empirical data from six CRISPR knockout screens to guide the design of a sequence-optimized gRNA library, the Toronto KnockOut version 3.0 (TKOv3) library. We then demonstrate the high effectiveness of the library relative to reference sets of essential and nonessential genes, as well as other screens using similar approaches. The optimized TKOv3 library, combined with the CEG2 reference set, provides an efficient, highly optimized platform for performing and assessing gene knockout screens in human cell lines. PMID:28655737
Draft Genome Sequence of a Dictyoglomus sp. from an Enrichment Culture of a New Zealand Geothermal Spring

DOE PAGES

Reysenbach, Anna-Louise; Donaho, John; Kelley, John; ...

2018-03-15

A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism.
Draft Genome Sequence of a Dictyoglomus sp. from an Enrichment Culture of a New Zealand Geothermal Spring

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reysenbach, Anna-Louise; Donaho, John; Kelley, John

A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism.
Computational design of chimeric protein libraries for directed evolution.

PubMed

Silberg, Jonathan J; Nguyen, Peter Q; Stevenson, Taylor

2010-01-01

The best approach for creating libraries of functional proteins with large numbers of nondisruptive amino acid substitutions is protein recombination, in which structurally related polypeptides are swapped among homologous proteins. Unfortunately, as more distantly related proteins are recombined, the fraction of variants having a disrupted structure increases. One way to enrich the fraction of folded and potentially interesting chimeras in these libraries is to use computational algorithms to anticipate which structural elements can be swapped without disturbing the integrity of a protein's structure. Herein, we describe how the algorithm Schema uses the sequences and structures of the parent proteins recombined to predict the structural disruption of chimeras, and we outline how dynamic programming can be used to find libraries with a range of amino acid substitution levels that are enriched in variants with low Schema disruption.
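The disruption score mentioned above can be sketched in a few lines. This assumes the commonly described form of Schema disruption: count the residue-residue contacts whose amino acid pair in the chimera does not occur at the same two positions in any parent. The abstract does not spell the rule out, so treat it as an assumption. The sequences, contact list, and function names are hypothetical, and the dynamic-programming library search is not shown.

```python
def make_chimera(parents, crossovers, block_parents):
    """Build a chimera from aligned parents.

    `crossovers` are the positions where a new block starts;
    `block_parents[k]` is the index of the parent supplying block k.
    """
    length = len(parents[0])
    bounds = [0] + list(crossovers) + [length]
    blocks = [
        parents[p][start:end]
        for p, start, end in zip(block_parents, bounds, bounds[1:])
    ]
    return "".join(blocks)


def schema_disruption(chimera, parents, contacts):
    """Count contacts (i, j) whose residue pair is absent from every parent."""
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if not any((p[i], p[j]) == pair for p in parents):
            broken += 1
    return broken


# Toy aligned parents and a toy contact map (0-based residue indices).
parents = ["ACDE", "FCDG"]
contacts = [(0, 3), (1, 2)]
chimera = make_chimera(parents, crossovers=[2], block_parents=[0, 1])
print(chimera, schema_disruption(chimera, parents, contacts))
```

In this toy case the contact between positions 1 and 2 survives because both parents carry the same pair there, which is why conserved positions never add to the score.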
A new single-nucleotide polymorphism database for rainbow trout generated through whole genome re-sequencing

USDA-ARS's Scientific Manuscript database

Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...
Dominant genetics using a yeast genomic library under the control of a strong inducible promoter.

PubMed

Ramer, S W; Elledge, S J; Davis, R W

1992-12-01

In Saccharomyces cerevisiae, numerous genes have been identified by selection from high-copy-number libraries based on "multicopy suppression" or other phenotypic consequences of overexpression. Although fruitful, this approach suffers from two major drawbacks. First, high copy number alone may not permit high-level expression of tightly regulated genes. Conversely, other genes expressed in proportion to dosage cannot be identified if their products are toxic at elevated levels. This work reports construction of a genomic DNA expression library for S. cerevisiae that circumvents both limitations by fusing randomly sheared genomic DNA to the strong, inducible yeast GAL1 promoter, which can be regulated by carbon source. The library obtained contains 5 × 10^7 independent recombinants, representing a breakpoint at every base in the yeast genome. This library was used to examine aberrant gene expression in S. cerevisiae. A screen for dominant activators of yeast mating response identified eight genes that activate the pathway in the absence of exogenous mating pheromone, including one previously unidentified gene. One activator was a truncated STE11 gene lacking approximately 1000 base pairs of amino-terminal coding sequence. In two different clones, the same GAL1 promoter-proximal ATG is in-frame with the coding sequence of STE11, suggesting that internal initiation of translation there results in production of a biologically active, truncated STE11 protein. Thus this library allows isolation based on dominant phenotypes of genes that might have been difficult or impossible to isolate from high-copy-number libraries.
The Essential Genome of Escherichia coli K-12

PubMed Central

2018-01-01

ABSTRACT Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
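One way to picture the length corrections mentioned above is an insertion index: unique insertion sites per gene divided by gene length, optionally scaled by the genome-wide insertion density. The sketch below is only an illustration of that idea. The abstract does not give the authors' formula, and the function names, gene entries, and the 0.2 cutoff are hypothetical.

```python
def insertion_index(n_sites_in_gene, gene_length_bp):
    """Unique transposon insertion sites per base pair of gene."""
    return n_sites_in_gene / gene_length_bp


def relative_insertion_index(n_sites_in_gene, gene_length_bp,
                             total_sites, genome_length_bp):
    """Gene insertion density relative to the genome-wide density."""
    genome_density = total_sites / genome_length_bp
    return insertion_index(n_sites_in_gene, gene_length_bp) / genome_density


def candidate_essential(genes, total_sites, genome_length_bp, cutoff=0.2):
    """Return names of genes whose relative insertion index falls below `cutoff`.

    `genes` maps gene name -> (insertion sites, gene length in bp).
    The cutoff is an arbitrary placeholder, not a published threshold.
    """
    return sorted(
        name
        for name, (sites, length) in genes.items()
        if relative_insertion_index(sites, length, total_sites, genome_length_bp) < cutoff
    )


# Made-up counts for two hypothetical genes in a 4 Mb genome with 400,000 insertions.
genes = {"geneA": (10, 1000), "geneB": (150, 1200)}
print(candidate_essential(genes, total_sites=400_000, genome_length_bp=4_000_000))
```

A density-based cutoff like this would miss the short essential regions and orientation-dependent effects that the abstract says required manual inspection.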
Optimization of the genotyping-by-sequencing strategy for population genomic analysis in conifers.

PubMed

Pan, Jin; Wang, Baosheng; Pei, Zhi-Yong; Zhao, Wei; Gao, Jie; Mao, Jian-Feng; Wang, Xiao-Ru

2015-07-01

Flexibility and low cost make genotyping-by-sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI-MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference-free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000-11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking. © 2014 John Wiley & Sons Ltd.
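The effect of enzyme choice and size selection on library complexity can be explored with an in silico digest. This is a minimal sketch and not the authors' pipeline. It uses the standard recognition sites and top-strand cut offsets for the four enzymes named above, ignores methylation sensitivity and adapter compatibility, and the function names and size window are hypothetical.

```python
# Recognition site and top-strand cut offset (bases from the start of the site).
SITES = {
    "PstI": ("CTGCAG", 5),   # CTGCA^G
    "HpaII": ("CCGG", 1),    # C^CGG
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "MseI": ("TTAA", 1),     # T^TAA
}


def cut_positions(seq, enzyme):
    """Return all top-strand cut positions for one enzyme."""
    site, offset = SITES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Cut `seq` with every enzyme in `enzymes` and return the fragments in order."""
    seq = seq.upper()
    cuts = sorted({pos for enzyme in enzymes for pos in cut_positions(seq, enzyme)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


fragments = digest("AAACTGCAGTTTTCTGCAGGG", ["PstI"])
print([len(f) for f in fragments], size_select(fragments, 5, 20))
```

Running the same sequence through `["EcoRI", "MseI"]` instead would show how a frequent cutter changes the fragment count.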
Library Resources for Bac End Sequencing. Final Technical Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pieter J. de Jong

2000-10-01

Studies directed towards the specific aims outlined for this research award are summarized. The RPCI II Human BAC Library has been expanded by the addition of 6.9-fold genomic coverage. This segment has been generated from an MboI partial digest of the same anonymous donor DNA used for the rest of the library. A new cloning vector, pTARBAC1, has been constructed and used in the construction of RPCI-II segment 5. This new cloning vector provides a new strategy in identifying targeted genomic regions and will greatly facilitate a large-scale analysis for positional cloning. A new male C57BL/6J mouse BAC library has been constructed. RPCI-23 contains 576 plates (approx. 210,000 clones) and represents approximately 11-fold coverage of the mouse genome.
Partitioning heritability by functional annotation using genome-wide association summary statistics.

PubMed

Finucane, Hilary K; Bulik-Sullivan, Brendan; Gusev, Alexander; Trynka, Gosia; Reshef, Yakir; Loh, Po-Ru; Anttila, Verneri; Xu, Han; Zang, Chongzhi; Farh, Kyle; Ripke, Stephan; Day, Felix R; Purcell, Shaun; Stahl, Eli; Lindstrom, Sara; Perry, John R B; Okada, Yukinori; Raychaudhuri, Soumya; Daly, Mark J; Patterson, Nick; Neale, Benjamin M; Price, Alkes L

2015-11-01

Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.
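The "enrichment" reported above is usually read as a ratio: the share of heritability attributed to a category divided by the share of SNPs in that category. The sketch below shows only that final ratio, under that assumption. It does not implement stratified LD score regression, and the function name and input numbers are hypothetical.

```python
def heritability_enrichment(h2_category, h2_total, snps_category, snps_total):
    """Proportion of heritability in a category divided by its proportion of SNPs."""
    if h2_total <= 0 or snps_total <= 0 or snps_category <= 0:
        raise ValueError("totals and category SNP count must be positive")
    prop_h2 = h2_category / h2_total
    prop_snps = snps_category / snps_total
    return prop_h2 / prop_snps


# Hypothetical inputs: a category holding 2.5% of SNPs and 30% of heritability.
print(heritability_enrichment(0.15, 0.50, 25_000, 1_000_000))
```

A value above 1 means the category explains more heritability than its size alone would predict; a value near 1 means no enrichment.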
Discovery of a Potent BTK Inhibitor with a Novel Binding Mode by Using Parallel Selections with a DNA-Encoded Chemical Library.

PubMed

Cuozzo, John W; Centrella, Paolo A; Gikunju, Diana; Habeshian, Sevan; Hupp, Christopher D; Keefe, Anthony D; Sigel, Eric A; Soutter, Holly H; Thomson, Heather A; Zhang, Ying; Clark, Matthew A

2017-05-04

We have identified and characterized novel potent inhibitors of Bruton's tyrosine kinase (BTK) from a single DNA-encoded library of over 110 million compounds by using multiple parallel selection conditions, including variation in target concentration and addition of known binders to provide competition information. Distinct binding profiles were observed by comparing enrichments of library building block combinations under these conditions; one enriched only at high concentrations of BTK and was competitive with ATP, and another enriched at both high and low concentrations of BTK and was not competitive with ATP. A compound representing the latter profile showed low nanomolar potency in biochemical and cellular BTK assays. Results from kinetic mechanism of action studies were consistent with the selection profiles. Analysis of the co-crystal structure of the most potent compound demonstrated a novel binding mode that revealed a new pocket in BTK. Our results demonstrate that profile-based selection strategies using DNA-encoded libraries form the basis of a new methodology to rapidly identify small molecule inhibitors with novel binding modes to clinically relevant targets. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
SPlinted Ligation Adapter Tagging (SPLAT), a novel library preparation method for whole genome bisulphite sequencing

PubMed Central

Manlig, Erika; Wahlberg, Per

2017-01-01

Abstract Sodium bisulphite treatment of DNA combined with next generation sequencing (NGS) is a powerful combination for the interrogation of genome-wide DNA methylation profiles. Library preparation for whole genome bisulphite sequencing (WGBS) is challenging due to side effects of the bisulphite treatment, which leads to extensive DNA damage. Recently, a new generation of methods for bisulphite sequencing library preparation have been devised. They are based on initial bisulphite treatment of the DNA, followed by adaptor tagging of single stranded DNA fragments, and enable WGBS using low quantities of input DNA. In this study, we present a novel approach for quick and cost effective WGBS library preparation that is based on splinted adaptor tagging (SPLAT) of bisulphite-converted single-stranded DNA. Moreover, we validate SPLAT against three commercially available WGBS library preparation techniques, two of which are based on bisulphite treatment prior to adaptor tagging and one is a conventional WGBS method. PMID:27899585
A new single-nucleotide polymorphisms database for rainbow trout generated through whole genome resequencing of selected samples

USDA-ARS's Scientific Manuscript database

Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...
Discovery of User-Oriented Class Associations for Enriching Library Classification Schemes.

ERIC Educational Resources Information Center

Pu, Hsiao-Tieh

2002-01-01

Presents a user-based approach to exploring the possibility of adding user-oriented class associations to hierarchical library classification schemes. Classes not grouped in the same subject hierarchies yet relevant to users' knowledge are obtained by analyzing a log book of a university library's circulation records, using collaborative filtering…
Libraries Alive: Promoting Libraries and Literature--Practical Applications for the Teacher-Librarian.

ERIC Educational Resources Information Center

Boyd, Suzette

Enthusiasm for and a commitment to literature are the essential tools needed to successfully promote the enriching, challenging, and thought-provoking world of books. This paper focuses on the promotion of literature inside the classroom, inside the library, and in the wider community. Programs at Methodist Ladies' College (Australia), a boarding…
Draft Genome Sequence of a Dictyoglomus sp. from an Enrichment Culture of a New Zealand Geothermal Spring

PubMed Central

Donaho, John A.; Kelley, John F.; St. John, Emily; Turner, Christina; Podar, Mircea; Stott, Matthew B.

2018-01-01

ABSTRACT A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism. PMID:29545298
Sequencing small genomic targets with high efficiency and extreme accuracy

PubMed Central

Schmitt, Michael W.; Fox, Edward J.; Prindle, Marc J.; Reid-Bayliss, Kate S.; True, Lawrence D.; Radich, Jerald P.; Loeb, Lawrence A.

2015-01-01

The detection of minority variants in mixed samples demands methods for enrichment and accurate sequencing of small genomic intervals. We describe an efficient approach based on sequential rounds of hybridization with biotinylated oligonucleotides, enabling more than one-million fold enrichment of genomic regions of interest. In conjunction with error correcting double-stranded molecular tags, our approach enables the quantification of mutations in individual DNA molecules. PMID:25849638
Enrichment dynamics of Listeria monocytogenes and the associated microbiome from naturally contaminated ice cream linked to a listeriosis outbreak.

PubMed

Ottesen, Andrea; Ramachandran, Padmini; Reed, Elizabeth; White, James R; Hasan, Nur; Subramanian, Poorani; Ryan, Gina; Jarvis, Karen; Grim, Christopher; Daquiqan, Ninalynn; Hanes, Darcy; Allard, Marc; Colwell, Rita; Brown, Eric; Chen, Yi

2016-11-16

Microbiota that co-enrich during efforts to recover pathogens from foodborne outbreaks interfere with efficient detection and recovery. Here, dynamics of co-enriching microbiota during recovery of Listeria monocytogenes from naturally contaminated ice cream samples linked to an outbreak are described for three different initial enrichment formulations used by the Food and Drug Administration (FDA), the International Organization for Standardization (ISO), and the United States Department of Agriculture (USDA). Enrichment cultures were analyzed using DNA extraction and sequencing from samples taken every 4 h throughout 48 h of enrichment. Resphera Insight and CosmosID analysis tools were employed for high-resolution profiling of 16S rRNA amplicons and whole genome shotgun data, respectively. During enrichment, other bacterial taxa were identified, including Anoxybacillus, Geobacillus, Serratia, Pseudomonas, Erwinia, and Streptococcus spp. Surprisingly, incidence of L. monocytogenes was proportionally greater at hour 0 than when tested 4, 8, and 12 h later with all three enrichment schemes. The corresponding increase in Anoxybacillus and Geobacillus spp. indicated these taxa co-enriched in competition with L. monocytogenes during early enrichment hours. L. monocytogenes became dominant after 24 h in all three enrichments. DNA sequences obtained from shotgun metagenomic data of Listeria monocytogenes at 48 h were assembled to produce a consensus draft genome which appeared to have a similar tracking utility to pure culture isolates of L. monocytogenes. All three methods performed equally well for enrichment of Listeria monocytogenes. The observation of potential competitive exclusion of L. monocytogenes by Anoxybacillus and Geobacillus in early enrichment hours provided novel information that may be used to further optimize enrichment formulations. Application of Resphera Insight for high-resolution analysis of 16S amplicon sequences accurately identified L. monocytogenes. Both shotgun and 16S rRNA data supported the presence of three slightly variable genomes of L. monocytogenes. Moreover, the draft assembly of a consensus genome of L. monocytogenes from shotgun metagenomic data demonstrated the potential utility of this approach to expedite trace-back of outbreak-associated strains, although further validation will be needed to confirm this utility.

Sequencing and assembly of the 22-gb loblolly pine genome.

PubMed

Zimin, Aleksey; Stevens, Kristian A; Crepeau, Marc W; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L; de Jong, Pieter J; Neale, David B; Salzberg, Steven L; Yorke, James A; Langley, Charles H

2014-03-01

Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer "super-reads," rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp.
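The N50 scaffold size quoted above is a contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of that standard definition follows. The function name is hypothetical and the scaffold lengths are toy values, not data from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that takes the running sum to half the total.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be positive")


# Toy example: total is 20, and 6 + 5 = 11 already covers half of it.
print(n50([2, 3, 4, 5, 6]))
```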
Feasibility of physical map construction from fingerprinted bacterial artificial chromosome libraries of polyploid plant species

PubMed Central

2010-01-01

Background The presence of closely related genomes in polyploid species makes the assembly of total genomic sequence from shotgun sequence reads produced by the current sequencing platforms exceedingly difficult, if not impossible. Genomes of polyploid species could be sequenced following the ordered-clone sequencing approach employing contigs of bacterial artificial chromosome (BAC) clones and BAC-based physical maps. Although BAC contigs can currently be constructed for virtually any diploid organism with the SNaPshot high-information-content-fingerprinting (HICF) technology, it is currently unknown if this is also true for polyploid species. It is possible that BAC clones from orthologous regions of homoeologous chromosomes would share numerous restriction fragments and be therefore included into common contigs. Because of this and other concerns, physical mapping utilizing the SNaPshot HICF of BAC libraries of polyploid species has not been pursued and the possibility of doing so has not been assessed. The sole exception has been in common wheat, an allohexaploid in which it is possible to construct single-chromosome or single-chromosome-arm BAC libraries from DNA of flow-sorted chromosomes and bypass the obstacles created by polyploidy. Results The potential of the SNaPshot HICF technology for physical mapping of polyploid plants utilizing global BAC libraries was evaluated by assembling contigs of fingerprinted clones in an in silico merged BAC library composed of single-chromosome libraries of two wheat homoeologous chromosome arms, 3AS and 3DS, and complete chromosome 3B. Because the chromosome arm origin of each clone was known, it was possible to estimate the fidelity of contig assembly. On average 97.78% or more clones, depending on the library, were from a single chromosome arm. A large portion of the remaining clones was shown to be library contamination from other chromosomes, a feature that is unavoidable during the construction of single-chromosome BAC libraries. Conclusions The negligibly low level of incorporation of clones from homoeologous chromosome arms into a contig during contig assembly suggested that it is feasible to construct contigs and physical maps using global BAC libraries of wheat and almost certainly also of other plant polyploid species with genome sizes comparable to that of wheat. Because of the high purity of the resulting assembled contigs, they can be directly used for genome sequencing. It is currently unknown but possible that equally good BAC contigs can be also constructed for polyploid species containing smaller, more gene-rich genomes. PMID:20170511
BiNChE: a web tool and library for chemical enrichment analysis based on the ChEBI ontology.

PubMed

Moreno, Pablo; Beisken, Stephan; Harsha, Bhavana; Muthukrishnan, Venkatesh; Tudose, Ilinca; Dekker, Adriano; Dornfeldt, Stefanie; Taruttis, Franziska; Grosse, Ivo; Hastings, Janna; Neumann, Steffen; Steinbeck, Christoph

2015-02-21

Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether there is a significant over or under representation of entities for ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, there are only a few that can be used for small molecules enrichment analysis. We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology. BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.
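Over-representation of an ontology class in a study set is often scored with a one-sided hypergeometric test. The sketch below shows that generic test as an assumption about how such an analysis might work. The abstract does not state which statistics BiNChE uses, and the function name and counts are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(population, annotated, sample, hits):
    """P(X >= hits) for a hypergeometric draw.

    population: entities in the background set
    annotated:  background entities annotated to the ontology class
    sample:     entities in the study set
    hits:       study-set entities annotated to the class
    """
    upper = min(sample, annotated)
    denominator = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / denominator


# Hypothetical counts: 40 of 1,000 background molecules carry the class,
# and 6 of the 50 molecules in the study set do.
print(overrepresentation_pvalue(1000, 40, 50, 6))
```

In practice the p-values for all tested classes would also need multiple-testing correction, which is left out here.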
Scaling up the 454 Titanium Library Construction and Pooling of Barcoded Libraries

DOE Office of Scientific and Technical Information (OSTI.GOV)

Phung, Wilson; Hack, Christopher; Shapiro, Harris

2009-03-23

We have been developing a high throughput 454 library construction process at the Joint Genome Institute to meet the needs of de novo sequencing a large number of microbial and eukaryote genomes, EST, and metagenome projects. We have been focusing efforts in three areas: (1) modifying the current process to allow the construction of 454 standard libraries on a 96-well format; (2) developing a robotic platform to perform the 454 library construction; and (3) designing molecular barcodes to allow pooling and sorting of many different samples. In the development of a high throughput process to scale up the number of libraries by adapting the process to a 96-well plate format, the key process change involves the replacement of gel electrophoresis for size selection with Solid Phase Reversible Immobilization (SPRI) beads. Although the standard deviation of the insert sizes increases, the overall sequence quality and the distribution of the reads in the genome have not changed. The manual process of constructing 454 shotgun libraries on 96-well plates is a time-consuming, labor-intensive, and ergonomically hazardous process; we have been experimenting with programming a BioMek robot to perform the library construction. This will not only enable library construction to be completed in a single day, but will also minimize any ergonomic risk. In addition, we have implemented a set of molecular barcodes (AKA Multiple Identifiers or MID) and a pooling process that allows us to sequence many targets simultaneously. Here we will present the testing of pooling a set of selected fosmids derived from the endomycorrhizal fungus Glomus intraradices. By combining the robotic library construction process and the use of molecular barcodes, it is now possible to sequence hundreds of fosmids that represent a minimal tiling path of this genome. Here we present the progress and the challenges of developing these scaled-up processes.
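Sorting pooled reads back to their source samples comes down to matching a barcode at the start of each read. The sketch below is a simplified illustration and not the JGI pipeline. It assumes exact prefix matching with no error tolerance, and the function name, barcode sequences, and sample names are hypothetical rather than real MID sequences.

```python
def demultiplex(reads, barcodes):
    """Assign each read to a sample by exact barcode prefix and trim the barcode.

    `barcodes` maps sample name -> barcode sequence. Reads matching no
    barcode are collected under the key "unassigned".
    """
    bins = {sample: [] for sample in barcodes}
    bins["unassigned"] = []
    for read in reads:
        for sample, barcode in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return bins


# Hypothetical barcodes and reads.
barcodes = {"fosmid_01": "ACGTAC", "fosmid_02": "TGCATG"}
reads = ["ACGTACGGGTTT", "TGCATGAAACCC", "NNNNNNGGGG"]
print(demultiplex(reads, barcodes))
```

Exact matching discards reads with a sequencing error in the barcode. Barcode sets are commonly designed with enough mutual distance that a tolerant matcher could recover such reads, but that is not shown here.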
Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

PubMed Central

Sakurai, Tetsuya; Plata, Germán; Rodríguez-Zapata, Fausto; Seki, Motoaki; Salcedo, Andrés; Toyoda, Atsushi; Ishiwata, Atsushi; Tohme, Joe; Sakaki, Yoshiyuki; Shinozaki, Kazuo; Ishitani, Manabu

2007-01-01

Background Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and post harvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion The cassava full-length cDNA library here presented contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome. PMID:18096061
In Situ Hi-C Library Preparation for Plants to Study Their Three-Dimensional Chromatin Interactions on a Genome-Wide Scale.

PubMed

Liu, Chang

2017-01-01

The spatial organization of the genome in the nucleus is critical for many cellular processes. It has been broadly accepted that the packing of chromatin inside the nucleus is not random, but structured at several hierarchical levels. The Hi-C method combines Chromatin Conformation Capture and high-throughput sequencing, which allows interrogating genome-wide chromatin interactions. Depending on the sequencing depth, chromatin packing patterns derived from Hi-C experiments can be viewed on a chromosomal scale or at a local genic level. Here, I describe a protocol of plant in situ Hi-C library preparation, which covers procedures starting from tissue fixation to library amplification.
Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs.

PubMed

Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M

2017-06-01

The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange events and exploring cellular heterogeneity. Strand-seq is a single-cell sequencing technology that resolves the individual homologs within a cell by restricting sequence analysis to the DNA template strands used during DNA replication. This protocol, which takes up to 4 d to complete, relies on the directionality of DNA, in which each single strand of a DNA molecule is distinguished based on its 5'-3' orientation. Culturing cells in a thymidine analog for one round of cell division labels nascent DNA strands, allowing for their selective removal during genomic library construction. To preserve directionality of template strands, genomic preamplification is bypassed and labeled nascent strands are nicked and not amplified during library preparation. Each single-cell library is multiplexed for pooling and sequencing, and the resulting sequence data are aligned, mapping to either the minus or plus strand of the reference genome, to assign template strand states for each chromosome in the cell. The major adaptations to conventional single-cell sequencing protocols include harvesting of daughter cells after a single round of BrdU incorporation, bypassing of whole-genome amplification, and removal of the BrdU + strand during Strand-seq library preparation. By sequencing just template strands, the structure and identity of each homolog are preserved.
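The final assignment step, in which reads mapping to the plus or minus strand determine a template strand state per chromosome, can be sketched as follows. This is an illustration under stated assumptions, not the protocol's actual analysis code: the `strand_state` function, its labels, and the 0.2 tolerance are all hypothetical choices.

```python
def strand_state(plus_reads, minus_reads, tolerance=0.2):
    """Classify one chromosome from its plus/minus strand read counts.

    tolerance is an assumed cutoff, not a value from the protocol:
    a plus-strand fraction within `tolerance` of 1, 0, or 0.5 is called
    plus_only, minus_only, or mixed; anything else is left ambiguous.
    """
    total = plus_reads + minus_reads
    if total == 0:
        return "no_data"
    frac_plus = plus_reads / total
    if frac_plus >= 1 - tolerance:
        return "plus_only"    # both inherited template strands map to plus
    if frac_plus <= tolerance:
        return "minus_only"   # both inherited template strands map to minus
    if abs(frac_plus - 0.5) <= tolerance:
        return "mixed"        # one template strand of each orientation
    return "ambiguous"


states = {
    "chr1": strand_state(95, 5),
    "chr2": strand_state(4, 96),
    "chr3": strand_state(52, 48),
}
```

A state that changes along a chromosome would be the kind of signal relevant to the inversions and sister chromatid exchange events mentioned above, which a per-chromosome count like this one cannot detect.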
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools.

PubMed

Guizard, Sébastien; Piégu, Benoît; Arensburger, Peter; Guillou, Florian; Bigot, Yves

2016-08-19

The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8-12 %) than the sequenced genomes of many vertebrate species (30-55 %). However, the efficiency of such library-based strategies is dependent on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. These alternative methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31-35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes.
Alignment of the Genomes of Brachypodium distachyon and Temperate Cereals and Grasses Using Bacterial Artificial Chromosome Landing With Fluorescence in Situ Hybridization

PubMed Central

Hasterok, Robert; Marasek, Agnieszka; Donnison, Iain S.; Armstead, Ian; Thomas, Ann; King, Ian P.; Wolny, Elzbieta; Idziak, Dominika; Draper, John; Jenkins, Glyn

2006-01-01

As part of an initiative to develop Brachypodium distachyon as a genomic “bridge” species between rice and the temperate cereals and grasses, a BAC library has been constructed for the two diploid (2n = 2x = 10) genotypes, ABR1 and ABR5. The library consists of 9100 clones, with an approximate average insert size of 88 kb, representing 2.22 genome equivalents. To validate the usefulness of this species for comparative genomics and gene discovery in its larger genome relatives, the library was screened by PCR using primers designed on previously mapped rice and Poaceae sequences. Screening indicated a degree of synteny between these species and B. distachyon, which was confirmed by fluorescent in situ hybridization of the marker-selected BACs (BAC landing) to the 10 chromosome arms of the karyotype, with most of the BACs hybridizing as single loci on known chromosomes. Contiguous BACs colocalized on individual chromosomes, thereby confirming the conservation of genome synteny and proving that B. distachyon has utility as a temperate grass model species alternative to rice. PMID:16489232
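The library statistics in this record are linked by simple arithmetic: genome equivalents = (number of clones × mean insert size) / haploid genome size. The abstract does not state the genome size, so the sketch below back-calculates the size implied by the reported figures. The function names are illustrative, and because the insert size is itself described as approximate, the result is only a rough estimate.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_kb):
    """Fold coverage of the genome represented by the library."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones, mean_insert_kb, equivalents):
    """Genome size consistent with a reported number of genome equivalents."""
    return n_clones * mean_insert_kb / equivalents


# Figures from the abstract: 9100 clones, ~88 kb inserts, 2.22 equivalents.
size_kb = implied_genome_size_kb(9100, 88, 2.22)
print(round(size_kb / 1000, 1))  # implied genome size in Mb: 360.7
```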
Alignment of the genomes of Brachypodium distachyon and temperate cereals and grasses using bacterial artificial chromosome landing with fluorescence in situ hybridization.

PubMed

Hasterok, Robert; Marasek, Agnieszka; Donnison, Iain S; Armstead, Ian; Thomas, Ann; King, Ian P; Wolny, Elzbieta; Idziak, Dominika; Draper, John; Jenkins, Glyn

2006-05-01

As part of an initiative to develop Brachypodium distachyon as a genomic "bridge" species between rice and the temperate cereals and grasses, a BAC library has been constructed for the two diploid (2n = 2x = 10) genotypes, ABR1 and ABR5. The library consists of 9100 clones, with an approximate average insert size of 88 kb, representing 2.22 genome equivalents. To validate the usefulness of this species for comparative genomics and gene discovery in its larger genome relatives, the library was screened by PCR using primers designed on previously mapped rice and Poaceae sequences. Screening indicated a degree of synteny between these species and B. distachyon, which was confirmed by fluorescent in situ hybridization of the marker-selected BACs (BAC landing) to the 10 chromosome arms of the karyotype, with most of the BACs hybridizing as single loci on known chromosomes. Contiguous BACs colocalized on individual chromosomes, thereby confirming the conservation of genome synteny and proving that B. distachyon has utility as a temperate grass model species alternative to rice.
Draft Genome Sequence of a Dictyoglomus sp. from an Enrichment Culture of a New Zealand Geothermal Spring.

PubMed

Reysenbach, Anna-Louise; Donaho, John A; Kelley, John F; St John, Emily; Turner, Christina; Podar, Mircea; Stott, Matthew B

2018-03-15

A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism. Copyright © 2018 Reysenbach et al.
Metadata Harvesting in Regional Digital Libraries in the PIONIER Network

ERIC Educational Resources Information Center

Mazurek, Cezary; Stroinski, Maciej; Werla, Marcin; Weglarz, Jan

2006-01-01

Purpose: The paper aims to present the concept of the functionality of metadata harvesting for regional digital libraries, based on the OAI-PMH protocol. This functionality is a part of regional digital libraries platform created in Poland. The platform was required to reach one of main objectives of the Polish PIONIER Programme--to enrich the…
Genomic sequencing of Pleistocene cave bears

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noonan, James P.; Hofreiter, Michael; Smith, Doug

2005-04-01

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
Library construction for next-generation sequencing: Overviews and challenges

PubMed Central

Head, Steven R.; Komori, H. Kiyomi; LaMere, Sarah A.; Whisenant, Thomas; Van Nieuwerburgh, Filip; Salomon, Daniel R.; Ordoukhanian, Phillip

2014-01-01

High-throughput sequencing, also known as next-generation sequencing (NGS), has revolutionized genomic research. In recent years, NGS technology has steadily improved, with costs dropping and the number and range of sequencing applications increasing exponentially. Here, we examine the critical role of sequencing library quality and consider important challenges when preparing NGS libraries from DNA and RNA sources. Factors such as the quantity and physical characteristics of the RNA or DNA source material as well as the desired application (i.e., genome sequencing, targeted sequencing, RNA-seq, ChIP-seq, RIP-seq, and methylation) are addressed in the context of preparing high quality sequencing libraries. In addition, the current methods for preparing NGS libraries from single cells are also discussed. PMID:24502796
Novel and highly informative Capsicum SSR markers and their cross-species transferability.

PubMed

Buso, G S C; Reis, A M M; Amaral, Z P S; Ferreira, M E

2016-09-23

This study was undertaken primarily to develop new simple sequence repeat (SSR) markers for Capsicum. As part of this project aimed at broadening the use of molecular tools in Capsicum breeding, two genomic libraries enriched for AG/TC repeat sequences were constructed for Capsicum annuum. A total of 475 DNA clones were sequenced from both libraries and 144 SSR markers were tested on cultivated and wild species of Capsicum. Forty-five SSR markers were randomly selected to genotype a panel of 48 accessions of the Capsicum germplasm bank. The number of alleles per locus ranged from 2 to 11, with an average of 6 alleles. The polymorphism information content was on average 0.60, ranging from 0.20 to 0.83. The cross-species transferability to seven cultivated and wild Capsicum species was tested with a set of 91 SSR markers. We found that a high proportion of the loci produced amplicons in all species tested. C. frutescens had the highest number of transferable markers, whereas the wild species had the lowest. Our results indicate that the new markers can be readily used in genetic analyses of Capsicum.
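The polymorphism information content (PIC) reported here is commonly computed from allele frequencies as PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². The abstract does not say which formula the authors used, so the sketch below assumes this standard definition. The `pic` function and the example frequencies are illustrative, not data from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus, which must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    pair_term = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


# Illustrative loci with equally frequent alleles.
two_alleles = pic([0.5, 0.5])
four_alleles = pic([0.25, 0.25, 0.25, 0.25])
```

With equally frequent alleles, PIC rises as the allele count grows, which fits the abstract's pairing of an average of 6 alleles per locus with an average PIC of 0.60.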
RNA-seq reveals distinctive RNA profiles of small extracellular vesicles from different human liver cancer cell lines

PubMed Central

Berardocco, Martina; Radeghieri, Annalisa; Busatto, Sara; Gallorini, Marialucia; Raggi, Chiara; Gissi, Clarissa; D’Agnano, Igea; Bergese, Paolo; Felsani, Armando; Berardi, Anna C.

2017-01-01

Liver cancer (LC) is one of the most common cancers and represents the third highest cause of cancer-related deaths worldwide. Extracellular vesicle (EVs) cargoes, which are selectively enriched in RNA, offer great promise for the diagnosis, prognosis and treatment of LC. Our study analyzed the RNA cargoes of EVs derived from 4 liver-cancer cell lines: HuH7, Hep3B, HepG2 (hepato-cellular carcinoma) and HuH6 (hepatoblastoma), generating two different sets of sequencing libraries for each. One library was size-selected for small RNAs and the other targeted the whole transcriptome. Here are reported genome wide data of the expression level of coding and non-coding transcripts, microRNAs, isomiRs and snoRNAs providing the first comprehensive overview of the extracellular-vesicle RNA cargo released from LC cell lines. The EV-RNA expression profiles of the four liver cancer cell lines share a similar background, but cell-specific features clearly emerge showing the marked heterogeneity of the EV-cargo among the individual cell lines, evident both for the coding and non-coding RNA species. PMID:29137313
Screening of a Brassica napus bacterial artificial chromosome library using highly parallel single nucleotide polymorphism assays

PubMed Central

2013-01-01

Background Efficient screening of bacterial artificial chromosome (BAC) libraries with polymerase chain reaction (PCR)-based markers is feasible provided that a multidimensional pooling strategy is implemented. Single nucleotide polymorphisms (SNPs) can be screened in multiplexed format, therefore this marker type lends itself particularly well for medium- to high-throughput applications. Combining the power of multiplex-PCR assays with a multidimensional pooling system may prove to be especially challenging in a polyploid genome. In polyploid genomes two classes of SNPs need to be distinguished, polymorphisms between accessions (intragenomic SNPs) and those differentiating between homoeologous genomes (intergenomic SNPs). We have assessed whether the highly parallel Illumina GoldenGate® Genotyping Assay is suitable for the screening of a BAC library of the polyploid Brassica napus genome. Results A multidimensional screening platform was developed for a Brassica napus BAC library which is composed of almost 83,000 clones. Intragenomic and intergenomic SNPs were included in Illumina’s GoldenGate® Genotyping Assay and both SNP classes were used successfully for screening of the multidimensional BAC pools of the Brassica napus library. An optimized scoring method is proposed which is especially valuable for SNP calling of intergenomic SNPs. Validation of the genotyping results by independent methods revealed a success of approximately 80% for the multiplex PCR-based screening regardless of whether intra- or intergenomic SNPs were evaluated. Conclusions Illumina’s GoldenGate® Genotyping Assay can be efficiently used for screening of multidimensional Brassica napus BAC pools. SNP calling was specifically tailored for the evaluation of BAC pool screening data. The developed scoring method can be implemented independently of plant reference samples. It is demonstrated that intergenomic SNPs represent a powerful tool for BAC library screening of a polyploid genome. 
PMID:24010766
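The idea behind multidimensional pooling can be sketched briefly. The sketch assumes a three-dimensional plate/row/column layout; the abstract does not specify the pool dimensions actually used, and the function names and clone addresses are hypothetical. Each clone sits in one pool per dimension, so a marker's positive pools intersect at the candidate clone addresses.

```python
from itertools import product


def pool_signals(positive_clones):
    """Pools that would score positive for a set of (plate, row, col) clones."""
    plates = {plate for plate, _, _ in positive_clones}
    rows = {row for _, row, _ in positive_clones}
    cols = {col for _, _, col in positive_clones}
    return plates, rows, cols


def decode(plates, rows, cols):
    """All clone addresses consistent with the positive pools."""
    return sorted(product(sorted(plates), sorted(rows), sorted(cols)))


# One positive clone decodes to a single address.
single = decode(*pool_signals({(3, "C", 7)}))

# Two positive clones yield 2 x 2 x 2 = 8 candidates, so the true pair
# has to be confirmed by testing the candidate clones individually.
double = decode(*pool_signals({(3, "C", 7), (5, "F", 2)}))
```

The ambiguity in the second case grows with the number of true positives per marker, so a marker with several copies in the library (as might be expected in a polyploid genome) would be harder to resolve from pools alone.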
Microsatellite markers for the yam bean Pachyrhizus (Fabaceae).

PubMed

Delêtre, Marc; Soengas, Beatriz; Utge, José; Lambourdière, Josie; Sørensen, Marten

2013-07-01

Microsatellite loci were developed for the understudied root crop yam bean (Pachyrhizus spp.) to investigate intraspecific diversity and interspecific relationships within the genus Pachyrhizus. Seventeen nuclear simple sequence repeat (SSR) markers with perfect di- and trinucleotide repeats were developed from 454 pyrosequencing of SSR-enriched genomic libraries. Loci were characterized in P. ahipa and wild and cultivated populations of four closely related species. All loci successfully cross-amplified and showed high levels of polymorphism, with number of alleles ranging from three to 12 and expected heterozygosity ranging from 0.095 to 0.831 across the genus. By enabling rapid assessment of genetic diversity in three native neotropical crops, P. ahipa, P. erosus, and P. tuberosus, and two wild relatives, P. ferrugineus and P. panamensis, these markers will allow exploration of the genetic diversity and evolutionary history of the genus Pachyrhizus.
Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

PubMed Central

Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.

2012-01-01

High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. 
In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309
Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome

PubMed Central

Lan, Tianying; Renner, Tanya; Ibarra-Laclette, Enrique; Farr, Kimberly M.; Chang, Tien-Hao; Cervantes-Pérez, Sergio Alan; Zheng, Chunfang; Sankoff, David; Tang, Haibao; Purbojati, Rikky W.; Putra, Alexander; Drautz-Moses, Daniela I.; Schuster, Stephan C.; Herrera-Estrella, Luis; Albert, Victor A.

2017-01-01

Utricularia gibba, the humped bladderwort, is a carnivorous plant that retains a tiny nuclear genome despite at least two rounds of whole genome duplication (WGD) since common ancestry with grapevine and other species. We used a third-generation genome assembly with several complete chromosomes to reconstruct the two most recent lineage-specific ancestral genomes that led to the modern U. gibba genome structure. Patterns of subgenome dominance in the most recent WGD, both architectural and transcriptional, are suggestive of allopolyploidization, which may have generated genomic novelty and led to instantaneous speciation. Syntenic duplicates retained in polyploid blocks are enriched for transcription factor functions, whereas gene copies derived from ongoing tandem duplication events are enriched in metabolic functions potentially important for a carnivorous plant. Among these are tandem arrays of cysteine protease genes with trap-specific expression that evolved within a protein family known to be useful in the digestion of animal prey. Further enriched functions among tandem duplicates (also with trap-enhanced expression) include peptide transport (intercellular movement of broken-down prey proteins), ATPase activities (bladder-trap acidification and transmembrane nutrient transport), hydrolase and chitinase activities (breakdown of prey polysaccharides), and cell-wall dynamic components possibly associated with active bladder movements. Whereas independently polyploid Arabidopsis syntenic gene duplicates are similarly enriched for transcriptional regulatory activities, Arabidopsis tandems are distinct from those of U. gibba, while still metabolic and likely reflecting unique adaptations of that species. Taken together, these findings highlight the special importance of tandem duplications in the adaptive landscapes of a carnivorous plant genome. PMID:28507139

Fe-phyllosilicate redox cycling organisms from a redox transition zone in Hanford 300 Area sediments.

PubMed

Benzine, Jason; Shelobolina, Evgenya; Xiong, Mai Yia; Kennedy, David W; McKinley, James P; Lin, Xueju; Roden, Eric E

2013-01-01

Microorganisms capable of reducing or oxidizing structural iron (Fe) in Fe-bearing phyllosilicate minerals were enriched and isolated from a subsurface redox transition zone at the Hanford 300 Area site in eastern Washington, USA. Both conventional and in situ "i-chip" enrichment strategies were employed. One Fe(III)-reducing Geobacter (G. bremensis strain R1, Deltaproteobacteria) and six Fe(II) phyllosilicate-oxidizing isolates from the Alphaproteobacteria (Bradyrhizobium japonicum strains 22, is5, and in8p8), Betaproteobacteria (Cupriavidus necator strain A5-1, Dechloromonas agitata strain is5), and Actinobacteria (Nocardioides sp. strain in31) were recovered. The G. bremensis isolate grew by oxidizing acetate with the oxidized form of NAu-2 smectite as the electron acceptor. The Fe(II)-oxidizers grew by oxidation of chemically reduced smectite as the energy source with nitrate as the electron acceptor. The Bradyrhizobium isolates could also carry out aerobic oxidation of biotite. This is the first report of the recovery of a Fe(II)-oxidizing Nocardioides, and to date only one other Fe(II)-oxidizing Bradyrhizobium is known. The 16S rRNA gene sequences of the isolates were similar to ones found in clone libraries from Hanford 300 sediments and groundwater, suggesting that such organisms may be present and active in situ. Whole genome sequencing of the isolates is underway, the results of which will enable comparative genomic analysis of mechanisms of extracellular phyllosilicate Fe redox metabolism, and facilitate development of techniques to detect the presence and expression of genes associated with microbial phyllosilicate Fe redox cycling in sediments.
A survey of small RNAs in human sperm

PubMed Central

Krawetz, Stephen A.; Kruger, Adele; Lalancette, Claudia; Tagett, Rebecca; Anton, Ester; Draghici, Sorin; Diamond, Michael P.

2011-01-01

BACKGROUND There has been substantial interest in assessing whether RNAs (mRNAs and sncRNAs, i.e. small non-coding) delivered from mammalian spermatozoa play a functional role in early embryo development. While the cadre of spermatozoal mRNAs has been characterized, comparatively little is known about the distribution or function of the estimated 24 000 sncRNAs within each normal human spermatozoon. METHODS RNAs of <200 bases in length were isolated from the ejaculates from three donors of proved fertility. RNAs of 18–30 nucleotides in length were then used to construct small RNA Digital Gene Expression libraries for Next Generation Sequencing. Known sncRNAs that uniquely mapped to a single location in the human genome were identified. RESULTS Bioinformatic analysis revealed the presence of multiple classes of small RNAs in human spermatozoa. The primary classes resolved included microRNAs (miRNAs) (≈7%), Piwi-interacting RNAs (piRNAs) (≈17%), and repeat-associated small RNAs (≈65%). A minor subset of short RNAs within the transcription start site/promoter fraction (≈11%) frames the histone promoter-associated regions enriched in genes of early embryonic development. These have been termed quiescent RNAs. CONCLUSIONS A complex population of male derived sncRNAs that are available for delivery upon fertilization was revealed. Sperm miRNA-targeted enrichment in the human oocyte is consistent with their role as modifiers of early post-fertilization. The relative abundance of piRNAs and repeat-associated RNAs suggests that they may assume a role in confrontation and consolidation. This may ensure the compatibility of the genomes at fertilization. PMID:21989093
Fe-phyllosilicate redox cycling organisms from a redox transition zone in Hanford 300 Area sediments

PubMed Central

Benzine, Jason; Xiong, Mai Yia; Kennedy, David W.; McKinley, James P.; Lin, Xueju; Roden, Eric E.

2013-01-01

Microorganisms capable of reducing or oxidizing structural iron (Fe) in Fe-bearing phyllosilicate minerals were enriched and isolated from a subsurface redox transition zone at the Hanford 300 Area site in eastern Washington, USA. Both conventional and in situ “i-chip” enrichment strategies were employed. One Fe(III)-reducing Geobacter (G. bremensis strain R1, Deltaproteobacteria) and six Fe(II) phyllosilicate-oxidizing isolates from the Alphaproteobacteria (Bradyrhizobium japonicum strains 22, is5, and in8p8), Betaproteobacteria (Cupriavidus necator strain A5-1, Dechloromonas agitata strain is5), and Actinobacteria (Nocardioides sp. strain in31) were recovered. The G. bremensis isolate grew by oxidizing acetate with the oxidized form of NAu-2 smectite as the electron acceptor. The Fe(II)-oxidizers grew by oxidation of chemically reduced smectite as the energy source with nitrate as the electron acceptor. The Bradyrhizobium isolates could also carry out aerobic oxidation of biotite. This is the first report of the recovery of a Fe(II)-oxidizing Nocardioides, and to date only one other Fe(II)-oxidizing Bradyrhizobium is known. The 16S rRNA gene sequences of the isolates were similar to ones found in clone libraries from Hanford 300 sediments and groundwater, suggesting that such organisms may be present and active in situ. Whole genome sequencing of the isolates is underway, the results of which will enable comparative genomic analysis of mechanisms of extracellular phyllosilicate Fe redox metabolism, and facilitate development of techniques to detect the presence and expression of genes associated with microbial phyllosilicate Fe redox cycling in sediments. PMID:24379809
Lessons from Library Power: Enriching Teaching and Learning. Final Report of the Evaluation of the National Library Power Initiative, an Initiative of the DeWitt Wallace-Reader's Digest Fund.

ERIC Educational Resources Information Center

Zweizig, Douglas L.; Hopkins, Dianne McAfee

This book presents the results of an evaluation of Library Power, an initiative of the DeWitt Wallace-Reader's Digest Fund that provided support for school library development in 19 communities. Following an introductory chapter, the chapters are organized around key questions of the evaluation. Chapters 2 through 4 address the implementation of…
A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

PubMed Central

Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

2009-01-01

Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386
Estimating P-coverage of biosynthetic pathways in DNA libraries and screening by genetic selection: biotin biosynthesis in the marine microorganism Chromohalobacter.

PubMed

Kim, Eun Jin; Angell, Scott; Janes, Jeff; Watanabe, Coran M H

2008-06-01

Traditional approaches to natural product discovery involve cell-based screening of natural product extracts followed by compound isolation and characterization. Their importance notwithstanding, continued mining leads to depletion of natural resources and the reisolation of previously identified metabolites. Metagenomic strategies aimed at localizing the biosynthetic cluster genes and expressing them in surrogate hosts offer one possible alternative. A fundamental question that naturally arises when pursuing such a strategy is, how large must the genomic library be to effectively represent the genome of an organism(s) and the biosynthetic gene clusters they harbor? Such an issue is certainly augmented in the absence of expensive robotics to expedite colony picking and/or screening of clones. We have developed an algorithm, named BPC (biosynthetic pathway coverage), supported by molecular simulations to deduce the number of BAC clones required to achieve proper coverage of the genome and their respective biosynthetic pathways. The strategy has been applied to the construction of a large-insert BAC library from a marine microorganism, Hon6 (isolated from Honokohau, Maui) thought to represent a new species. The genomic library is constructed with a BAC yeast shuttle vector pClasper lacZ paving the way for the culturing of libraries in both prokaryotic and eukaryotic hosts. Flow cytometric methods are utilized to estimate the genome size of the organism and BPC implemented to assess P-coverage or percent coverage. A genetic selection strategy is illustrated, applications of which could expedite screening efforts in the identification and localization of biosynthetic pathways from marine microbial consortia, offering a powerful complement to genome sequencing and degenerate probe strategies. Implementing this approach, we report on the biotin biosynthetic pathway from the marine microorganism Hon6.
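For orientation on the library-size question raised in this record, the classical Clarke-Carbon estimate gives the number of clones N needed for a chosen probability P that any given sequence is present: N = ln(1 − P) / ln(1 − f), where f is the insert size divided by the genome size. The sketch below implements that textbook formula only. It is not the BPC algorithm, whose details the abstract does not describe, and the insert and genome sizes are hypothetical values rather than figures reported for Hon6.

```python
import math


def clones_needed(insert_kb, genome_kb, probability):
    """Clarke-Carbon estimate of library size.

    Returns the smallest whole number of clones N such that a given
    sequence is present in the library with at least `probability`,
    assuming inserts are sampled uniformly at random from the genome.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical example: 100 kb inserts, 4000 kb genome, 99% probability.
n = clones_needed(100, 4000, 0.99)
```

This estimate treats the target as a single point in the genome. A biosynthetic cluster spanning many kilobases must fit inside one insert to be captured intact, so it would need more clones than this suggests.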
The development and characterisation of a bacterial artificial chromosome library for Fragaria vesca

PubMed Central

Bonet, Julio; Girona, Elena Lopez; Sargent, Daniel J; Muñoz-Torres, Monica C; Monfort, Amparo; Abbott, Albert G; Arús, Pere; Simpson, David W; Davik, Jahn

2009-01-01

Background The cultivated strawberry Fragaria ×ananassa is one of the most economically-important soft-fruit species. Few structural genomic resources have been reported for Fragaria and there exists an urgent need for the development of physical mapping resources for the genus. The first stage in the development of a physical map for Fragaria is the construction and characterisation of a high molecular weight bacterial artificial chromosome (BAC) library. Methods A BAC library, consisting of 18,432 clones was constructed from Fragaria vesca f. semperflorens accession 'Ali Baba'. BAC DNA from individual library clones was pooled to create a PCR-based screening assay for the library, whereby individual clones could be identified with just 34 PCR reactions. These pools were used to screen the BAC library and anchor individual clones to the diploid Fragaria reference map (FV×FN). Findings Clones from the BAC library developed contained an average insert size of 85 kb, representing over seven genome equivalents. The pools and superpools developed were used to identify a set of BAC clones containing 70 molecular markers previously mapped to the diploid Fragaria FV×FN reference map. The number of positive colonies identified for each marker suggests the library represents between 4× and 10× coverage of the diploid Fragaria genome, which is in accordance with the estimate of library coverage based on average insert size. Conclusion This BAC library will be used for the construction of a physical map for F. vesca and the superpools will permit physical anchoring of molecular markers using PCR. PMID:19772672
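The insert-size coverage estimate mentioned in this abstract is simple arithmetic: genome equivalents = (number of clones × average insert size) / genome size. The sketch below repeats that calculation with the clone count and insert size from the abstract. The genome size is not given there, so the 200 Mb value is a placeholder assumption used only for illustration.

```python
def total_insert_bp(n_clones, mean_insert_bp):
    """Total cloned DNA in the library, in base pairs."""
    return n_clones * mean_insert_bp


def genome_equivalents(n_clones, mean_insert_bp, genome_size_bp):
    """Library coverage expressed as multiples of the genome size."""
    return total_insert_bp(n_clones, mean_insert_bp) / genome_size_bp


if __name__ == "__main__":
    clones = 18_432          # from the abstract
    insert_bp = 85_000       # from the abstract (85 kb average insert)
    assumed_genome = 200_000_000  # ASSUMPTION: placeholder genome size

    print(total_insert_bp(clones, insert_bp))
    print(genome_equivalents(clones, insert_bp, assumed_genome))
```

Read the other way round, "over seven genome equivalents" from about 1.57 Gb of cloned DNA implies a genome smaller than roughly 224 Mb.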
Candidate genes for obesity-susceptibility show enriched association within a large genome-wide association study for BMI.

PubMed

Vimaleswaran, Karani S; Tachmazidou, Ioanna; Zhao, Jing Hua; Hirschhorn, Joel N; Dudbridge, Frank; Loos, Ruth J F

2012-10-15

Before the advent of genome-wide association studies (GWASs), hundreds of candidate genes for obesity-susceptibility had been identified through a variety of approaches. We examined whether those obesity candidate genes are enriched for associations with body mass index (BMI) compared with non-candidate genes by using data from a large-scale GWAS. A thorough literature search identified 547 candidate genes for obesity-susceptibility based on evidence from animal studies, Mendelian syndromes, linkage studies, genetic association studies and expression studies. Genomic regions were defined to include the genes ±10 kb of flanking sequence around candidate and non-candidate genes. We used summary statistics publicly available from the discovery stage of the genome-wide meta-analysis for BMI performed by the Genetic Investigation of Anthropometric Traits (GIANT) consortium in 123,564 individuals. Hypergeometric, rank tail-strength and gene-set enrichment analysis tests were used to test for the enrichment of association in candidate compared with non-candidate genes. The hypergeometric test of enrichment was not significant at the 5% P-value quantile (P = 0.35), but was nominally significant at the 25% quantile (P = 0.015). The rank tail-strength and gene-set enrichment tests were nominally significant for the full set of genes and borderline significant for the subset without SNPs at P < 10⁻⁷. Taken together, the observed evidence for enrichment suggests that the candidate gene approach retains some value. However, the degree of enrichment is small despite the extensive number of candidate genes and the large sample size. Studies that focus on candidate genes have only slightly increased chances of detecting associations, and are likely to miss many true effects in non-candidate genes, at least for obesity-related traits.
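A hypergeometric enrichment test of the kind named in this abstract asks: if a fixed number of genes fall in the top P-value quantile, how likely is it that at least k of them are candidate genes by chance? The sketch below is a generic one-sided version written with the standard library. Only the 547 candidate genes come from the abstract; the total gene count and the observed overlap in the example are hypothetical, and the authors' exact procedure may differ.

```python
from math import comb


def enrichment_p_value(k_observed, n_genes, n_candidates, n_top):
    """One-sided hypergeometric tail P(X >= k_observed), where X is the
    number of candidate genes among n_top genes drawn without
    replacement from n_genes genes containing n_candidates candidates."""
    upper = min(n_candidates, n_top)
    tail = sum(
        comb(n_candidates, k) * comb(n_genes - n_candidates, n_top - k)
        for k in range(k_observed, upper + 1)
    )
    return tail / comb(n_genes, n_top)


if __name__ == "__main__":
    # HYPOTHETICAL numbers: 20,000 genes in total, top 5% = 1,000 genes,
    # of which 40 are among the 547 candidates.
    print(enrichment_p_value(40, 20_000, 547, 1_000))
```

Exact integer binomials avoid floating-point underflow; for routine use a statistics package would be the more convenient choice.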
Characterization of Three Maize Bacterial Artificial Chromosome Libraries toward Anchoring of the Physical Map to the Genetic Map Using High-Density Bacterial Artificial Chromosome Filter Hybridization

PubMed Central

Yim, Young-Sun; Davis, Georgia L.; Duru, Ngozi A.; Musket, Theresa A.; Linton, Eric W.; Messing, Joachim W.; McMullen, Michael D.; Soderlund, Carol A.; Polacco, Mary L.; Gardiner, Jack M.; Coe, Edward H.

2002-01-01

Three maize (Zea mays) bacterial artificial chromosome (BAC) libraries were constructed from inbred line B73. High-density filter sets from all three libraries, made using different restriction enzymes (HindIII, EcoRI, and MboI, respectively), were evaluated with a set of complex probes including the 185-bp knob repeat, ribosomal DNA, two telomere-associated repeat sequences, four centromere repeats, the mitochondrial genome, a multifragment chloroplast DNA probe, and bacteriophage λ. The results indicate that the libraries are of high quality with low contamination by organellar and λ-sequences. The use of libraries from multiple enzymes increased the chance of recovering each region of the genome. Ninety maize restriction fragment-length polymorphism core markers were hybridized to filters of the HindIII library, representing 6× coverage of the genome, to initiate development of a framework for anchoring BAC contigs to the intermated B73 × Mo17 genetic map and to mark the bin boundaries on the physical map. All of the clones used as hybridization probes detected at least three BACs. Twenty-two single-copy number core markers identified an average of 7.4 ± 3.3 positive clones, consistent with the expectation of six clones. This information is integrated into fingerprinting data generated by the Arizona Genomics Institute to assemble the BAC contigs using fingerprint contig and contributed to the process of physical map construction. PMID:12481051
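The "expectation of six clones" per single-copy marker corresponds to the 6× coverage of the HindIII library. Under the idealized assumption (not stated in the abstract) that clones sample the genome uniformly at random, the number of clones covering a locus is approximately Poisson-distributed with mean equal to the coverage. The sketch below uses that assumption to estimate how often a locus would be hit by at least three clones, or missed entirely.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k covering clones for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_at_least(k, mean):
    """Probability that a locus is covered by k or more clones."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


if __name__ == "__main__":
    coverage = 6.0  # 6x coverage, from the abstract
    print(poisson_pmf(0, coverage))    # chance a locus is missed
    print(prob_at_least(3, coverage))  # chance of >= 3 positive clones
    print(math.sqrt(coverage))         # Poisson standard deviation
```

Real libraries deviate from this model through cloning bias and repeat content, so observed spreads such as the reported ± 3.3 can exceed the Poisson value.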
A Rapid Method of Genomic Array Analysis of Scaffold/Matrix Attachment Regions (S/MARs) Identifies a 2.5-Mb Region of Enhanced Scaffold/Matrix Attachment at a Human Neocentromere

PubMed Central

Sumer, Huseyin; Craig, Jeffrey M.; Sibson, Mandy; Choo, K.H. Andy

2003-01-01

Human neocentromeres are fully functional centromeres that arise at previously noncentromeric regions of the genome. We have tested a rapid procedure of genomic array analysis of chromosome scaffold/matrix attachment regions (S/MARs), involving the isolation of S/MAR DNA and hybridization of this DNA to a genomic BAC/PAC array. Using this procedure, we have defined a 2.5-Mb domain of S/MAR-enriched chromatin that fully encompasses a previously mapped centromere protein-A (CENP-A)-associated domain at a human neocentromere. We have independently verified this procedure using a previously established fluorescence in situ hybridization method on salt-treated metaphase chromosomes. In silico sequence analysis of the S/MAR-enriched and surrounding regions has revealed no outstanding sequence-related predisposition. This study defines the S/MAR-enriched domain of a higher eukaryotic centromere and provides a method that has broad application for the mapping of S/MAR attachment sites over large genomic regions or throughout a genome. PMID:12840048
Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for Illumina genome sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Angelova, Angelina; Park, Sang-Hycuk; Kyndt, John

2013-09-01

With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint to chloroplast transformation and for studying gene expression in TAGs pathways. In this study, the intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (about 84.6 kb) in size, with a GC content of 30.8%. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.
CORALINA: a universal method for the generation of gRNA libraries for CRISPR-based screening.

PubMed

Köferle, Anna; Worf, Karolina; Breunig, Christopher; Baumann, Valentin; Herrero, Javier; Wiesbeck, Maximilian; Hutter, Lukas H; Götz, Magdalena; Fuchs, Christiane; Beck, Stephan; Stricker, Stefan H

2016-11-14

The bacterial CRISPR system is fast becoming the most popular genetic and epigenetic engineering tool due to its universal applicability and adaptability. The desire to deploy CRISPR-based methods in a large variety of species and contexts has created an urgent need for the development of easy, time- and cost-effective methods enabling large-scale screening approaches. Here we describe CORALINA (comprehensive gRNA library generation through controlled nuclease activity), a method for the generation of comprehensive gRNA libraries for CRISPR-based screens. CORALINA gRNA libraries can be derived from any source of DNA without the need of complex oligonucleotide synthesis. We show the utility of CORALINA for human and mouse genomic DNA, its reproducibility in covering the most relevant genomic features including regulatory, coding and non-coding sequences and confirm the functionality of CORALINA generated gRNAs. The simplicity and cost-effectiveness make CORALINA suitable for any experimental system. The unprecedented sequence complexities obtainable with CORALINA libraries are a necessary pre-requisite for less biased large scale genomic and epigenomic screens.
Citations as Data: Harvesting the Scholarly Record of Your University to Enrich Institutional Knowledge and Support Research

ERIC Educational Resources Information Center

Sterman, Leila Belle; Clark, Jason A.

2017-01-01

Many research libraries are looking for new ways to demonstrate value for their parent institutions. Metrics, assessment, and promotion of research continue to grow in importance, but they have not always fallen into the scope of services for the research library. Montana State University (MSU) Library recognized a need and interest to quantify…
A Century of Change: The Evolution of School Library Resources, 1915-2015

ERIC Educational Resources Information Center

Lamb, Annette

2015-01-01

School libraries have been in existence since at least the eighth century. However, it wasn't until the twentieth century that the school library was seen primarily as "a source of enrichment for the curriculum, and a means of developing reading and study habits in the pupils" (Clyde 1981, 263). While the formats available and tools for…
Enriching screening libraries with bioactive fragment space.

PubMed

Zhang, Na; Zhao, Hongtao

2016-08-01

By deconvoluting 238,073 bioactive molecules in the ChEMBL library into extended Murcko ring systems, we identified a set of 2245 ring systems present in at least 10 molecules. These ring systems belong to 2221 clusters by ECFP4 fingerprints with a minimum intracluster similarity of 0.8. Their overlap with ring systems in commercial libraries was further quantified. Our findings suggest that success of a small fragment library is driven by the convergence of effective coverage of bioactive ring systems (e.g., 10% coverage by 1000 fragments vs. 40% by 2 million HTS compounds), high enrichment of bioactive ring systems, and low molecular complexity enhancing the probability of a match with the protein targets. Reconciling with the previous studies, bioactive ring systems are underrepresented in screening libraries. As such, we propose a library of virtual fragments with key functionalities via fragmentation of bioactive molecules. Its utility is exemplified by a prospective application on protein kinase CK2, resulting in the discovery of a series of novel inhibitors with the most potent compound having an IC50 of 0.5 μM and a ligand efficiency of 0.41 kcal/mol per heavy atom.
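Ligand efficiency is commonly defined as binding free energy per heavy atom, LE = -RT ln(Kd) / N_heavy, with IC50 often substituted for Kd as an approximation. The abstract does not say which variant the authors used or how many heavy atoms the compound has, so the sketch below is only a consistency check under those assumptions: at IC50 = 0.5 μM, a hypothetical compound with 21 heavy atoms gives a value close to the reported 0.41 kcal/mol per heavy atom.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def ligand_efficiency(ic50_molar, heavy_atoms, temperature_k=298.15):
    """Approximate ligand efficiency in kcal/mol per heavy atom,
    using IC50 as a stand-in for the dissociation constant."""
    free_energy = GAS_CONSTANT_KCAL * temperature_k * math.log(ic50_molar)
    return -free_energy / heavy_atoms


if __name__ == "__main__":
    # IC50 from the abstract; 21 heavy atoms is an ASSUMED count.
    print(round(ligand_efficiency(0.5e-6, 21), 3))
```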
CRISPR/Cas9-mediated gene knockout screens and target identification via whole-genome sequencing uncover host genes required for picornavirus infection.

PubMed

Kim, Heon Seok; Lee, Kyungjin; Bae, Sangsu; Park, Jeongbin; Lee, Chong-Kyo; Kim, Meehyein; Kim, Eunji; Kim, Minju; Kim, Seokjoong; Kim, Chonsaeng; Kim, Jin-Soo

2017-06-23

Several groups have used genome-wide libraries of lentiviruses encoding small guide RNAs (sgRNAs) for genetic screens. In most cases, sgRNA expression cassettes are integrated into cells by using lentiviruses, and target genes are statistically estimated by the readout of sgRNA sequences after targeted sequencing. We present a new virus-free method for human gene knockout screens using a genome-wide library of CRISPR/Cas9 sgRNAs based on plasmids and target gene identification via whole-genome sequencing (WGS) confirmation of authentic mutations rather than statistical estimation through targeted amplicon sequencing. We used 30,840 pairs of individually synthesized oligonucleotides to construct the genome-scale sgRNA library, collectively targeting 10,280 human genes (i.e., three sgRNAs per gene). These plasmid libraries were co-transfected with a Cas9-expression plasmid into human cells, which were then treated with cytotoxic drugs or viruses. Only cells lacking key factors essential for cytotoxic drug metabolism or viral infection were able to survive. Genomic DNA isolated from cells that survived these challenges was subjected to WGS to directly identify CRISPR/Cas9-mediated causal mutations essential for cell survival. With this approach, we were able to identify known and novel genes essential for viral infection in human cells. We propose that genome-wide sgRNA screens based on plasmids coupled with WGS are powerful tools for forward genetics studies and drug target discovery.
Genome-wide digital transcript analysis of putative fruitlet abscission related genes regulated by ethephon in litchi

PubMed Central

Li, Caiqin; Wang, Yan; Ying, Peiyuan; Ma, Wuqiang; Li, Jianguo

2015-01-01

The high level of physiological fruitlet abscission in litchi (Litchi chinensis Sonn.) causes severe yield loss. Cell separation occurs at the fruit abscission zone (FAZ) and can be triggered by ethylene. However, detailed knowledge of the molecular events occurring in the FAZ is still lacking. Here, genome-wide digital transcript abundance (DTA) analysis of putative fruit abscission related genes regulated by ethephon in litchi was performed. More than 81 million high quality reads from seven ethephon treated and untreated control libraries were obtained by high-throughput sequencing. Through DTA profile analysis in combination with Gene Ontology and KEGG pathway enrichment analyses, a total of 2730 statistically significant candidate genes were found to be involved in the ethephon-promoted litchi fruitlet abscission. Of these, there were 1867 early-responsive genes whose expressions were up- or down-regulated from 0 to 1 d after treatment. The most affected genes included those related to ethylene biosynthesis and signaling, auxin transport and signaling, transcription factors (TFs), protein ubiquitination, ROS response, calcium signal transduction, and cell wall modification. These genes could be clustered into four groups and 13 subgroups according to their similar expression patterns. qRT-PCR of 41 selected candidate genes confirmed their expression patterns, supporting the accuracy of our DTA data. Ethephon treatment significantly increased fruit abscission and ethylene production of fruitlet. The possible molecular events controlling the ethephon-promoted litchi fruitlet abscission are proposed. The increased ethylene evolution in fruitlet would suppress the synthesis and polar transport of auxin and trigger abscission signaling. To the best of our knowledge, this is the first genome-wide monitoring of the gene expression profile in the FAZ-enriched pedicel during litchi fruit abscission induced by ethephon. This study will contribute to a better understanding of the molecular regulatory mechanism of fruit abscission in litchi. PMID:26217356
The Library Work Order Processing System: A New Approach to Motivate Employees and to Increase Production in the Technical Service Department of Mercer County Community College Library. Applied Educational Research and Evaluation.

ERIC Educational Resources Information Center

Sim, Yong Sup

After reviewing the current movement toward job enrichment, a system was designed for the technical services department of the Mercer County Community College Library. The Library Work Order Processing System, as tried between January and March, 1974, was designed to permit each worker more variety of jobs. The technical services department was…
Selection of stable scFv antibodies by phage display.

PubMed

Brockmann, Eeva-Christine

2012-01-01

ScFv fragments are popular recombinant antibody formats but often suffer from limited stability. Phage display is a powerful tool in antibody engineering and applicable also for stability selection. ScFv variants with improved stability can be selected from large randomly mutated phage displayed libraries with a specific antigen after the unstable variants have been inactivated by heat or GdmCl. Irreversible scFv denaturation, which is a prerequisite for efficient selection, is achieved by combining denaturation with reduction of the intradomain disulfide bonds. Repeated selection cycles of increasing stringency result in enrichment of stabilized scFv fragments. Procedures for constructing a randomly mutated scFv library by error-prone PCR and phage display selection for enrichment of stable scFv antibodies from the library are described here.
Community analysis of hydrogen-producing extreme thermophilic anaerobic microflora enriched from cow manure with five substrates.

PubMed

Yokoyama, Hiroshi; Moriya, Naoko; Ohmori, Hideyuki; Waki, Miyoko; Ogino, Akifumi; Tanaka, Yasuo

2007-11-01

The present study analyzed the community structures of anaerobic microflora producing hydrogen under extreme thermophilic conditions by two culture-independent methods: denaturing gradient gel electrophoresis (DGGE) and clone library analyses. Extreme thermophilic microflora (ETM) was enriched from cow manure by repeated batch cultures at 75 °C, using a substrate of xylose, glucose, lactose, cellobiose, or soluble starch, and produced hydrogen at yields of 0.56, 2.65, 2.17, 2.68, and 1.73 mol/mol-monosaccharide degraded, respectively. The results from the DGGE and clone library analyses were consistent and demonstrated that the community structures of ETM enriched with the four hexose-based substrates (glucose, lactose, cellobiose, and soluble starch) consisted of a single species, closely related to a hydrogen-producing extreme thermophile, Caldoanaerobacter subterraneus, with diversity at subspecies levels. The ETM enriched with xylose was more diverse than those enriched with the other substrates, and contained the bacterium related to C. subterraneus and an unclassified bacterium, distantly related to a xylan-degrading and hydrogen-producing extreme thermophile, Caloramator fervidus.

A new age in functional genomics using CRISPR/Cas9 in arrayed library screening.

PubMed

Agrotis, Alexander; Ketteler, Robin

2015-01-01

CRISPR technology has rapidly changed the face of biological research, such that precise genome editing has now become routine for many labs within several years of its initial development. What makes CRISPR/Cas9 so revolutionary is the ability to target a protein (Cas9) to an exact genomic locus, through designing a specific short complementary nucleotide sequence, that together with a common scaffold sequence, constitute the guide RNA bridging the protein and the DNA. Wild-type Cas9 cleaves both DNA strands at its target sequence, but this protein can also be modified to exert many other functions. For instance, by attaching an activation domain to catalytically inactive Cas9 and targeting a promoter region, it is possible to stimulate the expression of a specific endogenous gene. In principle, any genomic region can be targeted, and recent efforts have successfully generated pooled guide RNA libraries for coding and regulatory regions of human, mouse and Drosophila genomes with high coverage, thus facilitating functional phenotypic screening. In this review, we will highlight recent developments in the area of CRISPR-based functional genomics and discuss potential future directions, with a special focus on mammalian cell systems and arrayed library screening.
Near-Complete Genome Sequence of Thalassospira sp. Strain KO164 Isolated from a Lignin-Enriched Marine Sediment Microcosm

PubMed Central

Woo, Hannah L.; O’Dell, Kaela B.; Utturkar, Sagar; McBride, Kathryn R.; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T. B. K.; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Woyke, Tanja; Brown, Steven D.

2016-01-01

Thalassospira sp. strain KO164 was isolated from eastern Mediterranean seawater and sediment laboratory microcosms enriched on insoluble organosolv lignin under oxic conditions. The near-complete genome sequence presented here will facilitate analyses into this deep-ocean bacterium’s ability to degrade recalcitrant organics such as lignin. PMID:27881538
Draft Genome Sequence of a Novel Thermofilum sp. Strain from a New Zealand Hot Spring Enrichment Culture

DOE PAGES

Reysenbach, Anna-Louise; Donaho, John; Hinsch, Todd; ...

2018-02-22

A draft genome of a new Thermofilum sp. strain was obtained from an enrichment culture metagenome. Like its relatives, Thermofilum sp. strain NZ13 is adapted to organic-rich thermal environments and has to depend on other organisms and the environment for some key amino acids, purines, and cofactors.
Near-Complete Genome Sequence of Thalassospira sp. Strain KO164 Isolated from a Lignin-Enriched Marine Sediment Microcosm

DOE PAGES

Woo, Hannah L.; O’Dell, Kaela B.; Utturkar, Sagar; ...

2016-11-23

We isolated Thalassospira sp. strain KO164 from eastern Mediterranean seawater and sediment laboratory microcosms enriched on insoluble organosolv lignin under oxic conditions. The near-complete genome sequence presented here will facilitate analyses of this deep-ocean bacterium’s ability to degrade recalcitrant organics such as lignin.
Near-Complete Genome Sequence of Thalassospira sp. Strain KO164 Isolated from a Lignin-Enriched Marine Sediment Microcosm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Woo, Hannah L.; O’Dell, Kaela B.; Utturkar, Sagar

We isolated Thalassospira sp. strain KO164 from eastern Mediterranean seawater and sediment laboratory microcosms enriched on insoluble organosolv lignin under oxic conditions. The near-complete genome sequence presented here will facilitate analyses of this deep-ocean bacterium’s ability to degrade recalcitrant organics such as lignin.
Draft Genome Sequence of a Novel Thermofilum sp. Strain from a New Zealand Hot Spring Enrichment Culture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reysenbach, Anna-Louise; Donaho, John; Hinsch, Todd

A draft genome of a new Thermofilum sp. strain was obtained from an enrichment culture metagenome. Like its relatives, Thermofilum sp. strain NZ13 is adapted to organic-rich thermal environments and has to depend on other organisms and the environment for some key amino acids, purines, and cofactors.
Cloning of polymorphisms (COP): enrichment of polymorphic sequences from complex genomes

PubMed Central

Li, Jingfeng; Wang, Fuli; Zabarovska, Veronika; Wahlestedt, Claes; Zabarovsky, Eugene R.

2000-01-01

Here we describe a new procedure (cloning of polymorphisms, COP) for enrichment of single nucleotide polymorphisms (SNPs) that represent restriction fragment length polymorphisms (RFLPs). COP would be applicable to the isolation of SNPs from particular regions of the genome, e.g. CpG islands, chromosomal bands, YACs or PAC contigs. A combination of digestion with restriction enzymes, treatment with uracil-DNA glycosylase and mung bean nuclease, PCR amplification and purification with streptavidin magnetic beads was used to isolate polymorphic sequences from the genomes of two human samples. After only two cycles of enrichment, 80% of the isolated clones were found to contain RFLPs. A simple method for the PCR detection of these polymorphisms was also developed. PMID:10606669
Democratizing Human Genome Project Information: A Model Program for Education, Information and Debate in Public Libraries.

ERIC Educational Resources Information Center

Pollack, Miriam

The "Mapping the Human Genome" project demonstrated that librarians can help whomever they serve in accessing information resources in the areas of biological and health information, whether it is the scientists who are developing the information or a member of the public who is using the information. Public libraries can guide library…
Gene recovery microdissection (GRM) a process for producing chromosome region-specific libraries of expressed genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Christian, A T; Coleman, M A; Tucker, J D

2001-02-08

Gene Recovery Microdissection (GRM) is a unique and cost-effective process for producing chromosome region-specific libraries of expressed genes. It accelerates the pace, reduces the cost, and extends the capabilities of functional genomic research, the means by which scientists will put to life-saving, life-enhancing use their knowledge of any plant or animal genome.
Next-generation sequencing library construction on a surface.

PubMed

Feng, Kuan; Costa, Justin; Edwards, Jeremy S

2018-05-30

Next-generation sequencing (NGS) has revolutionized almost all fields of biology, agriculture and medicine, and is widely utilized to analyse genetic variation. Over the past decade, the NGS pipeline has been steadily improved, and the entire process is currently relatively straightforward. However, NGS instrumentation still requires upfront library preparation, which can be a laborious process, requiring significant hands-on time. Herein, we present a simple but robust approach to streamline library preparation by utilizing surface bound transposases to construct DNA libraries directly on a flowcell surface. The surface bound transposases directly fragment genomic DNA while simultaneously attaching the library molecules to the flowcell. We sequenced and analysed a Drosophila genome library generated by this surface tagmentation approach, and we showed that our surface bound library quality was comparable to the quality of the library from a commercial kit. In addition to the time and cost savings, our approach does not require PCR amplification of the library, which eliminates potential problems associated with PCR duplicates. We described the first study to construct libraries directly on a flowcell. We believe our technique could be incorporated into the existing Illumina sequencing pipeline to simplify the workflow, reduce costs, and improve data quality.
zipHMMlib: a highly optimised HMM library exploiting repetitions in the input to speed up the forward algorithm.

PubMed

Sand, Andreas; Kristiansen, Martin; Pedersen, Christian N S; Mailund, Thomas

2013-11-22

Hidden Markov models are widely used for genome analysis as they combine ease of modelling with efficient analysis algorithms. Calculating the likelihood of a model using the forward algorithm has worst case time complexity linear in the length of the sequence and quadratic in the number of states in the model. For genome analysis, however, the length runs to millions or billions of observations, and when maximising the likelihood hundreds of evaluations are often needed. A time efficient forward algorithm is therefore a key ingredient in an efficient hidden Markov model library. We have built a software library for efficiently computing the likelihood of a hidden Markov model. The library exploits commonly occurring substrings in the input to reuse computations in the forward algorithm. In a pre-processing step our library identifies common substrings and builds a structure over the computations in the forward algorithm which can be reused. This analysis can be saved between uses of the library and is independent of concrete hidden Markov models so one preprocessing can be used to run a number of different models. Using this library, we achieve up to 78 times shorter wall-clock time for realistic whole-genome analyses with a real and reasonably complex hidden Markov model. In one particular case the analysis was performed in less than 8 minutes compared to 9.6 hours for the previously fastest library. We have implemented the preprocessing procedure and forward algorithm as a C++ library, zipHMM, with Python bindings for use in scripts. The library is available at http://birc.au.dk/software/ziphmm/.
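For orientation, the baseline computation that zipHMM accelerates can be sketched as a plain forward algorithm. The version below is a generic textbook implementation with per-step scaling, not zipHMM's code or API; the function name and the toy two-state model are illustrative assumptions. Its nested loops make the cost linear in the sequence length and quadratic in the number of states, as the abstract describes.

```python
import math


def forward_log_likelihood(init, trans, emit, obs):
    """Log-likelihood of an observation sequence under an HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    obs         : non-empty sequence of integer symbols
    """
    k = len(init)
    # Initialisation: joint probability of first symbol and each state.
    alpha = [init[s] * emit[s][obs[0]] for s in range(k)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: O(k^2) work per observation.
            alpha = [
                emit[j][obs[t]] * sum(alpha[i] * trans[i][j] for i in range(k))
                for j in range(k)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under the model
        # Rescale to avoid underflow on long sequences.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


if __name__ == "__main__":
    # Toy two-state model with two symbols (hypothetical values).
    init = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[0.9, 0.1], [0.2, 0.8]]
    print(math.exp(forward_log_likelihood(init, trans, emit, [0, 1])))
```

Summing the logarithms of the scaling factors gives the same likelihood as the unscaled recursion while staying within floating-point range for genome-length inputs.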
Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies

PubMed Central

2015-01-01

Background Most models of genome evolution concern either genetic sequences, gene content or gene order. They sometimes integrate two of the three levels, but rarely the three of them. Probabilistic models of gene order evolution usually have to assume constant gene content or adopt a presence/absence coding of gene neighborhoods which is blind to complex events modifying gene content. Results We propose a probabilistic evolutionary model for gene neighborhoods, allowing genes to be inserted, duplicated or lost. It uses reconciled phylogenies, which integrate sequence and gene content evolution. We are then able to optimize parameters such as phylogeny branch lengths, or probabilistic laws depicting the diversity of susceptibility of syntenic regions to rearrangements. We reconstruct a structure for ancestral genomes by optimizing a likelihood, keeping track of all evolutionary events at the level of gene content and gene synteny. Ancestral syntenies are associated with a probability of presence. We implemented the model with the restriction that at most one gene duplication separates two gene speciations in reconciled gene trees. We reconstruct ancestral syntenies on a set of 12 Drosophila genomes, and compare the evolutionary rates along the branches and along the sites. We compare with a parsimony method and find a significant number of results not supported by the posterior probability. The model is implemented in the Bio++ library. It thus benefits from and enriches the classical models and methods for molecular evolution. PMID:26452018
Evolution-guided optimization of biosynthetic pathways.

PubMed

Raman, Srivatsan; Rogers, Jameson K; Taylor, Noah D; Church, George M

2014-12-16

Engineering biosynthetic pathways for chemical production requires extensive optimization of the host cellular metabolic machinery. Because it is challenging to specify a priori an optimal design, metabolic engineers often need to construct and evaluate a large number of variants of the pathway. We report a general strategy that combines targeted genome-wide mutagenesis to generate pathway variants with evolution to enrich for rare high producers. We convert the intracellular presence of the target chemical into a fitness advantage for the cell by using a sensor domain responsive to the chemical to control a reporter gene necessary for survival under selective conditions. Because artificial selection tends to amplify unproductive cheaters, we devised a negative selection scheme to eliminate cheaters while preserving library diversity. This scheme allows us to perform multiple rounds of evolution (addressing ∼10⁹ cells per round) with minimal carryover of cheaters after each round. Based on candidate genes identified by flux balance analysis, we used targeted genome-wide mutagenesis to vary the expression of pathway genes involved in the production of naringenin and glucaric acid. Through up to four rounds of evolution, we increased production of naringenin and glucaric acid by 36- and 22-fold, respectively. Naringenin production (61 mg/L) from glucose was more than double the previous highest titer reported. Whole-genome sequencing of evolved strains revealed additional untargeted mutations that likely benefit production, suggesting new routes for optimization.
Development of Genomic Simple Sequence Repeats (SSR) by Enrichment Libraries in Date Palm.

PubMed

Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj

2017-01-01

Development of highly informative markers such as simple sequence repeats (SSR) for cultivar identification and germplasm characterization and management is essential for date palm genetic studies. The present study documents the development of SSR markers and assesses genetic relationships of commonly grown date palm (Phoenix dactylifera L.) cultivars in different geographical regions of Saudi Arabia. A total of 93 novel simple sequence repeat (SSR) markers were screened for their ability to detect polymorphism in date palm. Around 71% of genomic SSRs are dinucleotide, 25% trinucleotide, 3% tetranucleotide, and 1% pentanucleotide motifs and show 100% polymorphism. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis illustrates that cultivars tend to group according to their class of maturity, region of cultivation, and fruit color. Analysis of molecular variance (AMOVA) reveals genetic variation among and within cultivars of 27% and 73%, respectively, according to the geographical distribution of the cultivars. The developed microsatellite markers are of additional value to date palm characterization, and are tools that researchers can use in population genetics, cultivar identification, and genetic resource exploration and management. The cultivars tested exhibited a significant amount of genetic diversity and could be suitable for successful breeding programs. Genomic sequences generated from this study are available at the National Center for Biotechnology Information (NCBI), Sequence Read Archive (accession no. LIBGSS_039019).
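The motif classes reported above (di-, tri-, tetra- and pentanucleotide) are defined by the length of the repeated unit. As a rough illustration only, the following sketch scans a sequence for perfect tandem repeats and tallies them by class. The function name `find_ssrs`, the `min_repeats` threshold and the example sequence are assumptions made for this example; they are not taken from the study, whose marker-discovery pipeline is not described in the abstract.

```python
import re
from collections import Counter

# Repeat-unit length -> motif class, matching the classes named in the abstract.
MOTIF_CLASS = {2: "dinucleotide", 3: "trinucleotide",
               4: "tetranucleotide", 5: "pentanucleotide"}


def find_ssrs(seq, min_repeats=4):
    """Return (start, unit, copies) for perfect SSRs with a 2-5 nt repeat unit.

    min_repeats is a hypothetical threshold chosen for illustration.
    """
    seq = seq.upper()
    # The lazy group captures the shortest candidate unit; the backreference
    # then requires at least (min_repeats - 1) further tandem copies of it.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


def class_counts(hits):
    """Tally SSR hits by motif class."""
    return Counter(MOTIF_CLASS[len(unit)] for _, unit, _ in hits)


if __name__ == "__main__":
    example = "TT" + "GA" * 5 + "TTC" + "CAT" * 4 + "GG"  # made-up sequence
    found = find_ssrs(example)
    print(found)
    print(class_counts(found))
```

Dividing each class count by the total number of hits gives percentages comparable in kind to the 71% / 25% / 3% / 1% breakdown quoted in the abstract.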
Principles and application of antibody libraries for infectious diseases.

PubMed

Lim, Bee Nar; Tye, Gee Jun; Choong, Yee Siew; Ong, Eugene Boon Beng; Ismail, Asma; Lim, Theam Soon

2014-12-01

Antibodies have been used efficiently for the treatment and diagnosis of many diseases. Recombinant antibody technology allows the generation of fully human antibodies. Phage display is the gold standard for the production of human antibodies in vitro. To generate monoclonal antibodies by phage display, the generation of antibody libraries is crucial. Antibody libraries are classified according to the source where the antibody gene sequences were obtained. The most useful library for infectious diseases is the immunized library. Immunized libraries would allow better and selective enrichment of antibodies against disease antigens. The antibodies generated from these libraries can be translated for both diagnostic and therapeutic applications. This review focuses on the generation of immunized antibody libraries and the potential applications of the antibodies derived from these libraries.
Using Institute of Museum and Library Services Grants to Support Out-of-School Time Programs. Funding Note

ERIC Educational Resources Information Center

Griffin, Shawn Stelow

2010-01-01

Out-of-school time programs give many youth the chance to engage in interesting and enriching opportunities in the arts. One source of funding for art and cultural activities in out-of-school time programs is The Institute of Museum and Library Services (IMLS). This federal agency is charged with creating strong libraries and museums that connect…
Library preparation and data analysis packages for rapid genome sequencing.

PubMed

Pomraning, Kyle R; Smith, Kristina M; Bredeweg, Erin L; Connolly, Lanelle R; Phatale, Pallavi A; Freitag, Michael

2012-01-01

High-throughput sequencing (HTS) has quickly become a valuable tool for comparative genetics and genomics and is now regularly carried out in laboratories that are not connected to large sequencing centers. Here we describe an updated version of our protocol for constructing single- and paired-end Illumina sequencing libraries, beginning with purified genomic DNA. The present protocol can also be used for "multiplexing," i.e. the analysis of several samples in a single flowcell lane by generating "barcoded" or "indexed" Illumina sequencing libraries in a way that is independent from Illumina-supported methods. To analyze sequencing results, we suggest several independent approaches but end users should be aware that this is a quickly evolving field and that currently many alignment (or "mapping") and counting algorithms are being developed and tested.
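Multiplexing as described above means that reads from several samples share one lane and must be separated again by their barcode before analysis. The sketch below shows one simple way this could be done, assuming an in-line barcode at the start of each read and exact matching. The function name, the barcode table and the reads are hypothetical and are not drawn from the authors' protocol or software.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by sample using an exact-match barcode at the read start.

    reads    : iterable of read sequences (strings)
    barcodes : dict mapping barcode sequence -> sample name (hypothetical)
    Returns a dict mapping sample name -> list of reads with the barcode
    trimmed off; reads without a known barcode go under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    table = {"ACGT": "sample_1", "TGCA": "sample_2"}  # made-up barcodes
    lane = ["ACGTAAAA", "TGCACCCC", "GGGGTTTT", "ACGTGGGG"]
    for sample, seqs in demultiplex(lane, table).items():
        print(sample, seqs)
```

Real pipelines usually tolerate one or more mismatches in the barcode and work on FASTQ records rather than bare strings; exact matching is used here only to keep the idea visible.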
The Application of Next-Generation Sequencing for Mutation Detection in Autosomal-Dominant Hereditary Hearing Impairment.

PubMed

Gürtler, Nicolas; Röthlisberger, Benno; Ludin, Katja; Schlegel, Christoph; Lalwani, Anil K

2017-07-01

Identification of the causative mutation using next-generation sequencing in autosomal-dominant hereditary hearing impairment, as mutation analysis in hereditary hearing impairment by classic genetic methods, is hindered by the high heterogeneity of the disease. Two Swiss families with autosomal-dominant hereditary hearing impairment. Amplified DNA libraries for next-generation sequencing were constructed from extracted genomic DNA, derived from peripheral blood, and enriched by a custom-made sequence capture library. Validated, pooled libraries were sequenced on an Illumina MiSeq instrument, 300 cycles and paired-end sequencing. Technical data analysis was performed with SeqMonk, variant analysis with GeneTalk or VariantStudio. The detection of mutations in genes related to hearing loss by next-generation sequencing was subsequently confirmed using specific polymerase-chain-reaction and Sanger sequencing. Mutation detection in hearing-loss-related genes. The first family harbored the mutation c.5383+5delGTGA in the TECTA-gene. In the second family, a novel mutation c.2614-2625delCATGGCGCCGTG in the WFS1-gene and a second mutation TCOF1-c.1028G>A were identified. Next-generation sequencing successfully identified the causative mutation in families with autosomal-dominant hereditary hearing impairment. The results helped to clarify the pathogenic role of a known mutation and led to the detection of a novel one. NGS represents a feasible approach with great potential future in the diagnostics of hereditary hearing impairment, even in smaller labs.
Advances in genome-wide RNAi cellular screens: a case study using the Drosophila JAK/STAT pathway

PubMed Central

2012-01-01

Background Genome-scale RNA-interference (RNAi) screens are becoming ever more common gene discovery tools. However, whilst every screen identifies interacting genes, less attention has been given to how factors such as library design and post-screening bioinformatics may be affecting the data generated. Results Here we present a new genome-wide RNAi screen of the Drosophila JAK/STAT signalling pathway undertaken in the Sheffield RNAi Screening Facility (SRSF). This screen was carried out using a second-generation, computationally optimised dsRNA library and analysed using current methods and bioinformatic tools. To examine advances in RNAi screening technology, we compare this screen to a biologically very similar screen undertaken in 2005 with a first-generation library. Both screens used the same cell line, reporters and experimental design, with the SRSF screen identifying 42 putative regulators of JAK/STAT signalling, 22 of which verified in a secondary screen and 16 verified with an independent probe design. Following reanalysis of the original screen data, comparisons of the two gene lists allow us to make estimates of false discovery rates in the SRSF data and to conduct an assessment of off-target effects (OTEs) associated with both libraries. We discuss the differences and similarities between the resulting data sets and examine the relative improvements in gene discovery protocols. Conclusions Our work represents one of the first direct comparisons between first- and second-generation libraries and shows that modern library designs together with methodological advances have had a significant influence on genome-scale RNAi screens. PMID:23006893
YAC cloning Mus musculus telomeric DNA: physical, genetic, in situ and STS markers for the distal telomere of chromosome 10.

PubMed

Kipling, D; Wilson, H E; Thomson, E J; Cooke, H J

1995-06-01

Three Mus musculus DBA/2 YAC libraries were constructed using a half-YAC telomere cloning vector. This functional complementation approach yields libraries which include terminal restriction fragments of the mouse genome. Screening all three libraries led to the isolation of 32 independent clones which carry linear YACs containing the mouse terminal repeat sequence, (TTAGGG)n. These YACs provide a resource to isolate regions of the mouse genome close to chromosome termini and excluded from existing conventional YAC libraries. To demonstrate their utility, a hybridization probe was isolated from Mtel-1, the first (TTAGGG)n-containing YAC isolated. This probe detects an approximately 70 kb KpnI fragment in the mouse genome which is sensitive to pretreatment with BAL31 exonuclease. A PCR-based genetic marker generated from the sequence of this probe maps 4.4 cM from the most distal anchor locus on chromosome 10 in the EUCIB interspecific backcross. STS primers for this locus, D10Hgu1, were used to isolate YAC 110F4 from a commercially available mouse YAC library. Fluorescence in situ hybridization demonstrates that YAC 110F4 hybridizes to the distal telomere of chromosome 10. Clones in this collection of telomere YACs therefore partially overlap clones in conventional YAC libraries, and thus the previously unavailable terminal regions of the mouse genome can now be linked with the developing mouse STS YAC contig. Genetic markers such as D10Hgu1 allow the ends of the mouse genetic map to be defined, thus closing the map.

pileup.js: a JavaScript library for interactive and in-browser visualization of genomic data.

PubMed

Vanderkam, Dan; Aksoy, B Arman; Hodes, Isaac; Perrone, Jaclyn; Hammerbacher, Jeff

2016-08-01

pileup.js is a new browser-based genome viewer. It is designed to facilitate the investigation of evidence for genomic variants within larger web applications. It takes advantage of recent developments in the JavaScript ecosystem to provide a modular, reliable and easily embedded library. The code and documentation for pileup.js are publicly available at https://github.com/hammerlab/pileup.js under the Apache 2.0 license. Contact: correspondence@hammerlab.org. © The Author 2016. Published by Oxford University Press.
Cinnamides as selective small-molecule inhibitors of a cellular model of breast cancer stem cells.

PubMed

Germain, Andrew R; Carmody, Leigh C; Nag, Partha P; Morgan, Barbara; Verplank, Lynn; Fernandez, Cristina; Donckele, Etienne; Feng, Yuxiong; Perez, Jose R; Dandapani, Sivaraman; Palmer, Michelle; Lander, Eric S; Gupta, Piyush B; Schreiber, Stuart L; Munoz, Benito

2013-03-15

A high-throughput screen (HTS) was conducted against stably propagated cancer stem cell (CSC)-enriched populations using a library of 300,718 compounds from the National Institutes of Health (NIH) Molecular Libraries Small Molecule Repository (MLSMR). A cinnamide analog displayed greater than 20-fold selective inhibition of the breast CSC-like cell line (HMLE_sh_Ecad) over the isogenic control cell line (HMLE_sh_eGFP). Herein, we report structure-activity relationships of this class of cinnamides for selective lethality towards CSC-enriched populations. Copyright © 2013. Published by Elsevier Ltd.
Genomic profiling of plasma cell disorders in a clinical setting: integration of microarray and FISH, after CD138 selection of bone marrow

PubMed Central

Berry, Nadine Kaye; Bain, Nicole L; Enjeti, Anoop K; Rowlings, Philip

2014-01-01

Aim To evaluate the role of whole genome comparative genomic hybridisation microarray (array-CGH) in detecting genomic imbalances as compared to conventional karyotype (GTG-analysis) or myeloma specific fluorescence in situ hybridisation (FISH) panel in a diagnostic setting for plasma cell dyscrasia (PCD). Methods A myeloma-specific interphase FISH (i-FISH) panel was carried out on CD138 PC-enriched bone marrow (BM) from 20 patients having BM biopsies for evaluation of PCD. Whole genome array-CGH was performed on reference (control) and neoplastic (test patient) genomic DNA extracted from CD138 PC-enriched BM and analysed. Results Comparison of techniques demonstrated a much higher detection rate of genomic imbalances using array-CGH. Genomic imbalances were detected in 1, 19 and 20 patients using GTG-analysis, i-FISH and array-CGH, respectively. Genomic rearrangements were detected in one patient using GTG-analysis and seven patients using i-FISH, while none were detected using array-CGH. I-FISH was the most sensitive method for detecting gene rearrangements and GTG-analysis was the least sensitive method overall. All copy number aberrations observed in GTG-analysis were detected using array-CGH and i-FISH. Conclusions We show that array-CGH performed on CD138-enriched PCs significantly improves the detection of clinically relevant and possibly novel genomic abnormalities in PCD, and thus could be considered as a standard diagnostic technique in combination with IGH rearrangement i-FISH. PMID:23969274
Genomic profiling of plasma cell disorders in a clinical setting: integration of microarray and FISH, after CD138 selection of bone marrow.

PubMed

Berry, Nadine Kaye; Bain, Nicole L; Enjeti, Anoop K; Rowlings, Philip

2014-01-01

To evaluate the role of whole genome comparative genomic hybridisation microarray (array-CGH) in detecting genomic imbalances as compared to conventional karyotype (GTG-analysis) or myeloma specific fluorescence in situ hybridisation (FISH) panel in a diagnostic setting for plasma cell dyscrasia (PCD). A myeloma-specific interphase FISH (i-FISH) panel was carried out on CD138 PC-enriched bone marrow (BM) from 20 patients having BM biopsies for evaluation of PCD. Whole genome array-CGH was performed on reference (control) and neoplastic (test patient) genomic DNA extracted from CD138 PC-enriched BM and analysed. Comparison of techniques demonstrated a much higher detection rate of genomic imbalances using array-CGH. Genomic imbalances were detected in 1, 19 and 20 patients using GTG-analysis, i-FISH and array-CGH, respectively. Genomic rearrangements were detected in one patient using GTG-analysis and seven patients using i-FISH, while none were detected using array-CGH. I-FISH was the most sensitive method for detecting gene rearrangements and GTG-analysis was the least sensitive method overall. All copy number aberrations observed in GTG-analysis were detected using array-CGH and i-FISH. We show that array-CGH performed on CD138-enriched PCs significantly improves the detection of clinically relevant and possibly novel genomic abnormalities in PCD, and thus could be considered as a standard diagnostic technique in combination with IGH rearrangement i-FISH.
Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

USDA-ARS's Scientific Manuscript database

A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...
Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS

PubMed Central

Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.

2016-01-01

Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic (“z-score”) of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information and derive a “relative enrichment score” for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores even when both are genome-wide significant (p < 5 × 10^-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
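To make the idea of a two-Gaussian scale mixture on z-scores more concrete, here is a minimal sketch under a deliberately simplified model that is an assumption of this example, not the CM3 parametrization (which the abstract does not give). It assumes z = delta + noise with unit-variance noise, where delta is exactly zero with probability `pi0` and otherwise drawn from a zero-mean Gaussian with variance `sigma2`. Under those assumptions the marginal density of z is a mixture of N(0, 1) and N(0, 1 + sigma2), and the posterior expected effect is a shrunken version of z. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean Gaussian with variance `var` at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_density(z, pi0, sigma2):
    """Marginal density of z under the assumed null/non-null mixture."""
    null_part = pi0 * normal_pdf(z, 1.0)
    nonnull_part = (1.0 - pi0) * normal_pdf(z, 1.0 + sigma2)
    return null_part + nonnull_part


def posterior_nonnull(z, pi0, sigma2):
    """Posterior probability that the SNP is non-null given its z-score."""
    nonnull_part = (1.0 - pi0) * normal_pdf(z, 1.0 + sigma2)
    return nonnull_part / mixture_density(z, pi0, sigma2)


def posterior_expected_effect(z, pi0, sigma2):
    """E[delta | z]: the non-null posterior weight times the Gaussian shrinkage of z."""
    shrinkage = sigma2 / (1.0 + sigma2)
    return posterior_nonnull(z, pi0, sigma2) * shrinkage * z


if __name__ == "__main__":
    # Hypothetical strata: a "high enrichment" stratum has a smaller null
    # proportion, so the same z-score is shrunk less.
    for label, pi0 in (("low enrichment", 0.99), ("high enrichment", 0.80)):
        print(label, posterior_expected_effect(5.0, pi0, sigma2=4.0))
```

In this toy setting, fitting `pi0` and `sigma2` separately per enrichment stratum is what lets two SNPs with the same discovery z-score receive different expected replication effect sizes.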
Extending Immunological Profiling in the Gilthead Sea Bream, Sparus aurata, by Enriched cDNA Library Analysis, Microarray Design and Initial Studies upon the Inflammatory Response to PAMPs.

PubMed

Boltaña, Sebastian; Castellana, Barbara; Goetz, Giles; Tort, Lluis; Teles, Mariana; Mulero, Victor; Novoa, Beatriz; Figueras, Antonio; Goetz, Frederick W; Gallardo-Escarate, Cristian; Planas, Josep V; Mackenzie, Simon

2017-02-03

This study describes the development and validation of an enriched oligonucleotide-microarray platform for Sparus aurata (SAQ) to provide a platform for transcriptomic studies in this species. A transcriptome database was constructed by assembly of gilthead sea bream sequences derived from public repositories of mRNA together with reads from a large collection of expressed sequence tags (EST) from two extensive targeted cDNA libraries characterizing mRNA transcripts regulated by both bacterial and viral challenge. The developed microarray was further validated by analysing monocyte/macrophage activation profiles after challenge with two Gram-negative bacterial pathogen-associated molecular patterns (PAMPs; lipopolysaccharide (LPS) and peptidoglycan (PGN)). Of the approximately 10,000 EST sequenced, we obtained a total of 6837 EST longer than 100 nt, with 3778 and 3059 EST obtained from the bacterial-primed and from the viral-primed cDNA libraries, respectively. Functional classification of contigs from the bacterial- and viral-primed cDNA libraries by Gene Ontology (GO) showed that the top five represented categories were equally represented in the two libraries: metabolism (approximately 24% of the total number of contigs), carrier proteins/membrane transport (approximately 15%), effectors/modulators and cell communication (approximately 11%), nucleoside, nucleotide and nucleic acid metabolism (approximately 7.5%) and intracellular transducers/signal transduction (approximately 5%). Transcriptome analyses using this enriched oligonucleotide platform identified differential shifts in the response to PGN and LPS in macrophage-like cells, highlighting responsive gene-cassettes tightly related to PAMP host recognition. As observed in other fish species, PGN is a powerful activator of the inflammatory response in S. aurata macrophage-like cells. 
We have developed and validated an oligonucleotide microarray (SAQ) that provides a platform enriched for the study of gene expression in S. aurata with an emphasis upon immunity and the immune response.
Validation and application of quantitative PCR assays using host-specific Bacteroidales genetic markers for swine fecal pollution tracking.

PubMed

Fan, Lihua; Shuai, Jiangbing; Zeng, Ruoxue; Mo, Hongfei; Wang, Suhua; Zhang, Xiaofeng; He, Yongqiang

2017-12-01

The genome fragment enrichment (GFE) method was applied to identify host-specific bacterial genetic markers that differ among fecal metagenomes. To enrich for swine-specific DNA fragments, a swine fecal DNA composite (n = 34) was challenged against a DNA composite consisting of cow, human, goat, sheep, chicken, duck and goose fecal DNA extracts (n = 83). Bioinformatic analyses of 384 non-redundant swine-enriched metagenomic sequences indicated a preponderance of Bacteroidales-like regions predicted to encode metabolism-associated, cellular processes and information storage and processing. After being challenged against fecal DNA extracted from different animal sources, four sequences from the clone libraries, comprising two Bacteroidales-like sequences (genes 1-38 and 3-53), a Clostridia-like sequence (gene 2-109) and a Bacilli-like sequence (gene 2-95), showed high specificity to swine feces based on PCR analysis. Host-specificity and host-sensitivity analysis confirmed that oligonucleotide primers and probes capable of annealing to select Bacteroidales-like sequences (1-38 and 3-53) exhibited high specificity (>90%) in quantitative PCR assays with 71 fecal DNAs from non-target animal sources. The two assays also demonstrated broad distributions of corresponding genetic markers (>94% positive) among 72 swine feces. After evaluation with environmental water samples from different areas, swine-targeted assays based on two Bacteroidales-like GFE sequences appear to be suitable quantitative tracing tools for swine fecal pollution. Copyright © 2017 Elsevier Ltd. All rights reserved.
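The specificity and sensitivity figures quoted above are proportions over non-target and target samples, respectively. A small sketch of the arithmetic follows. The sample totals (71 non-target, 72 swine) come from the abstract, but the split into positives and negatives used in the example is hypothetical, since the abstract reports only the thresholds (>90% and >94%).

```python
def specificity(true_negatives, false_positives):
    """Fraction of non-target (non-swine) samples that test negative."""
    return true_negatives / (true_negatives + false_positives)


def sensitivity(true_positives, false_negatives):
    """Fraction of target (swine) samples that test positive."""
    return true_positives / (true_positives + false_negatives)


if __name__ == "__main__":
    # Hypothetical counts chosen only to be consistent with the reported totals.
    spec = specificity(true_negatives=66, false_positives=5)   # 71 non-target samples
    sens = sensitivity(true_positives=68, false_negatives=4)   # 72 swine samples
    print(f"specificity = {spec:.1%}, sensitivity = {sens:.1%}")
```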
Phylogenetic analysis of anaerobic psychrophilic enrichment cultures obtained from a greenland glacier ice core

NASA Technical Reports Server (NTRS)

Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

2003-01-01

The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at -9 degrees C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 x 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at -2 degrees C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years.
Phylogenetic Analysis of Anaerobic Psychrophilic Enrichment Cultures Obtained from a Greenland Glacier Ice Core

PubMed Central

Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

2003-01-01

The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at −9°C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 × 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at −2°C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years. PMID:12676695
Whole-Genome Sequencing of the World’s Oldest People

PubMed Central

Gierman, Hinco J.; Fortney, Kristen; Roach, Jared C.; Coles, Natalie S.; Li, Hong; Glusman, Gustavo; Markov, Glenn J.; Smith, Justin D.; Hood, Leroy; Coles, L. Stephen; Kim, Stuart K.

2014-01-01

Supercentenarians (110 years or older) are the world’s oldest people. Seventy four are alive worldwide, with twenty two in the United States. We performed whole-genome sequencing on 17 supercentenarians to explore the genetic basis underlying extreme human longevity. We found no significant evidence of enrichment for a single rare protein-altering variant or for a gene harboring different rare protein altering variants in supercentenarian compared to control genomes. We followed up on the gene most enriched for rare protein-altering variants in our cohort of supercentenarians, TSHZ3, by sequencing it in a second cohort of 99 long-lived individuals but did not find a significant enrichment. The genome of one supercentenarian had a pathogenic mutation in DSC2, known to predispose to arrhythmogenic right ventricular cardiomyopathy, which is recommended to be reported to this individual as an incidental finding according to a recent position statement by the American College of Medical Genetics and Genomics. Even with this pathogenic mutation, the proband lived to over 110 years. The entire list of rare protein-altering variants and DNA sequence of all 17 supercentenarian genomes is available as a resource to assist the discovery of the genetic basis of extreme longevity in future studies. PMID:25390934
Whole-genome sequencing of the world's oldest people.

PubMed

Gierman, Hinco J; Fortney, Kristen; Roach, Jared C; Coles, Natalie S; Li, Hong; Glusman, Gustavo; Markov, Glenn J; Smith, Justin D; Hood, Leroy; Coles, L Stephen; Kim, Stuart K

2014-01-01

Supercentenarians (110 years or older) are the world's oldest people. Seventy four are alive worldwide, with twenty two in the United States. We performed whole-genome sequencing on 17 supercentenarians to explore the genetic basis underlying extreme human longevity. We found no significant evidence of enrichment for a single rare protein-altering variant or for a gene harboring different rare protein altering variants in supercentenarian compared to control genomes. We followed up on the gene most enriched for rare protein-altering variants in our cohort of supercentenarians, TSHZ3, by sequencing it in a second cohort of 99 long-lived individuals but did not find a significant enrichment. The genome of one supercentenarian had a pathogenic mutation in DSC2, known to predispose to arrhythmogenic right ventricular cardiomyopathy, which is recommended to be reported to this individual as an incidental finding according to a recent position statement by the American College of Medical Genetics and Genomics. Even with this pathogenic mutation, the proband lived to over 110 years. The entire list of rare protein-altering variants and DNA sequence of all 17 supercentenarian genomes is available as a resource to assist the discovery of the genetic basis of extreme longevity in future studies.
SVGenes: a library for rendering genomic features in scalable vector graphic format.

PubMed

Etherington, Graham J; MacLean, Daniel

2013-08-01

Drawing genomic features in attractive and informative ways is a key task in visualization of genomics data. Scalable Vector Graphics (SVG) format is a modern and flexible open standard that provides advanced features including modular graphic design, advanced web interactivity and animation within a suitable client. SVGs do not suffer from loss of image quality on re-scaling and provide the ability to edit individual elements of a graphic on the whole object level independent of the whole image. These features make SVG a potentially useful format for the preparation of publication quality figures including genomic objects such as genes or sequencing coverage and for web applications that require rich user-interaction with the graphical elements. SVGenes is a Ruby-language library that uses SVG primitives to render typical genomic glyphs through a simple and flexible Ruby interface. The library implements a simple Page object that spaces and contains horizontal Track objects that in turn style, colour and positions features within them. Tracks are the level at which visual information is supplied providing the full styling capability of the SVG standard. Genomic entities like genes, transcripts and histograms are modelled in Glyph objects that are attached to a track and take advantage of SVG primitives to render the genomic features in a track as any of a selection of defined glyphs. The feature model within SVGenes is simple but flexible and not dependent on particular existing gene feature formats meaning graphics for any existing datasets can easily be created without need for conversion. The library is provided as a Ruby Gem from https://rubygems.org/gems/bio-svgenes under the MIT license, and open source code is available at https://github.com/danmaclean/bioruby-svgenes also under the MIT License. dan.maclean@tsl.ac.uk.
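The Page, Track and Glyph layering described above can be pictured with a short sketch. The code below is written in Python rather than Ruby and does not use or reproduce the SVGenes API. The class names, fields and layout constants are all hypothetical, chosen only to show how a page can space tracks vertically while each track styles and positions its own features as SVG rectangles.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class Feature:
    """A genomic interval to draw (hypothetical minimal feature model)."""
    start: int
    end: int
    label: str = ""


@dataclass
class Track:
    """A horizontal lane that styles and positions its own features."""
    name: str
    fill: str = "steelblue"
    features: list = field(default_factory=list)

    def render(self, y, scale, height=12):
        rects = []
        for feat in self.features:
            x = feat.start * scale
            width = max((feat.end - feat.start) * scale, 1.0)
            rects.append(
                f'<rect x="{x:.1f}" y="{y}" width="{width:.1f}" '
                f'height="{height}" fill="{self.fill}">'
                f'<title>{escape(feat.label)}</title></rect>'
            )
        return rects


@dataclass
class Page:
    """Container that spaces tracks vertically and emits one SVG document."""
    width: int
    genome_length: int
    track_gap: int = 20
    tracks: list = field(default_factory=list)

    def to_svg(self):
        scale = self.width / self.genome_length  # pixels per base
        body = []
        for index, track in enumerate(self.tracks):
            body.extend(track.render(y=10 + index * self.track_gap, scale=scale))
        height = 10 + len(self.tracks) * self.track_gap
        return (
            f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{self.width}" height="{height}">'
            + "".join(body) + "</svg>"
        )


if __name__ == "__main__":
    genes = Track("genes", features=[Feature(100, 300, "geneA")])
    print(Page(width=500, genome_length=1000, tracks=[genes]).to_svg())
```

Because each feature is only a start, an end and a label, data from any annotation format can be mapped into it, which mirrors the format-independence the abstract claims for the real library.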
Near-Complete Genome Sequence of Thalassospira sp. Strain KO164 Isolated from a Lignin-Enriched Marine Sediment Microcosm.

PubMed

Woo, Hannah L; O'Dell, Kaela B; Utturkar, Sagar; McBride, Kathryn R; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T B K; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Woyke, Tanja; Brown, Steven D; Hazen, Terry C

2016-11-23

Thalassospira sp. strain KO164 was isolated from eastern Mediterranean seawater and sediment laboratory microcosms enriched on insoluble organosolv lignin under oxic conditions. The near-complete genome sequence presented here will facilitate analyses into this deep-ocean bacterium's ability to degrade recalcitrant organics such as lignin. Copyright © 2016 Woo et al.
Microsatellite markers for the yam bean Pachyrhizus (Fabaceae)

PubMed Central

Delêtre, Marc; Soengas, Beatriz; Utge, José; Lambourdière, Josie; Sørensen, Marten

2013-01-01

• Premise of the study: Microsatellite loci were developed for the understudied root crop yam bean (Pachyrhizus spp.) to investigate intraspecific diversity and interspecific relationships within the genus Pachyrhizus. • Methods and Results: Seventeen nuclear simple sequence repeat (SSR) markers with perfect di- and trinucleotide repeats were developed from 454 pyrosequencing of SSR-enriched genomic libraries. Loci were characterized in P. ahipa and wild and cultivated populations of four closely related species. All loci successfully cross-amplified and showed high levels of polymorphism, with number of alleles ranging from three to 12 and expected heterozygosity ranging from 0.095 to 0.831 across the genus. • Conclusions: By enabling rapid assessment of genetic diversity in three native neotropical crops, P. ahipa, P. erosus, and P. tuberosus, and two wild relatives, P. ferrugineus and P. panamensis, these markers will allow exploration of the genetic diversity and evolutionary history of the genus Pachyrhizus. PMID:25202568
Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues

PubMed Central

Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.

2014-01-01

RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209
Polymorphic microsatellite markers for the endangered fish, the slender shiner Pseudopungtungia tenuicorpa and cross-species amplification across five related species.

PubMed

Kim, K S; Moon, S J; Han, S H; Kim, K Y; Bang, I C

2016-09-02

The slender shiner Pseudopungtungia tenuicorpa (Cypriniformes; Cyprinidae; Gobioninae) is an endangered freshwater fish species endemic to Korea. The current strategies for its conservation involve the study of population genetic characters and identification of management units. These strategies require suitable molecular markers to study genetic diversity and genetic structure. Here, we developed nine polymorphic microsatellite markers for P. tenuicorpa for the first time by applying an enrichment method from a size-selected genomic library. The developed microsatellite markers produced a total of 101 alleles (average 11.2). The observed and expected heterozygosities averaged 0.805 and 0.835, respectively. Among the nine identified markers, five markers showed successful amplification across five related Korean Gobioninae species. Thus, the microsatellite markers developed in this study will be useful to establish conservation strategies for both P. tenuicorpa and other related species.
Determination of buoyant density and sensitivity to chloroform and freon for the etiological agent of infectious salmonid anaemia

USGS Publications Warehouse

Christie, K.E.; Hjeltnes, B.; Uglenes, I.; Winton, J.R.

1993-01-01

Plasma was collected from Atlantic salmon Salmo salar with acute infectious salmon anaemia (ISA) and used to challenge Atlantic salmon parr by intraperitoneal injection. Treatment of plasma with the lipid solvent, chloroform, showed that the etiological agent of ISA contained essential lipids, probably as a viral envelope. Some infectivity remained following treatment with freon. Injection challenges using fractions from equilibrium density gradient centrifugation of plasma from fish with acute ISA revealed a band of infectivity in the range 1.184 to 1.262 g cm⁻³. The band was believed to contain both complete ISA-virus particles and infectious particles lacking a complete envelope, nucleocapsid or genome. Density gradient centrifugation of infectious plasma for enrichment of the putative ISA virus appeared to offer a suitable method for obtaining virus-specific nucleic acid for use in the construction of cDNA libraries.
Development and characterization of microsatellite markers for the medicinal plant Smilax brasiliensis (Smilacaceae) and related species

PubMed Central

Martins, Aline R.; Abreu, Aluana G.; Bajay, Miklos M.; Villela, Priscilla M. S.; Batista, Carlos E. A.; Monteiro, Mariza; Alves-Pereira, Alessandro; Figueira, Glyn M.; Pinheiro, José B.; Appezzato-da-Glória, Beatriz; Zucchi, Maria I.

2013-01-01

• Premise of the study: A new set of microsatellite or simple sequence repeat (SSR) markers were developed for Smilax brasiliensis, which is popularly known as sarsaparilla and used in folk medicine as a tonic, antirheumatic, and antisyphilitic. Smilax brasiliensis is sold in Brazilian pharmacies, and its origin and effectiveness are not subject to quality control. • Methods and Results: Using a protocol for genomic library enrichment, primer pairs were developed for 26 microsatellite loci and validated in 17 accessions of S. brasiliensis. Thirteen loci were polymorphic and four were monomorphic. The primers successfully amplified alleles in the congeners S. campestris, S. cissoides, S. fluminensis, S. goyazana, S. polyantha, S. quinquenervia, S. rufescens, S. subsessiliflora, and S. syphilitica. • Conclusions: The new SSR markers described herein are informative tools for genetic diversity and gene flow studies in S. brasiliensis and several congeners. PMID:25202555
Genetic variation at microsatellite loci in the tropical herb Aphelandra aurantiaca (Acanthaceae).

PubMed

Suárez-Montes, Pilar; Tapia-López, Rosalinda; Núñez-Farfán, Juan

2015-11-01

To assess the effect of forest fragmentation on genetic variation and population structure of Aphelandra aurantiaca (Acanthaceae), a tropical and ornamental herbaceous perennial plant, we developed the first microsatellite primers for the species. Fourteen microsatellite markers were isolated and characterized from A. aurantiaca genomic libraries enriched for di-, tri-, and tetranucleotide repeat motifs. Polymorphism was evaluated in 107 individuals from four natural populations. Twelve out of 14 genetic markers were polymorphic. The number of alleles per locus ranged from two to 12, and the observed and expected heterozygosities ranged from 0.22 to 0.96 and from 0.20 to 0.87, respectively. Fixation indices ranged from -0.41 to 0.44. These newly developed microsatellite markers for A. aurantiaca will be useful for future population genetic studies, specifically to detect the possible loss of genetic diversity due to habitat fragmentation.
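The per-locus statistics reported in the microsatellite records above (observed and expected heterozygosity, fixation index) can be illustrated with a minimal Python sketch. It assumes the textbook definitions: expected heterozygosity is one minus the sum of squared allele frequencies, and the fixation index is F = 1 - Ho/He. The abstracts do not say which estimator or software was used, and the genotypes below are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He, F) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies pooled over both gene copies of every individual.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg proportions: 1 - sum(p_i^2).
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    # Fixation index; undefined (NaN) for a monomorphic locus.
    f = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, f


# Hypothetical genotypes (allele sizes in bp) for four individuals at one locus.
example = [(150, 152), (150, 150), (152, 154), (150, 152)]
ho, he, f = heterozygosity(example)
print(f"Ho={ho:.3f} He={he:.3f} F={f:.3f}")
```

A negative F, as in this toy data set, indicates an excess of heterozygotes relative to Hardy-Weinberg expectation; a positive F indicates a deficit.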

Development of microsatellite loci in Artocarpus altilis (Moraceae) and cross-amplification in congeneric species.

PubMed

Witherup, Colby; Ragone, Diane; Wiesner-Hanks, Tyr; Irish, Brian; Scheffler, Brian; Simpson, Sheron; Zee, Francis; Zuberi, M Iqbal; Zerega, Nyree J C

2013-07-01

Microsatellite loci were isolated and characterized from enriched genomic libraries of Artocarpus altilis (breadfruit) and tested in four Artocarpus species and one hybrid. The microsatellite markers provide new tools for further studies in Artocarpus. • A total of 25 microsatellite loci were evaluated across four Artocarpus species and one hybrid. Twenty-one microsatellite loci were evaluated on A. altilis (241), A. camansi (34), A. mariannensis (15), and A. altilis × mariannensis (64) samples. Nine of those loci plus four additional loci were evaluated on A. heterophyllus (jackfruit, 426) samples. All loci are polymorphic for at least one species. The average number of alleles ranges from two to nine within taxa. • These microsatellite primers will facilitate further studies on the genetic structure and evolutionary and domestication history of Artocarpus species. They will aid in cultivar identification and establishing germplasm conservation strategies for breadfruit and jackfruit.
Genome-scale CRISPR-Cas9 Knockout and Transcriptional Activation Screening

PubMed Central

Joung, Julia; Konermann, Silvana; Gootenberg, Jonathan S.; Abudayyeh, Omar O.; Platt, Randall J.; Brigham, Mark D.; Sanjana, Neville E.; Zhang, Feng

2017-01-01

Forward genetic screens are powerful tools for the unbiased discovery and functional characterization of specific genetic elements associated with a phenotype of interest. Recently, the RNA-guided endonuclease Cas9 from the microbial CRISPR (clustered regularly interspaced short palindromic repeats) immune system has been adapted for genome-scale screening by combining Cas9 with pooled guide RNA libraries. Here we describe a protocol for genome-scale knockout and transcriptional activation screening using the CRISPR-Cas9 system. Custom- or ready-made guide RNA libraries are constructed and packaged into lentiviral vectors for delivery into cells for screening. As each screen is unique, we provide guidelines for determining screening parameters and maintaining sufficient coverage. To validate candidate genes identified from the screen, we further describe strategies for confirming the screening phenotype as well as genetic perturbation through analysis of indel rate and transcriptional activation. Beginning with library design, a genome-scale screen can be completed in 9–15 weeks followed by 4–5 weeks of validation. PMID:28333914
Genome-scale CRISPR-Cas9 knockout and transcriptional activation screening.

PubMed

Joung, Julia; Konermann, Silvana; Gootenberg, Jonathan S; Abudayyeh, Omar O; Platt, Randall J; Brigham, Mark D; Sanjana, Neville E; Zhang, Feng

2017-04-01

Forward genetic screens are powerful tools for the unbiased discovery and functional characterization of specific genetic elements associated with a phenotype of interest. Recently, the RNA-guided endonuclease Cas9 from the microbial CRISPR (clustered regularly interspaced short palindromic repeats) immune system has been adapted for genome-scale screening by combining Cas9 with pooled guide RNA libraries. Here we describe a protocol for genome-scale knockout and transcriptional activation screening using the CRISPR-Cas9 system. Custom- or ready-made guide RNA libraries are constructed and packaged into lentiviral vectors for delivery into cells for screening. As each screen is unique, we provide guidelines for determining screening parameters and maintaining sufficient coverage. To validate candidate genes identified by the screen, we further describe strategies for confirming the screening phenotype, as well as genetic perturbation, through analysis of indel rate and transcriptional activation. Beginning with library design, a genome-scale screen can be completed in 9-15 weeks, followed by 4-5 weeks of validation.
OCLC in Asia Pacific.

ERIC Educational Resources Information Center

Chang, Min-min

1998-01-01

Discusses the Online Computer Library Center (OCLC) and the changing Asia Pacific library scene under the broad headings of the three phases of technology innovation. Highlights include WorldCat and the OCLC shared cataloging system; resource sharing and interlibrary loan; enriching OCLC online catalog with Asian collections; and future outlooks.…
A pair of new BAC and BIBAC vectors that facilitate BAC/BIBAC library construction and intact large genomic DNA insert exchange.

PubMed

Shi, Xue; Zeng, Haiyang; Xue, Yadong; Luo, Meizhong

2011-10-11

Large-insert BAC and BIBAC libraries are important tools for structural and functional genomics studies of eukaryotic genomes. To facilitate the construction of BAC and BIBAC libraries and the transfer of complete large BAC inserts into BIBAC vectors, which is desired in positional cloning, we developed a pair of new BAC and BIBAC vectors. The new BAC vector pIndigoBAC536-S and the new BIBAC vector BIBAC-S have the following features: 1) both contain two 18-bp non-palindromic I-SceI sites in an inverted orientation at positions that flank an identical DNA fragment containing the lacZ selection marker and the cloning site. Large DNA inserts can be excised from the vectors as single fragments by cutting with I-SceI, allowing the inserts to be easily sized. More importantly, because the two vectors contain different antibiotic resistance genes for transformant selection and produce the same non-complementary 3' protruding ATAA ends by I-SceI that suppress self- and inter-ligations, the exchange of intact large genomic DNA inserts between the BAC and BIBAC vectors is straightforward; 2) both were constructed as high-copy composite vectors. Reliable linearized and dephosphorylated original low-copy pIndigoBAC536-S and BIBAC-S vectors that are ready for library construction can be prepared from the high-copy composite vectors pHZAUBAC1 and pHZAUBIBAC1, respectively, without the need for additional preparation steps or special reagents, thus simplifying the construction of BAC and BIBAC libraries. BIBAC clones constructed with the new BIBAC-S vector are stable in both E. coli and Agrobacterium. The vectors can be accessed through our website http://GResource.hzau.edu.cn. The two new vectors and their respective high-copy composite vectors can largely facilitate the construction and characterization of BAC and BIBAC libraries. The transfer of complete large genomic DNA inserts from one vector to the other is made straightforward.
Enriching the Catalog

ERIC Educational Resources Information Center

Tennant, Roy

2004-01-01

After decades of costly and time-consuming effort, nearly all libraries have completed the retrospective conversion of their card catalogs to electronic form. However, bibliographic systems still are really not much more than card catalogs on wheels. Enriched content that Amazon.com takes for granted--such as digitized tables of contents, cover…
Draft Genome Sequence of a "Candidatus Brocadia" Bacterium Enriched from Activated Sludge Collected in a Tropical Climate.

PubMed

Liu, Xianghui; Arumugam, Krithika; Natarajan, Gayathri; Seviour, Thomas W; Drautz-Moses, Daniela I; Wuertz, Stefan; Law, Yingyu; Williams, Rohan B H

2018-05-10

Here, we present the draft genome sequence of an anaerobic ammonium-oxidizing bacterium (AnAOB), "Candidatus Brocadia," which was enriched in an anammox reactor. A 3.2-Mb genome sequence comprising 168 contigs was assembled, in which 2,765 protein-coding genes, 47 tRNAs, and one each of 5S, 16S, and 23S rRNAs were annotated. No evidence for the presence of a nitric oxide-forming nitrite reductase was found. Copyright © 2018 Liu et al.
Genome-wide Target Enrichment-aided Chip Design: a 66 K SNP Chip for Cashmere Goat.

PubMed

Qiao, Xian; Su, Rui; Wang, Yang; Wang, Ruijun; Yang, Ting; Li, Xiaokai; Chen, Wei; He, Shiyang; Jiang, Yu; Xu, Qiwu; Wan, Wenting; Zhang, Yaolei; Zhang, Wenguang; Chen, Jiang; Liu, Bin; Liu, Xin; Fan, Yixing; Chen, Duoyuan; Jiang, Huaizhi; Fang, Dongming; Liu, Zhihong; Wang, Xiaowen; Zhang, Yanjun; Mao, Danqing; Wang, Zhiying; Di, Ran; Zhao, Qianjun; Zhong, Tao; Yang, Huanming; Wang, Jian; Wang, Wen; Dong, Yang; Chen, Xiaoli; Xu, Xun; Li, Jinquan

2017-08-17

Compared with the commercially available single nucleotide polymorphism (SNP) chip based on the Bead Chip technology, the solution hybrid selection (SHS)-based target enrichment SNP chip is not only design-flexible, but also cost-effective for genotype sequencing. In this study, we propose to design an animal SNP chip using the SHS-based target enrichment strategy for the first time. As an update to the international collaboration on goat research, a 66 K SNP chip for cashmere goat was created from the whole-genome sequencing data of 73 individuals. Verification of this 66 K SNP chip with the whole-genome sequencing data of 436 cashmere goats showed that the SNP call rate was between 95.3% and 99.8%. The average sequencing depth for target SNPs was 40X. The capture regions were shown to be the 200 bp that flank target SNPs. This chip was further tested in a genome-wide association analysis of cashmere fineness (fiber diameter). Several top hit loci were found to be marginally associated with signaling pathways involved in hair growth. These results demonstrate that the 66 K SNP chip is a useful tool in the genomic analyses of cashmere goats. The successful chip design shows that the SHS-based target enrichment strategy could be applied to SNP chip design in other species.
Prospective identification of parasitic sequences in phage display screens

PubMed Central

Matochko, Wadim L.; Cory Li, S.; Tang, Sindy K.Y.; Derda, Ratmir

2014-01-01

Phage display empowered the development of proteins with new function and ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10⁹ random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the ‘parasites’), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of Ph.D.-7 library, we identified a focused population of 770 ‘parasites’. In all, 197 sequences from this population have been identified in literature reports that used Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ∼10⁶ diverse clones prevents the biased selection of parasitic clones. PMID:24217917
Construction and characterization of a bacterial artificial chromosome library for hexaploid wheat line 92R137

USDA-ARS's Scientific Manuscript database

For map-based cloning of genes conferring important traits in the hexaploid wheat line 92R137, a bacterial artificial chromosome (BAC) library, including two sub libraries, was constructed using the genomic DNA of 92R137 digested with restriction enzymes HindIII and BamHI. The BAC library was compos...
Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

PubMed Central

Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

2007-01-01

Background: Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results: A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion: The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
Preparation and screening of an arrayed human genomic library generated with the P1 cloning system.

PubMed Central

Shepherd, N S; Pfrogner, B D; Coulby, J N; Ackerman, S L; Vaidyanathan, G; Sauer, R H; Balkenhol, T C; Sternberg, N

1994-01-01

We describe here the construction and initial characterization of a 3-fold coverage genomic library of the human haploid genome that was prepared using the bacteriophage P1 cloning system. The cloned DNA inserts were produced by size fractionation of a Sau3AI partial digest of high molecular weight genomic DNA isolated from primary cells of human foreskin fibroblasts. The inserts were cloned into the pAd10sacBII vector and packaged in vitro into P1 phage. These were used to generate recombinant bacterial clones, each of which was picked robotically from an agar plate into a well of a 96-well microtiter dish, grown overnight, and stored at -70 degrees C. The resulting library, designated DMPC-HFF#1 series A, consists of approximately 130,000-140,000 recombinant clones that were stored in 1500 microtiter dishes. To screen the library, clones were combined in a pooling strategy and specific loci were identified by PCR analysis. On average, the library contains two or three different clones for each locus screened. To date we have identified a total of 17 clones containing the hypoxanthine-guanine phosphoribosyltransferase, human serum albumin-human alpha-fetoprotein, p53, cyclooxygenase I, human apurinic endonuclease, beta-polymerase, and DNA ligase I genes. The cloned inserts average 80 kb in size and range from 70 to 95 kb, with one 49-kb insert and one 62-kb insert. PMID:8146166
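The relationship between clone number, insert size and genome coverage in the record above can be sketched with the standard Clarke-Carbon approximation, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size. This is an illustrative Python calculation, not the authors' own: the clone count of 135,000 is the midpoint of the reported range, and the haploid genome size of 3 × 10⁹ bp is an assumed round figure.

```python
import math


def library_coverage(n_clones, insert_bp, genome_bp):
    """Fold coverage: total cloned DNA divided by genome size."""
    return n_clones * insert_bp / genome_bp


def prob_locus_present(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon probability that a given locus is in at least one clone."""
    f = insert_bp / genome_bp  # fraction of the genome carried by one clone
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p, insert_bp, genome_bp):
    """Number of clones required to contain a given locus with probability p."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


N_CLONES = 135_000            # midpoint of the reported 130,000-140,000 clones
INSERT_BP = 80_000            # reported average insert size (80 kb)
GENOME_BP = 3_000_000_000     # assumed haploid human genome size

print(library_coverage(N_CLONES, INSERT_BP, GENOME_BP))
print(prob_locus_present(N_CLONES, INSERT_BP, GENOME_BP))
print(clones_needed(0.99, INSERT_BP, GENOME_BP))
```

Under these assumed values the fold coverage comes out at 3.6, roughly in line with the stated 3-fold coverage and with the observation of two or three clones per screened locus; the exact figure depends on the genome size assumed and on how many clones carry usable inserts.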
Deep Learning through Concept-Based Inquiry

ERIC Educational Resources Information Center

Donham, Jean

2010-01-01

Learning in the library should present opportunities to enrich student learning activities to address concerns of interest and cognitive complexity, but these must be tasks that call for in-depth analysis--not merely gathering facts. Library learning experiences need to demand enough of students to keep them interested and also need to be…
Examining the Association between the "Imagination Library" Early Childhood Literacy Program and Kindergarten Readiness

ERIC Educational Resources Information Center

Samiei, Shahin; Bush, Andrew J.; Sell, Marie; Imig, Doug

2016-01-01

This study evaluated participation in the "Imagination Library" early childhood literacy enrichment program and children's pre-literacy and pre-numeracy skills at kindergarten entry in an urban school district. Previous studies have demonstrated that program participation is associated with greater early childhood reading practices.…
Teaching Analytics: A Clustering and Triangulation Study of Digital Library User Data

ERIC Educational Resources Information Center

Xu, Beijie; Recker, Mimi

2012-01-01

Teachers and students increasingly enjoy unprecedented access to abundant web resources and digital libraries to enhance and enrich their classroom experiences. However, due to the distributed nature of such systems, conventional educational research methods, such as surveys and observations, provide only limited snapshots. In addition,…
Now's the Time: Online Library Orientations

ERIC Educational Resources Information Center

Farrell, Sandy L.; Driver, Carol; Weathers, Anita

2011-01-01

Increasingly at West Kentucky Community and Technical College (WKCTC), English 101, English 102, and Enrichment 091 are taught online, allowing students more work availability and time with family, while decreasing commute time and expense. In order to meet the library orientation needs of these online students and provide other options for…
No Longer the "Poor Man's University"

ERIC Educational Resources Information Center

Hackett, Abi; Novitzky, Jan

2008-01-01

Museums, libraries and archives can offer fantastic opportunities for adult learning in their own right. Many know that what is on offer can complement and enrich the opportunities offered through the more formal adult learning sector. The schools sector has long-standing links with museums and libraries. In adult education there are less…
K-bZIP Mediated SUMO-2/3 Specific Modification on the KSHV Genome Negatively Regulates Lytic Gene Expression and Viral Reactivation

PubMed Central

Yang, Wan-Shan; Hsu, Hung-Wei; Campbell, Mel; Cheng, Chia-Yang; Chang, Pei-Ching

2015-01-01

SUMOylation is associated with epigenetic regulation of chromatin structure and transcription. Epigenetic modifications of herpesviral genomes accompany the transcriptional switch of latent and lytic genes during the virus life cycle. Here, we report a genome-wide comparison of SUMO paralog modification on the KSHV genome. Using chromatin immunoprecipitation in conjunction with high-throughput sequencing, our study revealed highly distinct landscape changes of SUMO paralog genomic modifications associated with KSHV reactivation. A rapid and widespread deposition of SUMO-2/3, compared with SUMO-1, modification across the KSHV genome upon reactivation was observed. Interestingly, SUMO-2/3 enrichment was inversely correlated with H3K9me3 mark after reactivation, indicating that SUMO-2/3 may be responsible for regulating the expression of viral genes located in low heterochromatin regions during viral reactivation. RNA-sequencing analysis showed that the SUMO-2/3 enrichment pattern positively correlated with KSHV gene expression profiles. Activation of KSHV lytic genes located in regions with high SUMO-2/3 enrichment was enhanced by SUMO-2/3 knockdown. These findings suggest that SUMO-2/3 viral chromatin modification contributes to the diminution of viral gene expression during reactivation. Our previous study identified a SUMO-2/3-specific viral E3 ligase, K-bZIP, suggesting a potential role of this enzyme in regulating SUMO-2/3 enrichment and viral gene repression. Consistent with this prediction, higher K-bZIP binding on SUMO-2/3 enrichment region during reactivation was observed. Moreover, a K-bZIP SUMO E3 ligase dead mutant, K-bZIP-L75A, in the viral context, showed no SUMO-2/3 enrichment on viral chromatin and higher expression of viral genes located in SUMO-2/3 enriched regions during reactivation. Importantly, virus production significantly increased in both SUMO-2/3 knockdown and KSHV K-bZIP-L75A mutant cells. 
These results indicate that SUMO-2/3 modification of viral chromatin may function to counteract KSHV reactivation. As induction of herpesvirus reactivation may activate cellular antiviral regimes, our results suggest that development of viral SUMO E3 ligase specific inhibitors may be an avenue for anti-virus therapy. PMID:26197391
De novo transcriptome sequencing and discovery of genes related to copper tolerance in Paeonia ostii.

PubMed

Wang, Yanjie; Dong, Chunlan; Xue, Zeyun; Jin, Qijiang; Xu, Yingchun

2016-01-15

Paeonia ostii, an important ornamental and medicinal plant, grows normally on copper (Cu) mines with widespread Cu contamination of soils, and it has the ability to lower Cu contents in the Cu-contaminated soils. However, very little molecular information concerned with Cu resistance of P. ostii is available. In this study, high-throughput de novo transcriptome sequencing was carried out for P. ostii with and without Cu treatment using Illumina HiSeq 2000 platform. A total of 77,704 All-unigenes were obtained with a mean length of 710 bp. Of these unigenes, 47,461 were annotated with public databases based on sequence similarities. Comparative transcript profiling allowed the discovery of 4324 differentially expressed genes (DEGs), with 2207 up-regulated and 2117 down-regulated unigenes in Cu-treated library as compared to the control counterpart. Based on these DEGs, Gene Ontology (GO) enrichment analysis indicated Cu stress-relevant terms, such as 'membrane' and 'antioxidant activity'. Meanwhile, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis uncovered some important pathways, including 'biosynthesis of secondary metabolites' and 'metabolic pathways'. In addition, expression patterns of 12 selected DEGs derived from quantitative real-time polymerase chain reaction (qRT-PCR) were consistent with their transcript abundance changes obtained by transcriptomic analyses, suggesting that all the 12 genes were authentically involved in Cu tolerance in P. ostii. This is the first report to identify genes related to Cu stress responses in P. ostii, which could offer valuable information on the molecular mechanisms of Cu resistance, and provide a basis for further genomics research on this and related ornamental species for phytoremediation. Copyright © 2015 Elsevier B.V. All rights reserved.
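GO and KEGG enrichment analyses of the kind mentioned in the record above are often based on a one-sided hypergeometric test, although the abstract does not state which test was used. The Python sketch below shows that test under stated assumptions: the 47,461 annotated unigenes are taken as the background and the 4324 DEGs as the sample, while the term size and the number of DEGs carrying the term are hypothetical values chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, sample, hits):
    """One-sided p-value P(X >= hits) for over-representation of a term.

    background: number of genes in the background set
    annotated:  number of background genes annotated with the term
    sample:     number of genes in the tested list (e.g. DEGs)
    hits:       number of tested genes annotated with the term
    """
    upper = min(annotated, sample)
    denom = comb(background, sample)
    # Sum the hypergeometric probabilities of observing `hits` or more annotated genes.
    numer = sum(
        comb(annotated, i) * comb(background - annotated, sample - i)
        for i in range(max(hits, 0), upper + 1)
    )
    return numer / denom


# Background and DEG counts come from the abstract; the term size (300) and
# the number of DEGs with the term (50) are hypothetical.
p_value = hypergeom_enrichment_p(47_461, 300, 4_324, 50)
print(p_value)
```

In practice the p-values for all tested terms would also be corrected for multiple testing, which this sketch omits.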
Massive migration from the steppe was a source for Indo-European languages in Europe.

PubMed

Haak, Wolfgang; Lazaridis, Iosif; Patterson, Nick; Rohland, Nadin; Mallick, Swapan; Llamas, Bastien; Brandt, Guido; Nordenfelt, Susanne; Harney, Eadaoin; Stewardson, Kristin; Fu, Qiaomei; Mittnik, Alissa; Bánffy, Eszter; Economou, Christos; Francken, Michael; Friederich, Susanne; Pena, Rafael Garrido; Hallgren, Fredrik; Khartanovich, Valery; Khokhlov, Aleksandr; Kunst, Michael; Kuznetsov, Pavel; Meller, Harald; Mochalov, Oleg; Moiseyev, Vayacheslav; Nicklisch, Nicole; Pichler, Sandra L; Risch, Roberto; Rojo Guerra, Manuel A; Roth, Christina; Szécsényi-Nagy, Anna; Wahl, Joachim; Meyer, Matthias; Krause, Johannes; Brown, Dorcas; Anthony, David; Cooper, Alan; Alt, Kurt Werner; Reich, David

2015-06-11

We generated genome-wide data from 69 Europeans who lived between 8,000-3,000 years ago by enriching ancient DNA libraries for a target set of almost 400,000 polymorphisms. Enrichment of these positions decreases the sequencing required for genome-wide ancient DNA analysis by a median of around 250-fold, allowing us to study an order of magnitude more individuals than previous studies and to obtain new insights about the past. We show that the populations of Western and Far Eastern Europe followed opposite trajectories between 8,000-5,000 years ago. At the beginning of the Neolithic period in Europe, ∼8,000-7,000 years ago, closely related groups of early farmers appeared in Germany, Hungary and Spain, different from indigenous hunter-gatherers, whereas Russia was inhabited by a distinctive population of hunter-gatherers with high affinity to a ∼24,000-year-old Siberian. By ∼6,000-5,000 years ago, farmers throughout much of Europe had more hunter-gatherer ancestry than their predecessors, but in Russia, the Yamnaya steppe herders of this time were descended not only from the preceding eastern European hunter-gatherers, but also from a population of Near Eastern ancestry. Western and Eastern Europe came into contact ∼4,500 years ago, as the Late Neolithic Corded Ware people from Germany traced ∼75% of their ancestry to the Yamnaya, documenting a massive migration into the heartland of Europe from its eastern periphery. This steppe ancestry persisted in all sampled central Europeans until at least ∼3,000 years ago, and is ubiquitous in present-day Europeans. These results provide support for a steppe origin of at least some of the Indo-European languages of Europe.

Massive migration from the steppe was a source for Indo-European languages in Europe

PubMed Central

Haak, Wolfgang; Lazaridis, Iosif; Patterson, Nick; Rohland, Nadin; Mallick, Swapan; Llamas, Bastien; Brandt, Guido; Nordenfelt, Susanne; Harney, Eadaoin; Stewardson, Kristin; Fu, Qiaomei; Mittnik, Alissa; Bánffy, Eszter; Economou, Christos; Francken, Michael; Friederich, Susanne; Pena, Rafael Garrido; Hallgren, Fredrik; Khartanovich, Valery; Khokhlov, Aleksandr; Kunst, Michael; Kuznetsov, Pavel; Meller, Harald; Mochalov, Oleg; Moiseyev, Vayacheslav; Nicklisch, Nicole; Pichler, Sandra L.; Risch, Roberto; Rojo Guerra, Manuel A.; Roth, Christina; Szécsényi-Nagy, Anna; Wahl, Joachim; Meyer, Matthias; Krause, Johannes; Brown, Dorcas; Anthony, David; Cooper, Alan; Alt, Kurt Werner; Reich, David

2016-01-01

We generated genome-wide data from 69 Europeans who lived between 8,000–3,000 years ago by enriching ancient DNA libraries for a target set of almost 400,000 polymorphisms. Enrichment of these positions decreases the sequencing required for genome-wide ancient DNA analysis by a median of around 250-fold, allowing us to study an order of magnitude more individuals than previous studies [1–8] and to obtain new insights about the past. We show that the populations of Western and Far Eastern Europe followed opposite trajectories between 8,000–5,000 years ago. At the beginning of the Neolithic period in Europe, 8,000–7,000 years ago, closely related groups of early farmers appeared in Germany, Hungary and Spain, different from indigenous hunter-gatherers, whereas Russia was inhabited by a distinctive population of hunter-gatherers with high affinity to a 24,000-year-old Siberian [6]. By 6,000–5,000 years ago, farmers throughout much of Europe had more hunter-gatherer ancestry than their predecessors, but in Russia, the Yamnaya steppe herders of this time were descended not only from the preceding eastern European hunter-gatherers, but also from a population of Near Eastern ancestry. Western and Eastern Europe came into contact 4,500 years ago, as the Late Neolithic Corded Ware people from Germany traced 75% of their ancestry to the Yamnaya, documenting a massive migration into the heartland of Europe from its eastern periphery. This steppe ancestry persisted in all sampled central Europeans until at least 3,000 years ago, and is ubiquitous in present-day Europeans. These results provide support for a steppe origin [9] of at least some of the Indo-European languages of Europe. PMID:25731166
Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nierman, William C.

At TIGR, human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
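The coverage figures quoted in the BAC end sequencing report above can be sanity-checked with simple arithmetic. The sketch below is illustrative only and is not part of the report. It assumes a haploid human genome size of roughly 3.0 Gb, which the abstract does not state, and all variable names are hypothetical.

```python
# Rough consistency check of the BAC end sequencing figures quoted above.
# ASSUMPTION (not from the report): haploid human genome size of ~3.0e9 bp.
GENOME_BP = 3.0e9

total_bp = 141e6        # total end sequence reported (141 Mb)
mean_read_len_bp = 460  # reported average read length (~460 bp)

# Fraction of the genome represented by the end sequences.
genome_fraction = total_bp / GENOME_BP

# Number of reads implied by total bases and mean read length.
implied_reads = total_bp / mean_read_len_bp

print(f"genome fraction covered: {genome_fraction:.1%}")   # 4.7%
print(f"implied number of end reads: {implied_reads:,.0f}")  # 306,522
```

Under that genome-size assumption, both values agree with the reported "~4.7% of the genome" and ">300,000 end sequences".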
The SUPERFAMILY database in 2004: additions and improvements.

PubMed

Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K; Chothia, Cyrus; Gough, Julian

2004-01-01

The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.
Construction of Signature-tagged Mutant Library in Mesorhizobium loti as a Powerful Tool for Functional Genomics

PubMed Central

Shimoda, Yoshikazu; Mitsui, Hisayuki; Kamimatsuse, Hiroko; Minamisawa, Kiwamu; Nishiyama, Eri; Ohtsubo, Yoshiyuki; Nagata, Yuji; Tsuda, Masataka; Shinpo, Sayaka; Watanabe, Akiko; Kohara, Mitsuyo; Yamada, Manabu; Nakamura, Yasukazu; Tabata, Satoshi; Sato, Shusei

2008-01-01

Rhizobia are nitrogen-fixing soil bacteria that establish endosymbiosis with some leguminous plants. The completion of several rhizobial genome sequences provides opportunities for genome-wide functional studies of the physiological roles of many rhizobial genes. In order to carry out genome-wide phenotypic screenings, we have constructed a large mutant library of the nitrogen-fixing symbiotic bacterium, Mesorhizobium loti, by transposon mutagenesis. Transposon insertion mutants were generated using the signature-tagged mutagenesis (STM) technique and a total of 29 330 independent mutants were obtained. Along with the collection of transposon mutants, we have determined the transposon insertion sites for 7892 clones, and confirmed insertions in 3680 non-redundant M. loti genes (50.5% of the total number of M. loti genes). Transposon insertions were randomly distributed throughout the M. loti genome without any bias toward G+C contents of insertion target sites and transposon plasmids used for the mutagenesis. We also show the utility of STM mutants by examining the specificity of signature tags and test screenings for growth- and nodulation-deficient mutants. This defined mutant library allows for genome-wide forward- and reverse-genetic functional studies of M. loti and will serve as an invaluable resource for researchers to further our understanding of rhizobial biology. PMID:18658183
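The library statistics in the M. loti abstract above imply a few derived quantities that are easy to compute. The following is a minimal sketch, not taken from the paper; the derived values are approximate because the reported 50.5% is itself rounded, and the variable names are hypothetical.

```python
# Derived quantities from the counts reported in the abstract above.
total_mutants = 29_330          # independent transposon mutants obtained
mapped_clones = 7_892           # clones with determined insertion sites
genes_with_insertions = 3_680   # non-redundant genes confirmed as hit
fraction_of_genes_hit = 0.505   # reported as 50.5% of all M. loti genes

# Total gene count implied by "3680 genes = 50.5% of the total".
implied_total_genes = genes_with_insertions / fraction_of_genes_hit

# Share of the mutant library whose insertion site has been determined.
mapped_fraction = mapped_clones / total_mutants

# Average number of mapped clones per disrupted gene.
clones_per_gene = mapped_clones / genes_with_insertions

print(round(implied_total_genes))   # 7287
print(f"{mapped_fraction:.1%}")     # 26.9%
print(f"{clones_per_gene:.2f}")     # 2.14
```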
Coral life history and symbiosis: Functional genomic resources for two reef building Caribbean corals, Acropora palmata and Montastraea faveolata

PubMed Central

Schwarz, Jodi A; Brokstein, Peter B; Voolstra, Christian; Terry, Astrid Y; Miller, David J; Szmant, Alina M; Coffroth, Mary Alice; Medina, Mónica

2008-01-01

Background: Scleractinian corals are the foundation of reef ecosystems in tropical marine environments. Their great success is due to interactions with endosymbiotic dinoflagellates (Symbiodinium spp.), with which they are obligately symbiotic. To develop a foundation for studying coral biology and coral symbiosis, we have constructed a set of cDNA libraries and generated and annotated ESTs from two species of corals, Acropora palmata and Montastraea faveolata. Results: We generated 14,588 (Ap) and 3,854 (Mf) high quality ESTs from five life history/symbiosis stages (spawned eggs, early-stage planula larvae, late-stage planula larvae either infected with symbionts or uninfected, and adult coral). The ESTs assembled into a set of primarily stage-specific clusters, producing 4,980 (Ap), and 1,732 (Mf) unigenes. The egg stage library, relative to the other developmental stages, was enriched in genes functioning in cell division and proliferation, transcription, signal transduction, and regulation of protein function. Fifteen unigenes were identified as candidate symbiosis-related genes as they were expressed in all libraries constructed from the symbiotic stages and were absent from all of the non symbiotic stages. These include several DNA interacting proteins, and one highly expressed unigene (containing 17 cDNAs) with no significant protein-coding region. A significant number of unigenes (25) encode potential pattern recognition receptors (lectins, scavenger receptors, and others), as well as genes that may function in signaling pathways involved in innate immune responses (toll-like signaling, NFkB p105, and MAP kinases). Comparison between the A. palmata and an A. millepora EST dataset identified ferritin as a highly expressed gene in both datasets that appears to be undergoing adaptive evolution.
Five unigenes appear to be restricted to the Scleractinia, as they had no homology to any sequences in the nr databases nor to the non-scleractinian cnidarians Nematostella vectensis and Hydra magnipapillata. Conclusion: Partial sequencing of 5 cDNA libraries each for A. palmata and M. faveolata has produced a rich set of candidate genes (4,980 genes from A. palmata, and 1,732 genes from M. faveolata) that we can use as a starting point for examining the life history and symbiosis of these two species, as well as to further expand the dataset of cnidarian genes for comparative genomics and evolutionary studies. PMID:18298846
Coral Life History and Symbiosis: functional genomic resources for two reef building Caribbean corals, Acropora palmata and Montastraea faveolata

DOE PAGES

Schwarz, Jodi A.; Brokstein, Peter B.; Voolstra, Christian R.; ...

2008-02-25

Scleractinian corals are the foundation of reef ecosystems in tropical marine environments. Their great success is due to interactions with endosymbiotic dinoflagellates (Symbiodinium spp.), with which they are obligately symbiotic. To develop a foundation for studying coral biology and coral symbiosis, we have constructed a set of cDNA libraries and generated and annotated ESTs from two species of corals, Acropora palmata and Montastraea faveolata. Here we generated 14,588 (Ap) and 3,854 (Mf) high quality ESTs from five life history/symbiosis stages (spawned eggs, early-stage planula larvae, late-stage planula larvae either infected with symbionts or uninfected, and adult coral). The ESTs assembled into a set of primarily stage-specific clusters, producing 4,980 (Ap), and 1,732 (Mf) unigenes. The egg stage library, relative to the other developmental stages, was enriched in genes functioning in cell division and proliferation, transcription, signal transduction, and regulation of protein function. Fifteen unigenes were identified as candidate symbiosis-related genes as they were expressed in all libraries constructed from the symbiotic stages and were absent from all of the non symbiotic stages. These include several DNA interacting proteins, and one highly expressed unigene (containing 17 cDNAs) with no significant protein-coding region. A significant number of unigenes (25) encode potential pattern recognition receptors (lectins, scavenger receptors, and others), as well as genes that may function in signaling pathways involved in innate immune responses (toll-like signaling, NFkB p105, and MAP kinases). Comparison between the A. palmata and an A. millepora EST dataset identified ferritin as a highly expressed gene in both datasets that appears to be undergoing adaptive evolution.
Five unigenes appear to be restricted to the Scleractinia, as they had no homology to any sequences in the nr databases nor to the non-scleractinian cnidarians Nematostella vectensis and Hydra magnipapillata. In conclusion, partial sequencing of 5 cDNA libraries each for A. palmata and M. faveolata has produced a rich set of candidate genes (4,980 genes from A. palmata, and 1,732 genes from M. faveolata) that we can use as a starting point for examining the life history and symbiosis of these two species, as well as to further expand the dataset of cnidarian genes for comparative genomics and evolutionary studies.
Coral Life History and Symbiosis: functional genomic resources for two reef building Caribbean corals, Acropora palmata and Montastraea faveolata

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schwarz, Jodi A.; Brokstein, Peter B.; Voolstra, Christian R.

Scleractinian corals are the foundation of reef ecosystems in tropical marine environments. Their great success is due to interactions with endosymbiotic dinoflagellates (Symbiodinium spp.), with which they are obligately symbiotic. To develop a foundation for studying coral biology and coral symbiosis, we have constructed a set of cDNA libraries and generated and annotated ESTs from two species of corals, Acropora palmata and Montastraea faveolata. Here we generated 14,588 (Ap) and 3,854 (Mf) high quality ESTs from five life history/symbiosis stages (spawned eggs, early-stage planula larvae, late-stage planula larvae either infected with symbionts or uninfected, and adult coral). The ESTs assembled into a set of primarily stage-specific clusters, producing 4,980 (Ap), and 1,732 (Mf) unigenes. The egg stage library, relative to the other developmental stages, was enriched in genes functioning in cell division and proliferation, transcription, signal transduction, and regulation of protein function. Fifteen unigenes were identified as candidate symbiosis-related genes as they were expressed in all libraries constructed from the symbiotic stages and were absent from all of the non symbiotic stages. These include several DNA interacting proteins, and one highly expressed unigene (containing 17 cDNAs) with no significant protein-coding region. A significant number of unigenes (25) encode potential pattern recognition receptors (lectins, scavenger receptors, and others), as well as genes that may function in signaling pathways involved in innate immune responses (toll-like signaling, NFkB p105, and MAP kinases). Comparison between the A. palmata and an A. millepora EST dataset identified ferritin as a highly expressed gene in both datasets that appears to be undergoing adaptive evolution.
Five unigenes appear to be restricted to the Scleractinia, as they had no homology to any sequences in the nr databases nor to the non-scleractinian cnidarians Nematostella vectensis and Hydra magnipapillata. In conclusion, partial sequencing of 5 cDNA libraries each for A. palmata and M. faveolata has produced a rich set of candidate genes (4,980 genes from A. palmata, and 1,732 genes from M. faveolata) that we can use as a starting point for examining the life history and symbiosis of these two species, as well as to further expand the dataset of cnidarian genes for comparative genomics and evolutionary studies.
BACCardI--a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison.

PubMed

Bartels, Daniela; Kespohl, Sebastian; Albaum, Stefan; Drüke, Tanja; Goesmann, Alexander; Herold, Julia; Kaiser, Olaf; Pühler, Alfred; Pfeiffer, Friedhelm; Raddatz, Günter; Stoye, Jens; Meyer, Folker; Schuster, Stephan C

2005-04-01

We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.
Genome Sequence of Halomonas sp. Strain KO116, an Ionic Liquid- Tolerant Marine Bacterium Isolated from a Lignin-Enriched Seawater Microcosm

DOE PAGES

O'Dell, Kaela; Woo, Hannah L.; Utturkar, Sagar M.; ...

2015-05-07

Halomonas sp. strain KO116 was isolated from Nile Delta Mediterranean Sea surface water enriched with insoluble organosolv lignin. It was further screened for growth on alkali lignin minimal salts medium agar. The strain tolerates the ionic liquid 1-ethyl-3-methylimidazolium acetate. Its complete genome sequence is presented in this report.
Human retinoblastoma susceptibility gene: genomic organization and analysis of heterozygous intragenic deletion mutants.

PubMed Central

Bookstein, R; Lee, E Y; To, H; Young, L J; Sery, T W; Hayes, R C; Friedmann, T; Lee, W H

1988-01-01

A gene in chromosome region 13q14 has been identified as the human retinoblastoma susceptibility (RB) gene on the basis of altered gene expression found in virtually all retinoblastomas. In order to further characterize the RB gene and its structural alterations, we examined genomic clones of the RB gene isolated from both a normal human genomic library and a library made from DNA of the retinoblastoma cell line Y79. First, a restriction and exon map of the RB gene was constructed by aligning overlapping genomic clones, yielding three contiguous regions ("contigs") of 150 kilobases total length separated by two gaps. At least 20 exons were identified in genomic clones, and these were provisionally numbered. Second, two overlapping genomic clones that demonstrated a DNA deletion of exons 2 through 6 from one RB allele were isolated from the Y79 library. To confirm and extend this result, a unique sequence probe from intron 1 was used to detect similar and possibly identical heterozygous deletions in genomic DNA from three retinoblastoma cell lines, thereby explaining the origins of their shortened RB mRNA transcripts. The same probe detected genomic rearrangements in fibroblasts from two hereditary retinoblastoma patients, indicating that intron 1 includes a frequent site for mutations conferring predisposition to retinoblastoma. Third, this probe also detected a polymorphic site for BamHI with allele frequencies near 0.5/0.5. Identification of commonly mutated regions will contribute significantly to genetic diagnosis in retinoblastoma patients and families. PMID:2895471
Functional genomics to discover antibiotic resistance genes: The paradigm of resistance to colistin mediated by ethanolamine phosphotransferase in Shewanella algae MARS 14.

PubMed

Telke, Amar A; Rolain, Jean-Marc

2015-12-01

Shewanella algae MARS 14 is a colistin-resistant clinical isolate retrieved from bronchoalveolar lavage of a hospitalised patient. A functional genomics strategy was employed to discover the molecular support for colistin resistance in S. algae MARS 14. A pZE21 MCS-1 plasmid-based genomic expression library was constructed in Escherichia coli TOP10. The estimated library size was 1.30×10^8 bp. Functional screening of colistin-resistant clones was carried out on Luria-Bertani agar containing 8 mg/L colistin. Five colistin-resistant clones were obtained after complete screening of the genomic expression library. Analysis of DNA sequencing results found a unique gene in all selected clones. Amino acid sequence analysis of this unique gene using the Integrated Microbial Genomes (IMG) and KEGG databases revealed that this gene encodes ethanolamine phosphotransferase (EptA, or so-called PmrC). Reverse transcription PCR analysis indicated that resistance to colistin in S. algae MARS 14 was associated with overexpression of EptA (27-fold increase), which plays a crucial role in the arrangement of outer membrane lipopolysaccharide. Copyright © 2015 Elsevier B.V. and the International Society of Chemotherapy. All rights reserved.
Genomic resources for songbird research and their use in characterizing gene expression during brain development

PubMed Central

Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry

2007-01-01

Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146
Comparison Of A Neutron Kinetics Parameter For A Polyethylene Moderated Highly Enriched Uranium System

DOE Office of Scientific and Technical Information (OSTI.GOV)

McKenzie, IV, George Espy; Goda, Joetta Marie; Grove, Travis Justin

This paper examines the capability of the MCNP® code to calculate kinetics parameters effectively for a thermal system containing highly enriched uranium (HEU). The Rossi-α parameter was chosen for this examination because it is relatively easy to measure as well as easy to calculate using MCNP®’s kopts card. The Rossi-α also incorporates many other parameters of interest in nuclear kinetics, most of which are more difficult to measure precisely. The comparison evaluates two different nuclear data libraries against the experimental data: ENDF/B-VI (.66c) and ENDF/B-VII (.80c).
Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

USDA-ARS's Scientific Manuscript database

The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...
Comparative Genomics as a Foundation for Evo-Devo Studies in Birds.

PubMed

Grayson, Phil; Sin, Simon Y W; Sackton, Timothy B; Edwards, Scott V

2017-01-01

Developmental genomics is a rapidly growing field, and high-quality genomes are a useful foundation for comparative developmental studies. A high-quality genome forms an essential reference onto which the data from numerous assays and experiments, including ChIP-seq, ATAC-seq, and RNA-seq, can be mapped. A genome also streamlines and simplifies the development of primers used to amplify putative regulatory regions for enhancer screens, cDNA probes for in situ hybridization, microRNAs (miRNAs) or short hairpin RNAs (shRNA) for RNA interference (RNAi) knockdowns, mRNAs for misexpression studies, and even guide RNAs (gRNAs) for CRISPR knockouts. Finally, much can be gleaned from comparative genomics alone, including the identification of highly conserved putative regulatory regions. This chapter provides an overview of laboratory and bioinformatics protocols for DNA extraction, library preparation, library quantification, and genome assembly, from fresh or frozen tissue to a draft avian genome. Generating a high-quality draft genome can provide a developmental research group with excellent resources for their study organism, opening the doors to many additional assays and experiments.
De novo transcriptome assembly and analysis of differential gene expression following peptidoglycan (PGN) challenge in Antheraea pernyi.

PubMed

Liu, Yu; Xin, Zhao-Zhe; Zhang, Dai-Zhen; Zhu, Xiao-Yu; Wang, Ying; Chen, Li; Tang, Bo-Ping; Zhou, Chun-Lin; Chai, Xin-Yue; Tian, Ji-Wu; Liu, Qiu-Ning

2018-06-01

Antheraea pernyi is not only an important economic insect, it is increasingly employed as a model organism due to a variety of advantages, including ease of rearing and experimental manipulation compared with other Lepidoptera. Peptidoglycan (PGN) is a major component of the bacterial cell wall, and interactions between PGN and A. pernyi cause a series of physiological changes in the insect. In the present study, we constructed cDNA libraries from an A. pernyi PGN-infected group and a control group stimulated with phosphate-buffered saline (PBS). The transcriptome was de novo assembled using the Trinity platform, and 1698 differentially expressed genes (DEGs) were identified, comprising 894 up-regulated and 804 down-regulated genes. To further investigate immune-related DEGs, gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment were performed. GO analysis identified major immune-related GO terms and KEGG enrichment indicated gene responses to three pathways related to the insect immune system. Several homologous genes related to the immune response of the A. pernyi fat body post-PGN infection were identified and categorised. Taken together, the results provide insight into the complex molecular mechanisms of the responses to bacterial infection at the transcriptional level. Copyright © 2018 Elsevier B.V. All rights reserved.
Construction and Analysis of Two Genome-Scale Deletion Libraries for Bacillus subtilis.

PubMed

Koo, Byoung-Mo; Kritikos, George; Farelli, Jeremiah D; Todor, Horia; Tong, Kenneth; Kimsey, Harvey; Wapinski, Ilan; Galardini, Marco; Cabal, Angelo; Peters, Jason M; Hachmann, Anna-Barbara; Rudner, David Z; Allen, Karen N; Typas, Athanasios; Gross, Carol A

2017-03-22

A systems-level understanding of Gram-positive bacteria is important from both an environmental and health perspective and is most easily obtained when high-quality, validated genomic resources are available. To this end, we constructed two ordered, barcoded, erythromycin-resistance- and kanamycin-resistance-marked single-gene deletion libraries of the Gram-positive model organism, Bacillus subtilis. The libraries comprise 3,968 and 3,970 genes, respectively, and overlap in all but four genes. Using these libraries, we update the set of essential genes known for this organism, provide a comprehensive compendium of B. subtilis auxotrophic genes, and identify genes required for utilizing specific carbon and nitrogen sources, as well as those required for growth at low temperature. We report the identification of enzymes catalyzing several missing steps in amino acid biosynthesis. Finally, we describe a suite of high-throughput phenotyping methodologies and apply them to provide a genome-wide analysis of competence and sporulation. Altogether, we provide versatile resources for studying gene function and pathway and network architecture in Gram-positive bacteria. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Using genic sequence capture in combination with a syntenic pseudo genome to map a deletion mutant in a wheat species.

PubMed

Gardiner, Laura-Jayne; Gawroński, Piotr; Olohan, Lisa; Schnurbusch, Thorsten; Hall, Neil; Hall, Anthony

2014-12-01

Mapping-by-sequencing analyses have largely required a complete reference sequence and employed whole genome re-sequencing. In species such as wheat, no finished genome reference sequence is available. Additionally, because of its large genome size (17 Gb), re-sequencing at sufficient depth of coverage is not practical. Here, we extend the utility of mapping by sequencing, developing a bespoke pipeline and algorithm to map an early-flowering locus in einkorn wheat (Triticum monococcum L.) that is closely related to the bread wheat genome A progenitor. We have developed a genomic enrichment approach using the gene-rich regions of hexaploid bread wheat to design a 110-Mbp NimbleGen SeqCap EZ in-solution capture probe set, representing the majority of genes in wheat. Here, we use the capture probe set to enrich and sequence an F2 mapping population of the mutant. The mutant locus was identified in T. monococcum, which lacks a complete genome reference sequence, by mapping the enriched data set onto pseudo-chromosomes derived from the capture probe target sequence, with a long-range order of genes based on synteny of wheat with Brachypodium distachyon. Using this approach we are able to map the region and identify a set of deleted genes within the interval. © 2014 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
DNA-Encoded Dynamic Combinatorial Chemical Libraries.

PubMed

Reddavide, Francesco V; Lin, Weilin; Lehnert, Sarah; Zhang, Yixin

2015-06-26

Dynamic combinatorial chemistry (DCC) explores the thermodynamic equilibrium of reversible reactions. Its application in the discovery of protein binders is largely limited by difficulties in the analysis of complex reaction mixtures. DNA-encoded chemical library (DECL) technology allows the selection of binders from a mixture of up to billions of different compounds; however, experimental results often show a low signal-to-noise ratio and poor correlation between enrichment factor and binding affinity. Herein we describe the design and application of DNA-encoded dynamic combinatorial chemical libraries (EDCCLs). Our experiments have shown that the EDCCL approach can be used not only to convert monovalent binders into high-affinity bivalent binders, but also to cause remarkably enhanced enrichment of potent bivalent binders by driving their in situ synthesis. We also demonstrate the application of EDCCLs in DNA-templated chemical reactions. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gordon, Sean P.; Contreras-Moreira, Bruno; Woods, Daniel P.

While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.

Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure.

PubMed

Gordon, Sean P; Contreras-Moreira, Bruno; Woods, Daniel P; Des Marais, David L; Burgess, Diane; Shu, Shengqiang; Stritt, Christoph; Roulin, Anne C; Schackwitz, Wendy; Tyler, Ludmila; Martin, Joel; Lipzen, Anna; Dochy, Niklas; Phillips, Jeremy; Barry, Kerrie; Geuten, Koen; Budak, Hikmet; Juenger, Thomas E; Amasino, Richard; Caicedo, Ana L; Goodstein, David; Davidson, Patrick; Mur, Luis A J; Figueroa, Melania; Freeling, Michael; Catalan, Pilar; Vogel, John P

2017-12-19

While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.
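The distinction drawn above between genes present in all lines and genes present in only some lines can be illustrated with a gene presence/absence table. The following is a minimal sketch on toy data, not the authors' pipeline; the function name `classify_pan_genome`, the labels "core" and "dispensable", and the line and gene identifiers are assumptions made only for illustration.

```python
from typing import Dict, Set


def classify_pan_genome(presence: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split genes into those found in every line and those found in only some.

    `presence` maps a line name to the set of gene identifiers annotated in it.
    """
    gene_sets = list(presence.values())
    if not gene_sets:
        return {"pan": set(), "core": set(), "dispensable": set()}
    pan = set().union(*gene_sets)            # every gene seen in any line
    core = set.intersection(*gene_sets)      # genes seen in all lines
    return {"pan": pan, "core": core, "dispensable": pan - core}


# Hypothetical toy input: three lines, five genes.
toy = {
    "line_A": {"g1", "g2", "g3"},
    "line_B": {"g1", "g2", "g4"},
    "line_C": {"g1", "g3", "g5"},
}
result = classify_pan_genome(toy)
print(len(result["pan"]), len(result["core"]), len(result["dispensable"]))
```

In this toy case the pan-genome is larger than the gene set of any single line, which is the qualitative pattern the abstract reports for the 54 sequenced lines.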
Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure

DOE PAGES

Gordon, Sean P.; Contreras-Moreira, Bruno; Woods, Daniel P.; ...

2017-12-19

While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.
Comparative genomics of Burkholderia multivorans, a ubiquitous pathogen with a highly conserved genomic structure

PubMed Central

Cooper, Vaughn S.; Hatcher, Philip J.; Verheyde, Bart; Carlier, Aurélien; Vandamme, Peter

2017-01-01

The natural environment serves as a reservoir of opportunistic pathogens. A well-established method for studying the epidemiology of such opportunists is multilocus sequence typing, which in many cases has defined strains predisposed to causing infection. Burkholderia multivorans is an important pathogen in people with cystic fibrosis (CF) and its epidemiology suggests that strains are acquired from non-human sources such as the natural environment. This raises the central question of whether the isolation source (CF or environment) or the multilocus sequence type (ST) of B. multivorans better predicts their genomic content and functionality. We identified four pairs of B. multivorans isolates, representing distinct STs and consisting of one CF and one environmental isolate each. All genomes were sequenced using the PacBio SMRT sequencing technology, which resulted in eight high-quality B. multivorans genome assemblies. The present study demonstrated that the genomic structure of the examined B. multivorans STs is highly conserved and that the B. multivorans genomic lineages are defined by their ST. Orthologous protein families were not uniformly distributed among chromosomes, with core orthologs being enriched on the primary chromosome and ST-specific orthologs being enriched on the second and third chromosome. The ST-specific orthologs were enriched in genes involved in defense mechanisms and secondary metabolism, corroborating the strain-specificity of these virulence characteristics. Finally, the same B. multivorans genomic lineages occur in both CF and environmental samples and on different continents, demonstrating their ubiquity and evolutionary persistence. PMID:28430818
Genomic clones for human cholinesterase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kott, M.; Venta, P.J.; Larsen, J.

1987-05-01

A human genomic library was prepared from peripheral white blood cells from a single donor by inserting an MboI partial digest into BamHI poly-linker sites of EMBL3. This library was screened using an oligolabeled human cholinesterase cDNA probe over 700 bp long. The latter probe was obtained from a human basal ganglia cDNA library. Of approximately 2 million clones screened with high stringency conditions, several positive clones were identified; two have been plaque purified. One of these clones has been partially mapped using restriction enzymes known to cut within the coded region of the cDNA for human serum cholinesterase. Hybridization of the fragments and their sizes are as expected if the genomic clone is cholinesterase. Sequencing of the DNA fragments in M13 is in progress to verify the identity of the clone and the location of introns.
Comparative genome map of human and cattle

DOE Office of Scientific and Technical Information (OSTI.GOV)

Solinas-Toldo, S.; Fries, R.; Lengauer, C.

Chromosomal homologies between individual human chromosomes and the bovine karyotype have been established by using a new approach termed Zoo-FISH. Labeled DNA libraries from flow-sorted human chromosomes were used as probes for fluorescence in situ hybridization on cattle chromosomes. All human DNA libraries, except the Y chromosome library, hybridized to one or more cattle chromosomes, identifying and delineating 50 segments of homology, most of them corresponding to the regions of homology as identified by the previous mapping of individual conserved loci. However, Zoo-FISH refines the comparative maps constructed by molecular gene mapping of individual loci by providing information on the boundaries of conserved regions in the absence of obvious cytogenetic homologies of human and bovine chromosomes. It allows study of karyotypic evolution and opens new avenues for genomic analysis by facilitating the extrapolation of results from the human genome initiative. 50 refs., 3 figs., 1 tab.
Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data.

PubMed

Olova, Nelly; Krueger, Felix; Andrews, Simon; Oxley, David; Berrens, Rebecca V; Branco, Miguel R; Reik, Wolf

2018-03-15

Whole-genome bisulfite sequencing (WGBS) is becoming an increasingly accessible technique, used widely for both fundamental and disease-oriented research. Library preparation methods benefit from a variety of available kits, polymerases and bisulfite conversion protocols. Although some steps in the procedure, such as PCR amplification, are known to introduce biases, a systematic evaluation of biases in WGBS strategies is missing. We perform a comparative analysis of several commonly used pre- and post-bisulfite WGBS library preparation protocols for their performance and quality of sequencing outputs. Our results show that bisulfite conversion per se is the main trigger of pronounced sequencing biases, and PCR amplification builds on these underlying artefacts. The majority of standard library preparation methods yield a significantly biased sequence output and overestimate global methylation. Importantly, both absolute and relative methylation levels at specific genomic regions vary substantially between methods, with clear implications for DNA methylation studies. We show that amplification-free library preparation is the least biased approach for WGBS. In protocols with amplification, the choice of bisulfite conversion protocol or polymerase can significantly minimize artefacts. To aid with the quality assessment of existing WGBS datasets, we have integrated a bias diagnostic tool in the Bismark package and offer several approaches for consideration during the preparation and analysis of WGBS datasets.
Selection dynamic of Escherichia coli host in M13 combinatorial peptide phage display libraries.

PubMed

Zanconato, Stefano; Minervini, Giovanni; Poli, Irene; De Lucrezia, Davide

2011-01-01

Phage display relies on an iterative cycle of selection and amplification of random combinatorial libraries to enrich the initial population of those peptides that satisfy a priori chosen criteria. The effectiveness of any phage display protocol depends directly on library amino acid sequence diversity and the strength of the selection procedure. In this study we monitored the dynamics of the selective pressure exerted by the host organism on a random peptide library in the absence of any additional selection pressure. The results indicate that sequence censorship exerted by Escherichia coli dramatically reduces library diversity and can significantly impair phage display effectiveness.
Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

PubMed Central

2011-01-01

Background: Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results: Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions: The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. In addition, the library has a large number of transcription factors and will be interesting for discovery and validation of drought or abiotic stress related genes in common bean. PMID:22118559
High-Throughput resequencing of maize landraces at genomic regions associated with flowering time

USDA-ARS's Scientific Manuscript database

Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next generation sequenci...
Enriched Title-Based Keyword Index Generation Using dBase II.

ERIC Educational Resources Information Center

Rajendran, P. P.

1986-01-01

Describes the use of a database management system (DBMS)--dBaseII--to create an enriched title-based keyword index for a collection of news items at the Renewable Energy Resources Information Center of the Asian Institute of Technology. The use of DBMSs in libraries in developing countries is emphasized. (Author/LRW)
Cell-free translational screening of an expression sequence tag library of Clonorchis sinensis for novel antigen discovery.

PubMed

Kasi, Devi; Catherine, Christy; Lee, Seung-Won; Lee, Kyung-Ho; Kim, Yu Jung; Ro Lee, Myeong; Ju, Jung Won; Kim, Dong-Myung

2017-05-01

The rapidly evolving cloning and sequencing technologies have enabled understanding of genomic structure of parasite genomes, opening up new ways of combatting parasite-related diseases. To make the most of the exponentially accumulating genomic data, however, it is crucial to analyze the proteins encoded by these genomic sequences. In this study, we adopted an engineered cell-free protein synthesis system for large-scale expression screening of an expression sequence tag (EST) library of Clonorchis sinensis to identify potential antigens that can be used for diagnosis and treatment of clonorchiasis. To allow high-throughput expression and identification of individual genes comprising the library, a cell-free synthesis reaction was designed such that both the template DNA and the expressed proteins were co-immobilized on the same microbeads, leading to microbead-based linkage of the genotype and phenotype. This reaction configuration allowed streamlined expression, recovery, and analysis of proteins. This approach enabled us to identify 21 antigenic proteins. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:832-837, 2017.
Microbial communities involved in methane production from hydrocarbons in oil sands tailings.

PubMed

Siddique, Tariq; Penner, Tara; Klassen, Jonathan; Nesbø, Camilla; Foght, Julia M

2012-09-04

Microbial metabolism of residual hydrocarbons, primarily short-chain n-alkanes and certain monoaromatic hydrocarbons, in oil sands tailings ponds produces large volumes of CH(4) in situ. We characterized the microbial communities involved in methanogenic biodegradation of whole naphtha (a bitumen extraction solvent) and its short-chain n-alkane (C(6)-C(10)) and BTEX (benzene, toluene, ethylbenzene, and xylenes) components using primary enrichment cultures derived from oil sands tailings. Clone libraries of bacterial 16S rRNA genes amplified from these enrichments showed increased proportions of two orders of Bacteria: Clostridiales and Syntrophobacterales, with Desulfotomaculum and Syntrophus/Smithella as the closest named relatives, respectively. In parallel archaeal clone libraries, sequences affiliated with cultivated acetoclastic methanogens (Methanosaetaceae) were enriched in cultures amended with n-alkanes, whereas hydrogenotrophic methanogens (Methanomicrobiales) were enriched with BTEX. Naphtha-amended cultures harbored a blend of these two archaeal communities. The results imply syntrophic oxidation of hydrocarbons in oil sands tailings, with the activities of different carbon flow pathways to CH(4) being influenced by the primary hydrocarbon substrate. These results have implications for predicting greenhouse gas emissions from oil sands tailings repositories.
BAC library development, and clone characterization for dormancy-responsive DREB4A, DAM, and FT from leafy spurge (Euphorbia esula L.) identifies differential splicing and conserved promoter motifs

USDA-ARS's Scientific Manuscript database

We developed two leafy spurge BAC libraries that together represent approximately 5X coverage of the leafy spurge genome. The BAC libraries have an average insert size of approximately 143 kb, and copies of the library and filters for hybridization-based screening are publicly available through the ...
Culture-dependent and culture-independent characterization of microbial assemblages associated with high-temperature petroleum reservoirs.

PubMed

Orphan, V J; Taylor, L T; Hafenbradl, D; Delong, E F

2000-02-01

Recent investigations of oil reservoirs in a variety of locales have indicated that these habitats may harbor active thermophilic prokaryotic assemblages. In this study, we used both molecular and culture-based methods to characterize prokaryotic consortia associated with high-temperature, sulfur-rich oil reservoirs in California. Enrichment cultures designed for anaerobic thermophiles, both autotrophic and heterotrophic, were successful at temperatures ranging from 60 to 90 degrees C. Heterotrophic enrichments from all sites yielded sheathed rods (Thermotogales), pleomorphic rods resembling Thermoanaerobacter, and Thermococcus-like isolates. The predominant autotrophic microorganisms recovered from inorganic enrichments using H(2), acetate, and CO(2) as energy and carbon sources were methanogens, including isolates closely related to Methanobacterium, Methanococcus, and Methanoculleus species. Two 16S rRNA gene (rDNA) libraries were generated from total community DNA collected from production wellheads, using either archaeal or universal oligonucleotide primer sets. Sequence analysis of the universal library indicated that a large percentage of clones were highly similar to known bacterial and archaeal isolates recovered from similar habitats. Represented genera in rDNA clone libraries included Thermoanaerobacter, Thermococcus, Desulfothiovibrio, Aminobacterium, Acidaminococcus, Pseudomonas, Halomonas, Acinetobacter, Sphingomonas, Methylobacterium, and Desulfomicrobium. The archaeal library was dominated by methanogen-like rDNAs, with a lower percentage of clones belonging to the Thermococcales. Our results strongly support the hypothesis that sulfur-utilizing and methane-producing thermophilic microorganisms have a widespread distribution in oil reservoirs and the potential to actively participate in the biogeochemical transformation of carbon, hydrogen, and sulfur in situ.
Transcriptome analysis of zebrafish embryos exposed to deltamethrin.

PubMed

Chueh, Tsung-Cheng; Hsu, Li-Sung; Kao, Chin-Ming; Hsu, Tung-Wei; Liao, Hung-Yu; Wang, Kuan-Yi; Chen, Ssu Ching

2017-05-01

Deltamethrin (DTM), a type II pyrethroid, is one of the most commonly used insecticides. The increased use of pyrethroids leads to potential adverse effects, particularly in sensitive populations such as children and pregnant women. None of the related studies focused on the transcriptome responses in zebrafish embryos after treatment with DTM; therefore, RNA-seq, a high-throughput method, was performed to analyze the global expression of differentially expressed genes (DEGs) in zebrafish embryos treated with DTM (40 and 80 μg/L) from fertilization to 48 h postfertilization (hpf) as compared with that in the control group (without DTM treatment). Two cDNA libraries were generated from treated embryos and one cDNA library from nontreated embryos. Over 92% of reads mapped to the reference in these three libraries. Many genes were differentially expressed between DTM-treated and untreated embryos. The 20 most differentially expressed upregulated or downregulated genes were mainly involved in signal transduction. Validation of the expression of nine selected genes using qRT-PCR confirmed the RNA-seq results. The transcriptome sequences were further subjected to gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, showing that the G-protein-coupled receptor signaling pathway and neuroactive ligand-receptor interaction, respectively, were most enriched. The data from this study contribute to a better understanding of the potential consequences of fish exposure to DTM and to an evaluation of the potential threat of DTM to fish populations in aquatic environments. © 2016 Wiley Periodicals, Inc. Environ Toxicol 32: 1548-1557, 2017.
A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

PubMed

Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

2013-07-18

Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
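The idea of a reverse-complement aware k-mer library can be made concrete with a small coverage checker. The sketch below is not the authors' de Bruijn graph decomposition; it only treats a k-mer and its reverse complement as the same double-stranded sequence and reports which k-mers a candidate set of oligomers leaves uncovered. All function names are hypothetical, and the example oligomer is arbitrary.

```python
from itertools import product
from typing import Iterable, Set

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def all_canonical_kmers(k: int) -> Set[str]:
    """Every k-mer over ACGT, collapsed by reverse complement."""
    return {canonical("".join(p)) for p in product("ACGT", repeat=k)}


def kmers_covered(oligo: str, k: int) -> Set[str]:
    """Canonical k-mers present on either strand of a double-stranded oligo."""
    return {canonical(oligo[i:i + k]) for i in range(len(oligo) - k + 1)}


def uncovered(oligos: Iterable[str], k: int) -> Set[str]:
    """Canonical k-mers that no oligo in the library contains."""
    covered: Set[str] = set()
    for oligo in oligos:
        covered |= kmers_covered(oligo, k)
    return all_canonical_kmers(k) - covered


# There are 4**6 = 4,096 single-stranded 6-mers, as stated in the abstract.
print(4 ** 6)
# An arbitrary 15-bp oligo has 10 overlapping 6-bp windows per strand.
print(len(kmers_covered("ACGGTCATTGCAGCT", 6)))
```

A library design could be validated under these assumptions by checking that `uncovered(library, 6)` returns an empty set.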
Biased selection of propagation-related TUPs from phage display peptide libraries.

PubMed

Zade, Hesam Motaleb; Keshavarz, Reihaneh; Shekarabi, Hosna Sadat Zahed; Bakhshinejad, Babak

2017-08-01

Phage display is rapidly advancing as a screening strategy in drug discovery and drug delivery. Phage-encoded combinatorial peptide libraries can be screened through the affinity selection procedure of biopanning to find pharmaceutically relevant cell-specific ligands. However, the unwanted enrichment of target-unrelated peptides (TUPs) with no true affinity for the target presents an important barrier to the successful screening of phage display libraries. Propagation-related TUPs (Pr-TUPs) are an emerging but less-studied category of phage display-derived false-positive hits that are displayed on the surface of clones with faster propagation rates. Despite long being regarded as an unbiased selection system, accumulating evidence suggests that biopanning may create biological bias toward selection of phage clones with certain displayed peptides. This bias can be dependent on or independent of the displayed sequence and may act as a major driving force for the isolation of fast-growing clones. Sequence-dependent bias is reflected by censorship or over-representation of some amino acids in the displayed peptide and sequence-independent bias is derived from either point mutations or rare recombination events occurring in the phage genome. It is of utmost interest to clean biopanning data by identifying and removing Pr-TUPs. Experimental and bioinformatic approaches can be exploited for Pr-TUP discovery. Without doubt, obtaining deeper insight into how Pr-TUPs emerge during biopanning and how they could be detected provides a basis for using cell-targeting peptides isolated from phage display screening in the development of disease-specific diagnostic and therapeutic platforms.
Genome sequencing of a single tardigrade Hypsibius dujardini individual

PubMed Central

Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru

2016-01-01

Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies. PMID:27529330
Genome sequencing of a single tardigrade Hypsibius dujardini individual.

PubMed

Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru

2016-08-16

Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies.
Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.

2005-08-26

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

Reference quality assembly of the 3.5-Gb genome of Capsicum annuum from a single linked-read library.

PubMed

Hulse-Kemp, Amanda M; Maheshwari, Shamoni; Stoffel, Kevin; Hill, Theresa A; Jaffe, David; Williams, Stephen R; Weisenfeld, Neil; Ramakrishnan, Srividya; Kumar, Vijay; Shah, Preyas; Schatz, Michael C; Church, Deanna M; Van Deynze, Allen

2018-01-01

Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Plant genomes, including pepper, are characterized by long, highly similar repetitive sequences. Accordingly, significant effort is used to ensure that the sequenced plant is highly homozygous and the resulting assembly is a haploid consensus. With a phased assembly approach, we targeted a heterozygous F1 derived from a wide cross to assess the ability to derive both haplotypes and characterize a pungency gene with a large insertion/deletion. The Supernova software generated a highly ordered, more contiguous sequence assembly than all currently available C. annuum reference genomes. Over 83% of the final assembly was anchored and oriented using four publicly available de novo linkage maps. A comparison of the annotation of conserved eukaryotic genes indicated the completeness of assembly. The validity of the phased assembly is further demonstrated with the complete recovery of both 2.5-Kb insertion/deletion haplotypes of the PUN1 locus in the F1 sample that represents pungent and nonpungent peppers, as well as nearly full recovery of the BUSCO2 gene set within each of the two haplotypes. The most contiguous pepper genome assembly to date has been generated, which demonstrates that Linked-Read library technology provides a tool to de novo assemble complex highly repetitive heterozygous plant genomes. This technology can provide an opportunity to cost-effectively develop high-quality genome assemblies for other complex plants and compare structural and gene differences through accurate haplotype reconstruction.
Versatile P(acman) BAC Libraries for Transgenesis Studies in Drosophila melanogaster

DOE Office of Scientific and Technical Information (OSTI.GOV)

Venken, Koen J.T.; Carlson, Joseph W.; Schulze, Karen L.

2009-04-21

We constructed Drosophila melanogaster BAC libraries with 21-kb and 83-kb inserts in the P(acman) system. Clones representing 12-fold coverage and encompassing more than 95 percent of annotated genes were mapped onto the reference genome. These clones can be integrated into predetermined attP sites in the genome using Phi C31 integrase to rescue mutations. They can be modified through recombineering, for example to incorporate protein tags and assess expression patterns.
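Fold coverage figures such as the one quoted above are often related to library representation through the Clarke-Carbon relation, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size. The following is a rough sketch that assumes clones are sampled uniformly at random, an idealization the abstract does not claim. It estimates the chance that a given position is represented, which is a different quantity from the fraction of annotated genes fully contained in mapped clones. The function names and the example sizes are hypothetical.

```python
import math


def probability_represented(fold_coverage: float) -> float:
    """Poisson approximation: chance a given position is in at least one clone."""
    return 1.0 - math.exp(-fold_coverage)


def clones_needed(genome_bp: int, insert_bp: int, probability: float) -> int:
    """Clarke-Carbon estimate of clones needed to reach a target probability."""
    fraction = insert_bp / genome_bp  # share of the genome held by one clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# Under the random-sampling assumption, 12-fold coverage leaves very little
# of the genome unrepresented.
print(round(probability_represented(12), 6))

# Illustrative sizes only (not taken from the abstract): a 1-Mb genome
# cloned as 10-kb inserts, with a 99% target probability.
print(clones_needed(1_000_000, 10_000, 0.99))
```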
Selective recruitment of nuclear factors to productively replicating herpes simplex virus genomes.

PubMed

Dembowski, Jill A; DeLuca, Neal A

2015-05-01

Much of the HSV-1 life cycle is carried out in the cell nucleus, including the expression, replication, repair, and packaging of viral genomes. Viral proteins, as well as cellular factors, play essential roles in these processes. Isolation of proteins on nascent DNA (iPOND) was developed to label and purify cellular replication forks. We adapted aspects of this method to label viral genomes to both image and purify replicating HSV-1 genomes for the identification of associated proteins. Many viral and cellular factors were enriched on viral genomes, including factors that mediate DNA replication, repair, chromatin remodeling, transcription, and RNA processing. As infection proceeded, packaging and structural components were enriched to a greater extent. Among the more abundant proteins that copurified with genomes were the viral transcription factor ICP4 and the replication protein ICP8. Furthermore, all seven viral replication proteins were enriched on viral genomes, along with cellular PCNA and topoisomerases, while other cellular replication proteins were not detected. The chromatin-remodeling complexes present on viral genomes included the INO80, SWI/SNF, NURD, and FACT complexes, which may prevent chromatinization of the genome. Consistent with this conclusion, histones were not readily recovered with purified viral genomes, and imaging studies revealed an underrepresentation of histones on viral genomes. RNA polymerase II, the mediator complex, TFIID, TFIIH, and several other transcriptional activators and repressors were also affinity purified with viral DNA. The presence of INO80, NURD, SWI/SNF, mediator, TFIID, and TFIIH components is consistent with previous studies in which these complexes copurified with ICP4. Therefore, ICP4 is likely involved in the recruitment of these key cellular chromatin remodeling and transcription factors to viral genomes. Taken together, iPOND is a valuable method for the study of viral genome dynamics during infection and provides a comprehensive view of how HSV-1 selectively utilizes cellular resources.
Long non-coding RNAs and mRNAs profiling during spleen development in pig.

PubMed

Che, Tiandong; Li, Diyan; Jin, Long; Fu, Yuhua; Liu, Yingkai; Liu, Pengliang; Wang, Yixin; Tang, Qianzi; Ma, Jideng; Wang, Xun; Jiang, Anan; Li, Xuewei; Li, Mingzhou

2018-01-01

Genome-wide transcriptomic studies in humans and mice have become extensive and mature. However, a comprehensive and systematic understanding of protein-coding genes and long non-coding RNAs (lncRNAs) expressed during pig spleen development has not been achieved. LncRNAs are known to participate in regulatory networks for an array of biological processes. Here, we constructed 18 RNA libraries from developing fetal pig spleen (55 days before birth), postnatal pig spleens (0, 30, 180 days and 2 years after birth), and the samples from the 2-year-old Wild Boar. A total of 15,040 lncRNA transcripts were identified among these samples. We found that the temporal expression pattern of lncRNAs was more restricted than observed for protein-coding genes. Time-series analysis showed two large modules for protein-coding genes and lncRNAs. The up-regulated module was enriched for genes related to immune and inflammatory function, while the down-regulated module was enriched for cell proliferation processes such as cell division and DNA replication. Co-expression networks indicated the functional relatedness between protein-coding genes and lncRNAs, which were enriched for similar functions over the series of time points examined. We identified numerous differentially expressed protein-coding genes and lncRNAs in all five developmental stages. Notably, ceruloplasmin precursor (CP), a protein-coding gene participating in antioxidant and iron transport processes, was differentially expressed in all stages. This study provides the first catalog of the developing pig spleen, and contributes to a fuller understanding of the molecular mechanisms underpinning mammalian spleen development.
Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

PubMed

Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

2017-08-30

To better understand the molecular mechanisms and gene expression characteristics associated with the development of bast fiber cells within flax stem phloem, gene expression in flax stem peels and leaves was profiled using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression patterns of all nine genes determined by qRT-PCR agreed well with those obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to the metabolic process, catalytic activity and binding categories were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.
Construction and characterization of two BAC libraries representing a deep-coverage of the genome of chicory (Cichorium intybus L., Asteraceae)

PubMed Central

2010-01-01

Background: The Asteraceae represents an important plant family with respect to the numbers of species present in the wild and used by man. Nonetheless, genomic resources for Asteraceae species are relatively underdeveloped, hampering within species genetic studies as well as comparative genomics studies at the family level. So far, six BAC libraries have been described for the main crops of the family, i.e. lettuce and sunflower. Here we present the characterization of BAC libraries of chicory (Cichorium intybus L.) constructed from two genotypes differing in traits related to sexual and vegetative reproduction. Resolving the molecular mechanisms underlying traits controlling the reproductive system of chicory is a key determinant for hybrid development, and more generally will provide new insights into these traits, which are poorly investigated so far at the molecular level in Asteraceae. Findings: Two bacterial artificial chromosome (BAC) libraries, CinS2S2 and CinS1S4, were constructed from HindIII-digested high molecular weight DNA of the contrasting genotypes C15 and C30.01, respectively. C15 was hermaphrodite, non-embryogenic, and S2S2 for the S-locus implicated in self-incompatibility, whereas C30.01 was male sterile, embryogenic, and S1S4. The CinS2S2 and CinS1S4 libraries contain 89,088 and 81,408 clones. Mean insert sizes of the CinS2S2 and CinS1S4 clones are 90 and 120 kb, respectively, and provide together a coverage of 12.3 haploid genome equivalents. Contamination with mitochondrial and chloroplast DNA sequences was evaluated with four mitochondrial and four chloroplast specific probes, and was estimated to be 0.024% and 1.00% for the CinS2S2 library, and 0.028% and 2.35% for the CinS1S4 library. Using two single copy genes putatively implicated in somatic embryogenesis, screening of both libraries resulted in detection of 12 and 13 positive clones for each gene, in accordance with expected numbers. Conclusions: This indicated that both BAC libraries are valuable tools for molecular studies in chicory, one goal being the positional cloning of the S-locus in this Asteraceae species. PMID:20701751
Construction and characterization of two BAC libraries representing a deep-coverage of the genome of chicory (Cichorium intybus L., Asteraceae).

PubMed

Gonthier, Lucy; Bellec, Arnaud; Blassiau, Christelle; Prat, Elisa; Helmstetter, Nicolas; Rambaud, Caroline; Huss, Brigitte; Hendriks, Theo; Bergès, Hélène; Quillet, Marie-Christine

2010-08-11

The Asteraceae represents an important plant family with respect to the numbers of species present in the wild and used by man. Nonetheless, genomic resources for Asteraceae species are relatively underdeveloped, hampering within species genetic studies as well as comparative genomics studies at the family level. So far, six BAC libraries have been described for the main crops of the family, i.e. lettuce and sunflower. Here we present the characterization of BAC libraries of chicory (Cichorium intybus L.) constructed from two genotypes differing in traits related to sexual and vegetative reproduction. Resolving the molecular mechanisms underlying traits controlling the reproductive system of chicory is a key determinant for hybrid development, and more generally will provide new insights into these traits, which are poorly investigated so far at the molecular level in Asteraceae. Two bacterial artificial chromosome (BAC) libraries, CinS2S2 and CinS1S4, were constructed from HindIII-digested high molecular weight DNA of the contrasting genotypes C15 and C30.01, respectively. C15 was hermaphrodite, non-embryogenic, and S2S2 for the S-locus implicated in self-incompatibility, whereas C30.01 was male sterile, embryogenic, and S1S4. The CinS2S2 and CinS1S4 libraries contain 89,088 and 81,408 clones. Mean insert sizes of the CinS2S2 and CinS1S4 clones are 90 and 120 kb, respectively, and provide together a coverage of 12.3 haploid genome equivalents. Contamination with mitochondrial and chloroplast DNA sequences was evaluated with four mitochondrial and four chloroplast specific probes, and was estimated to be 0.024% and 1.00% for the CinS2S2 library, and 0.028% and 2.35% for the CinS1S4 library. Using two single copy genes putatively implicated in somatic embryogenesis, screening of both libraries resulted in detection of 12 and 13 positive clones for each gene, in accordance with expected numbers. 
This indicated that both BAC libraries are valuable tools for molecular studies in chicory, one goal being the positional cloning of the S-locus in this Asteraceae species.
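The coverage figure in the chicory record can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that coverage equals total cloned DNA divided by haploid genome size, and it back-calculates the genome size implied by the reported 12.3 genome equivalents (the abstract itself does not state a genome size). The names `libraries`, `cloned_kb` and `implied_genome_mb` are introduced here for illustration.

```python
# Figures quoted in the abstract above.
libraries = {
    "CinS2S2": {"clones": 89_088, "mean_insert_kb": 90},
    "CinS1S4": {"clones": 81_408, "mean_insert_kb": 120},
}
reported_coverage = 12.3  # haploid genome equivalents, both libraries together


def cloned_kb(lib):
    """Total cloned DNA in kb: number of clones times mean insert size."""
    return lib["clones"] * lib["mean_insert_kb"]


total_kb = sum(cloned_kb(lib) for lib in libraries.values())

# Assumption: coverage = total cloned DNA / haploid genome size, so the
# genome size implied by the reported coverage can be back-calculated.
implied_genome_mb = total_kb / reported_coverage / 1_000

print(f"total cloned DNA: {total_kb / 1e6:.2f} Gb")
print(f"implied haploid genome size: {implied_genome_mb:.0f} Mb")
```

Under the same assumption, a single-copy gene should appear in roughly as many clones as there are genome equivalents (about 12), which is in line with the 12 and 13 positive clones reported.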
Analysis of disease-associated objects at the Rat Genome Database

PubMed Central

Wang, Shur-Jen; Laulederkind, Stanley J. F.; Hayman, G. T.; Smith, Jennifer R.; Petri, Victoria; Lowry, Timothy F.; Nigam, Rajni; Dwinell, Melinda R.; Worthey, Elizabeth A.; Munzenmaier, Diane H.; Shimoyama, Mary; Jacob, Howard J.

2013-01-01

The Rat Genome Database (RGD) is the premier resource for genetic, genomic and phenotype data for the laboratory rat, Rattus norvegicus. In addition to organizing biological data from rats, the RGD team focuses on manual curation of gene–disease associations for rat, human and mouse. In this work, we have analyzed disease-associated strains, quantitative trait loci (QTL) and genes from rats. These disease objects form the basis for seven disease portals. Among disease portals, the cardiovascular disease and obesity/metabolic syndrome portals have the highest number of rat strains and QTL. These two portals share 398 rat QTL, and these shared QTL are highly concentrated on rat chromosomes 1 and 2. For disease-associated genes, we performed gene ontology (GO) enrichment analysis across portals using RatMine enrichment widgets. Fifteen GO terms, five from each GO aspect, were selected to profile enrichment patterns of each portal. Of the selected biological process (BP) terms, ‘regulation of programmed cell death’ was the top enriched term across all disease portals except in the obesity/metabolic syndrome portal where ‘lipid metabolic process’ was the most enriched term. ‘Cytosol’ and ‘nucleus’ were common cellular component (CC) annotations for disease genes, but only the cancer portal genes were highly enriched with ‘nucleus’ annotations. Similar enrichment patterns were observed in a parallel analysis using the DAVID functional annotation tool. The relationship between the preselected 15 GO terms and disease terms was examined reciprocally by retrieving rat genes annotated with these preselected terms. The individual GO term–annotated gene list showed enrichment in physiologically related diseases. For example, the ‘regulation of blood pressure’ genes were enriched with cardiovascular disease annotations, and the ‘lipid metabolic process’ genes with obesity annotations. 
Furthermore, we were able to enhance enrichment of neurological diseases by combining ‘G-protein coupled receptor binding’ annotated genes with ‘protein kinase binding’ annotated genes. Database URL: http://rgd.mcw.edu PMID:23794737
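The GO enrichment described in the RGD record can be illustrated with a one-sided hypergeometric test, a common choice for this kind of analysis. The abstract does not say which statistic the RatMine widgets or DAVID use, so the test here is an assumption, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when drawing `sample` genes without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only for illustration (not from the paper).
p = hypergeom_enrichment_p(population=20_000, annotated=400, sample=150, hits=12)
print(f"enrichment p-value: {p:.3g}")
```

With these made-up counts, about 3 annotated genes would be expected by chance, so 12 hits gives a small p-value. A real analysis would also correct for testing many GO terms.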
High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency.

PubMed

Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J; Burnett, John C; Zhou, Jiehua

2016-09-22

The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct "biased sequences" and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the "biased sequences" was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy.
High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency

PubMed Central

Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J.; Burnett, John C.; Zhou, Jiehua

2016-01-01

The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct “biased sequences” and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the “biased sequences” was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy. PMID:27652575
A High-Quality Reference Genome for the Invasive Mosquitofish Gambusia affinis Using a Chicago Library.

PubMed

Hoffberg, Sandra L; Troendle, Nicholas J; Glenn, Travis C; Mahmud, Ousman; Louha, Swarnali; Chalopin, Domitille; Bennetzen, Jeffrey L; Mauricio, Rodney

2018-04-27

The western mosquitofish, Gambusia affinis, is a freshwater poecilid fish native to the southeastern United States but with a global distribution due to widespread human introduction. Gambusia affinis has been used as a model species for a broad range of evolutionary and ecological studies. We sequenced the genome of a male G. affinis to facilitate genetic studies in diverse fields including invasion biology and comparative genetics. We generated Illumina short read data from paired-end libraries and in vitro proximity-ligation libraries. We obtained 54.9× coverage, N50 contig length of 17.6 kb, and N50 scaffold length of 6.65 Mb. Compared to two other species in the Poeciliidae family, G. affinis has slightly fewer genes that have shorter total, exon, and intron length on average. Using a set of universal single-copy orthologs in fish genomes, we found 95.5% of these genes were complete in the G. affinis assembly. The number of transposable elements in the G. affinis assembly is similar to those of closely related species. The high-quality genome sequence and annotations we report will be valuable resources for scientists to map the genetic architecture of traits of interest in this species. Copyright © 2018, G3: Genes, Genomes, Genetics.
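The N50 statistics quoted for the G. affinis assembly follow a standard definition, which can be sketched in a few lines. This is a generic illustration and not the authors' pipeline; the function name `n50` and the toy contig lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of length
    >= L together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy contig lengths in bp (hypothetical, not from the G. affinis assembly).
contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(contigs))  # prints 70: 80 + 70 = 150, which is half of 300
```

The same function applies to scaffold lengths, which is how a contig N50 and a scaffold N50 can differ so widely for one assembly.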
Next-generation libraries for robust RNA interference-based genome-wide screens

PubMed Central

Kampmann, Martin; Horlbeck, Max A.; Chen, Yuwen; Tsai, Jordan C.; Bassik, Michael C.; Gilbert, Luke A.; Villalta, Jacqueline E.; Kwon, S. Chul; Chang, Hyeshik; Kim, V. Narry; Weissman, Jonathan S.

2015-01-01

Genetic screening based on loss-of-function phenotypes is a powerful discovery tool in biology. Although the recent development of clustered regularly interspaced short palindromic repeats (CRISPR)-based screening approaches in mammalian cell culture has enormous potential, RNA interference (RNAi)-based screening remains the method of choice in several biological contexts. We previously demonstrated that ultracomplex pooled short-hairpin RNA (shRNA) libraries can largely overcome the problem of RNAi off-target effects in genome-wide screens. Here, we systematically optimize several aspects of our shRNA library, including the promoter and microRNA context for shRNA expression, selection of guide strands, and features relevant for postscreen sample preparation for deep sequencing. We present next-generation high-complexity libraries targeting human and mouse protein-coding genes, which we grouped into 12 sublibraries based on biological function. A pilot screen suggests that our next-generation RNAi library performs comparably to current CRISPR interference (CRISPRi)-based approaches and can yield complementary results with high sensitivity and high specificity. PMID:26080438
A Fast Solution to NGS Library Prep with Low Nanogram DNA Input

PubMed Central

Liu, Pingfang; Lohman, Gregory J.S.; Cantor, Eric; Langhorst, Bradley W.; Yigit, Erbay; Apone, Lynne M.; Munafo, Daniela B.; Stewart, Fiona J.; Evans, Thomas C.; Nichols, Nicole; Dimalanta, Eileen T.; Davis, Theodore B.; Sumner, Christine

2013-01-01

Next Generation Sequencing (NGS) has significantly impacted human genetics, enabling a comprehensive characterization of the human genome as well as a better understanding of many genomic abnormalities. By delivering massive DNA sequences at unprecedented speed and cost, NGS promises to make personalized medicine a reality in the foreseeable future. To date, library construction with clinical samples has been a challenge, primarily due to the limited quantities of sample DNA available. Our objective here was to overcome this challenge by developing NEBNext® Ultra DNA Library Prep Kit, a fast library preparation method. Specifically, we streamlined the workflow utilizing novel NEBNext reagents and adaptors, including a new DNA polymerase that has been optimized to minimize GC bias. As a result of this work, we have developed a simple method for library construction from an amount of DNA as low as 5 ng, which can be used for both intact and fragmented DNA. Moreover, the workflow is compatible with multiple NGS platforms.
Sequencing thousands of single-cell genomes with combinatorial indexing.

PubMed

Vitak, Sarah A; Torkenczy, Kristof A; Rosenkrantz, Jimi L; Fields, Andrew J; Christiansen, Lena; Wong, Melissa H; Carbone, Lucia; Steemers, Frank J; Adey, Andrew

2017-03-01

Single-cell genome sequencing has proven valuable for the detection of somatic variation, particularly in the context of tumor evolution. Current technologies suffer from high library construction costs, which restrict the number of cells that can be assessed and thus impose limitations on the ability to measure heterogeneity within a tissue. Here, we present single-cell combinatorial indexed sequencing (SCI-seq) as a means of simultaneously generating thousands of low-pass single-cell libraries for detection of somatic copy-number variants. We constructed libraries for 16,698 single cells from a combination of cultured cell lines, primate frontal cortex tissue and two human adenocarcinomas, and obtained a detailed assessment of subclonal variation within a pancreatic tumor.
Proteogenomics connects somatic mutations to signalling in breast cancer.

PubMed

Mertins, Philipp; Mani, D R; Ruggles, Kelly V; Gillette, Michael A; Clauser, Karl R; Wang, Pei; Wang, Xianlong; Qiao, Jana W; Cao, Song; Petralia, Francesca; Kawaler, Emily; Mundt, Filip; Krug, Karsten; Tu, Zhidong; Lei, Jonathan T; Gatza, Michael L; Wilkerson, Matthew; Perou, Charles M; Yellapantula, Venkata; Huang, Kuan-lin; Lin, Chenwei; McLellan, Michael D; Yan, Ping; Davies, Sherri R; Townsend, R Reid; Skates, Steven J; Wang, Jing; Zhang, Bing; Kinsinger, Christopher R; Mesri, Mehdi; Rodriguez, Henry; Ding, Li; Paulovich, Amanda G; Fenyö, David; Ellis, Matthew J; Carr, Steven A

2016-06-02

Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures, connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.
Reducing assembly complexity of microbial genomes with single-molecule sequencing.

PubMed

Koren, Sergey; Harhay, Gregory P; Smith, Timothy P L; Bono, James L; Harhay, Dayna M; Mcvey, Scott D; Radune, Diana; Bergman, Nicholas H; Phillippy, Adam M

2013-01-01

The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem. To measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library. These single-library assemblies are also more accurate than typical short-read assemblies and hybrid assemblies of short and long reads. Automated assembly of long, single-molecule sequencing data reduces the cost of microbial finishing to $1,000 for most genomes, and future advances in this technology are expected to drive the cost lower. This is expected to increase the number of completed genomes, improve the quality of microbial genome databases, and enable high-fidelity, population-scale studies of pan-genomes and chromosomal organization.
The Influence of Literacy-Enriched Play Settings on Preschoolers' Conceptions of Print.

ERIC Educational Resources Information Center

Neuman, Susan B.; Roskos, Kathy

This study examined the influence of literacy-enriched play centers on preschoolers' conceptions of print. Subjects, 25 boys and 12 girls aged 4 and 5 years from 2 urban preschool classes, were systematically observed before and after 5 basic design changes were made in the classrooms. Four distinct play centers (post office, library, office, and…
Development and characterization of genomic SSR markers in Cynodon transvaalensis Burtt-Davy.

PubMed

Tan, Chengcheng; Wu, Yanqi; Taliaferro, Charles M; Bell, Greg E; Martin, Dennis L; Smith, Mike W

2014-08-01

Simple sequence repeat (SSR) markers are a major molecular tool for genetic and genomic research that have been extensively developed and used in major crops. However, few are available in African bermudagrass (Cynodon transvaalensis Burtt-Davy), an economically important warm-season turfgrass species. African bermudagrass is mainly used for hybridizations with common bermudagrass [C. dactylon var. dactylon (L.) Pers.] in the development of superior interspecific hybrid turfgrass cultivars. Accordingly, the major objective of this study was to develop and characterize a large set of SSR markers. Genomic DNA of C. transvaalensis '4200TN 24-2' from an Oklahoma State University (OSU) turf nursery was extracted for construction of four SSR genomic libraries enriched with [CA](n), [GA](n), [AAG](n), and [AAT](n) as core repeat motifs. A total of 3,064 clones were sequenced at the OSU core facility. The sequences were categorized into singletons and contiguous sequences to exclude redundancy. From the two sequence categories, 1,795 SSR loci were identified. After excluding duplicate SSRs by comparison with previously developed SSR markers using a nucleotide basic local alignment tool, 1,426 unique primer pairs (PPs) were designed. Out of the 1,426 designed PPs, 981 (68.8%) amplified alleles of the expected size in the donor DNA. Of the SSR PPs tested in eight C. transvaalensis plants, 93% were polymorphic, with 544 markers effective in all genotypes. Inheritance of the SSRs was examined in six F(1) progeny of African parents 'T577' × 'Uganda', indicating that 917 markers amplified heritable alleles. The SSR markers developed in the study are the first large set of co-dominant markers in African bermudagrass and should be highly valuable for molecular and traditional breeding research.
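Identifying SSR loci in clone sequences, as described in the bermudagrass record, amounts to searching for tandem runs of a short motif. The sketch below is a minimal illustration with a regular expression and is not the authors' method. The four motifs come from the abstract, while the minimum repeat count, the function name `find_ssrs` and the example sequence are assumptions.

```python
import re

# Core motifs named in the abstract; the minimum repeat count is an assumption.
MOTIFS = ("CA", "GA", "AAG", "AAT")
MIN_REPEATS = 5


def find_ssrs(seq, motifs=MOTIFS, min_repeats=MIN_REPEATS):
    """Return (motif, start, repeat_count) for each perfect tandem run."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # e.g. "(?:CA){5,}" matches five or more consecutive copies of CA.
        pattern = re.compile(f"(?:{motif}){{{min_repeats},}}")
        for m in pattern.finditer(seq):
            hits.append((motif, m.start(), len(m.group()) // len(motif)))
    return sorted(hits, key=lambda h: h[1])


# Made-up sequence for illustration only: six CA repeats and five AAT repeats.
example = "TTGA" + "CA" * 6 + "TTGC" + "AAT" * 5 + "GG"
print(find_ssrs(example))  # [('CA', 4, 6), ('AAT', 20, 5)]
```

This only finds perfect repeats on the given strand. A fuller search would also scan the reverse complement and tolerate interrupted repeats.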
Wheat EST resources for functional genomics of abiotic stress

PubMed Central

Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey

2006-01-01

Background: Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results: We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion: We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals. PMID:16772040
Begin at the beginning: A BAC-end view of the passion fruit (Passiflora) genome.

PubMed

Santos, Anselmo Azevedo; Penha, Helen Alves; Bellec, Arnaud; Munhoz, Carla de Freitas; Pedrosa-Harand, Andrea; Bergès, Hélène; Vieira, Maria Lucia Carneiro

2014-09-26

The passion fruit (Passiflora edulis) is a tropical crop of economic importance both for juice production and consumption as fresh fruit. The juice is also used in concentrate blends that are consumed worldwide. However, very little is known about the genome of the species. Therefore, improving our understanding of passion fruit genomics is essential and to some degree a pre-requisite if its genetic resources are to be used more efficiently. In this study, we have constructed a large-insert BAC library and provided the first view on the structure and content of the passion fruit genome, using BAC-end sequence (BES) data as a major resource. The library consisted of 82,944 clones and its levels of organellar DNA were very low. The library represents six haploid genome equivalents, and the average insert size was 108 kb. To check its utility for gene isolation, successful macroarray screening experiments were carried out with probes complementary to eight Passiflora gene sequences available in public databases. BACs harbouring those genes were used in fluorescent in situ hybridizations and unique signals were detected for four BACs in three chromosomes (n=9). Then, we explored 10,000 BES, identified reads likely to contain repetitive mobile elements (19.6% of all BES), simple sequence repeats and putative proteins, and estimated the GC content (~42%) of the reads. Around 9.6% of all BES were found to have high levels of similarity to plant genes and ontological terms were assigned to more than half of the sequences analysed (940). The vast majority of the top-hits made by our sequences were to Populus trichocarpa (24.8% of the total occurrences), Theobroma cacao (21.6%), Ricinus communis (14.3%), Vitis vinifera (6.5%) and Prunus persica (3.8%). We generated the first large-insert library for a member of Passifloraceae. This BAC library provides a new resource for genetic and genomic studies and represents a valuable tool for future whole-genome studies. Remarkably, a number of BAC-end pair sequences could be mapped to intervals of the sequenced Arabidopsis thaliana, V. vinifera and P. trichocarpa chromosomes, and putative collinear microsyntenic regions were identified.
Generation of an arrayed CRISPR-Cas9 library targeting epigenetic regulators: from high-content screens to in vivo assays

PubMed Central

2017-01-01

The CRISPR-Cas9 system has revolutionized genome engineering, allowing precise modification of DNA in various organisms. The most popular method for conducting CRISPR-based functional screens involves the use of pooled lentiviral libraries in selection screens coupled with next-generation sequencing. Screens employing genome-scale pooled small guide RNA (sgRNA) libraries are demanding, particularly when complex assays are used. Furthermore, pooled libraries are not suitable for microscopy-based high-content screens or for systematic interrogation of protein function. To overcome these limitations and exploit CRISPR-based technologies to comprehensively investigate epigenetic mechanisms, we have generated a focused sgRNA library targeting 450 epigenetic regulators with multiple sgRNAs in human cells. The lentiviral library is available both in an arrayed and pooled format and allows temporally-controlled induction of gene knock-out. Characterization of the library showed high editing activity of most sgRNAs and efficient knock-out at the protein level in polyclonal populations. The sgRNA library can be used for both selection and high-content screens, as well as for targeted investigation of selected proteins without requiring isolation of knock-out clones. Using a variety of functional assays we show that the library is suitable for both in vitro and in vivo applications, representing a unique resource to study epigenetic mechanisms in physiological and pathological conditions. PMID:29327641
A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries

PubMed Central

2011-01-01

Genome targeting methods enable cost-effective capture of specific subsets of the genome for sequencing. We present here an automated, highly scalable method for carrying out the Solution Hybrid Selection capture approach that provides a dramatic increase in scale and throughput of sequence-ready libraries produced. Significant process improvements and a series of in-process quality control checkpoints are also added. These process improvements can also be used in a manual version of the protocol. PMID:21205303
Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library.

PubMed

Sánchez, Cecilia Castaño; Smith, Timothy P L; Wiedmann, Ralph T; Vallejo, Roger L; Salem, Mohamed; Yao, Jianbo; Rexroad, Caird E

2009-11-25

To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. 
In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts. The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionary recent genome duplication would be desirable.
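The coverage and validation figures quoted in this abstract are simple ratios. The following is a minimal Python sketch of that arithmetic, using only the numbers given above; the function names are illustrative and do not come from the study.

```python
def fold_coverage(n_reads: int, n_targets: int) -> float:
    """Average reads per target, assuming reads are spread evenly over targets."""
    return n_reads / n_targets


def validation_rate(n_validated: int, n_tested: int) -> float:
    """Fraction of genotyped putative SNPs that were confirmed."""
    return n_validated / n_tested


# Numbers taken from the abstract.
reads = 2_000_000
fragment_ends = 300_000   # two ends for each of ~150,000 restriction fragments
tested_snps = 384
validated_snps = 183

print(f"reads per fragment end: {fold_coverage(reads, fragment_ends):.1f}")
print(f"validation rate: {100 * validation_rate(validated_snps, tested_snps):.1f}%")
```

The naive ratio of reads to fragment ends comes out slightly above the quoted 6-fold figure. The abstract does not say how its figure was derived (for example, whether reads were filtered first), so the sketch should be read only as a plausibility check.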
MicroRNA-guided prioritization of genome-wide association signals reveals the importance of microRNA-target gene networks for complex traits in cattle.

PubMed

Fang, Lingzhao; Sørensen, Peter; Sahana, Goutam; Panitz, Frank; Su, Guosheng; Zhang, Shengli; Yu, Ying; Li, Bingjie; Ma, Li; Liu, George; Lund, Mogens Sandø; Thomsen, Bo

2018-06-19

MicroRNAs (miRNA) are key modulators of gene expression and so act as putative fine-tuners of complex phenotypes. Here, we hypothesized that causal variants of complex traits are enriched in miRNAs and miRNA-target networks. First, we conducted a genome-wide association study (GWAS) for seven functional and milk production traits using imputed sequence variants (13~15 million) and >10,000 animals from three dairy cattle breeds, i.e., Holstein (HOL), Nordic red cattle (RDC) and Jersey (JER). Second, we analyzed for enrichments of association signals in miRNAs and their miRNA-target networks. Our results demonstrated that genomic regions harboring miRNA genes were significantly (P < 0.05) enriched with GWAS signals for milk production traits and mastitis, and that enrichments within miRNA-target gene networks were significantly higher than in random gene-sets for the majority of traits. Furthermore, most between-trait and across-breed correlations of enrichments with miRNA-target networks were significantly greater than with random gene-sets, suggesting pleiotropic effects of miRNAs. Intriguingly, genes that were differentially expressed in response to mammary gland infections were significantly enriched in the miRNA-target networks associated with mastitis. All these findings were consistent across three breeds. Collectively, our observations demonstrate the importance of miRNAs and their targets for the expression of complex traits.
New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

PubMed Central

2011-01-01

Background Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
Conclusions The construction of the first switchgrass BAC library and comparative analysis of homoeologous regions harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy. PMID:21767393
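The roughly 10-fold coverage reported for this library can be checked from the clone count, insert rate, mean insert size and effective genome size given in the abstract. The sketch below assumes that only insert-containing clones contribute to coverage, and it adds the standard Clarke-Carbon estimate of the probability that any given locus is present in the library; neither choice is stated in the abstract, and the function names are illustrative.

```python
def library_coverage(n_clones: float, insert_fraction: float,
                     mean_insert_bp: float, genome_bp: float) -> float:
    """Genome equivalents represented by the insert-containing clones."""
    return n_clones * insert_fraction * mean_insert_bp / genome_bp


def prob_locus_present(n_clones: float, mean_insert_bp: float,
                       genome_bp: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - I/G)^N."""
    return 1.0 - (1.0 - mean_insert_bp / genome_bp) ** n_clones


# Values from the abstract.
N_CLONES = 147_456
INSERT_FRACTION = 0.95        # 95% of sampled clones contained inserts
MEAN_INSERT_BP = 120_000      # 120 kb average insert
EFFECTIVE_GENOME_BP = 1.6e9   # ~1.6 Gb effective genome

coverage = library_coverage(N_CLONES, INSERT_FRACTION,
                            MEAN_INSERT_BP, EFFECTIVE_GENOME_BP)
p_present = prob_locus_present(N_CLONES * INSERT_FRACTION,
                               MEAN_INSERT_BP, EFFECTIVE_GENOME_BP)

print(f"coverage: {coverage:.1f}x")
print(f"probability a given locus is represented: {p_present:.5f}")
```

Under these assumptions the coverage is about 10.5 genome equivalents, consistent with the "approximately 10 times" stated above.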
New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits.

PubMed

Saski, Christopher A; Li, Zhigang; Feltus, Frank A; Luo, Hong

2011-07-18

Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
The construction of the first switchgrass BAC library and comparative analysis of homoeologous regions harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy.
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)

PubMed Central

Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto

2017-01-01

Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both the concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants for this satDNA showed high differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
Engineering improved bio-jet fuel tolerance in Escherichia coli using a transgenic library from the hydrocarbon-degrader Marinobacter aquaeolei.

PubMed

Tomko, Timothy A; Dunlop, Mary J

2015-01-01

Recent metabolic engineering efforts have generated microorganisms that can produce biofuels, including bio-jet fuels, however these fuels are often toxic to cells, limiting production yields. There are natural examples of microorganisms that have evolved mechanisms for tolerating hydrocarbon-rich environments, such as those that thrive near natural oil seeps and in oil-polluted waters. Using genomic DNA from the hydrocarbon-degrading microbe Marinobacter aquaeolei, we constructed a transgenic library that we expressed in Escherichia coli. We exposed cells to inhibitory levels of pinene, a monoterpene that can serve as a jet fuel precursor with chemical properties similar to existing tactical fuels. Using a sequential strategy with a fosmid library followed by a plasmid library, we were able to isolate a region of DNA from the M. aquaeolei genome that conferred pinene tolerance when expressed in E. coli. We determined that a single gene, yceI, was responsible for the tolerance improvements. Overexpression of this gene placed no additional burden on the host. We also tested tolerance to other monoterpenes and showed that yceI selectively improves tolerance. The genomes of hydrocarbon-tolerant microbes represent a rich resource for tolerance engineering. Using a transgenic library, we were able to identify a single gene that improves E. coli's tolerance to the bio-jet fuel precursor pinene.
BAC Libraries from Wheat Chromosome 7D – Efficient Tool for Positional Cloning of Aphid Resistance Genes

USDA-ARS's Scientific Manuscript database

Positional cloning in bread wheat is a tedious task due to its huge genome size (~17 Gbp) and polyploid character. BAC libraries represent an essential tool for positional cloning. However, wheat BAC libraries comprise more than a million clones, which makes their screening very laborious. Here we pres...
Genome sequences of wild and domestic bactrian camels

PubMed Central

Jirimutu; Wang, Zhen; Ding, Guohui; Chen, Gangliang; Sun, Yamin; Sun, Zhihong; Zhang, Heping; Wang, Lei; Hasi, Surong; Zhang, Yan; Li, Jianmei; Shi, Yixiang; Xu, Ze; He, Chuan; Yu, Siriguleng; Li, Shengdi; Zhang, Wenbin; Batmunkh, Mijiddorj; Ts, Batsukh; Narenbatu; Unierhu; Bat-Ireedui, Shirzana; Gao, Hongwei; Baysgalan, Banzragch; Li, Qing; Jia, Zhiling; Turigenbayila; Subudenggerile; Narenmanduhu; Wang, Zhaoxia; Wang, Juan; Pan, Lei; Chen, Yongcan; Ganerdene, Yaichil; Dabxilt; Erdemt; Altansha; Altansukh; Liu, Tuya; Cao, Minhui; Aruuntsever; Bayart; Hosblig; He, Fei; Zha-ti, A; Zheng, Guangyong; Qiu, Feng; Sun, Zikui; Zhao, Lele; Zhao, Wenjing; Liu, Baohong; Li, Chao; Chen, Yunqin; Tang, Xiaoyan; Guo, Chunyan; Liu, Wei; Ming, Liang; Temuulen; Cui, Aiying; Li, Yi; Gao, Junhui; Li, Jing; Wurentaodi; Niu, Shen; Sun, Tao; Zhai, Zhengxiao; Zhang, Min; Chen, Chen; Baldan, Tunteg; Bayaer, Tuman; Li, Yixue; Meng, He

2012-01-01

Bactrian camels serve as an important means of transportation in the cold desert regions of China and Mongolia. Here we present a 2.01 Gb draft genome sequence from both a wild and a domestic bactrian camel. We estimate the camel genome to be 2.38 Gb, containing 20,821 protein-coding genes. Our phylogenomics analysis reveals that camels shared common ancestors with other even-toed ungulates about 55–60 million years ago. Rapidly evolving genes in the camel lineage are significantly enriched in metabolic pathways, and these changes may underlie the insulin resistance typically observed in these animals. We estimate the genome-wide heterozygosity rates in both wild and domestic camels to be 1.0 × 10⁻³. However, genomic regions with significantly lower heterozygosity are found in the domestic camel, and olfactory receptors are enriched in these regions. Our comparative genomics analyses may also shed light on the genetic basis of the camel's remarkable salt tolerance and unusual immune system. PMID:23149746
Negative Enrichment and Isolation of Circulating Tumor Cells for Whole Genome Amplification.

PubMed

Kanwar, Nisha; Done, Susan J

2017-01-01

Circulating tumor cells (CTCs) are a rare population of cells found in the peripheral blood of patients with many types of cancer such as breast, prostate, colon, and lung cancers. Higher numbers of these cells in blood are associated with a poorer prognosis of patients. Genomic profiling of CTCs would help characterize markers specific for the identification of these cells in blood, and also define genomic alterations that give these cells a metastatic advantage over other cells in the primary tumor. Here, we describe an immunomagnetic method to enrich CTCs from the blood of patients with breast cancer, followed by single-cell laser capture microdissection to isolate single CTCs. Whole genome amplification of isolated CTCs allows for many downstream applications to be performed to aid in their characterization, such as whole genome or exome sequencing, Single Nucleotide Polymorphism (SNP) and copy number analysis, and targeted sequencing or quantitative Polymerase Chain Reaction (qPCR) for genomic analyses.
Development of microbial genome-probing microarrays using digital multiple displacement amplification of uncultivated microbial single cells.

PubMed

Chang, Ho-Won; Sung, Youlboong; Kim, Kyoung-Ho; Nam, Young-Do; Roh, Seong Woon; Kim, Min-Soo; Jeon, Che Ok; Bae, Jin-Woo

2008-08-15

A crucial problem in the use of previously developed genome-probing microarrays (GPM) has been the inability to use uncultivated bacterial genomes to take advantage of the high sensitivity and specificity of GPM in microbial detection and monitoring. We show here a method, digital multiple displacement amplification (MDA), to amplify and analyze various genomes obtained from single uncultivated bacterial cells. We used 15 genomes from key microbes involved in dichloromethane (DCM)-dechlorinating enrichment as microarray probes to uncover the bacterial population dynamics of samples without PCR amplification. Genomic DNA amplified from single cells originated from uncultured bacteria with 80.3-99.4% similarity to 16S rRNA genes of cultivated bacteria. The digital MDA-GPM method successfully monitored the dynamics of DCM-dechlorinating communities from different phases of enrichment status. Without a priori knowledge of microbial diversity, the digital MDA-GPM method could be designed to monitor most microbial populations in a given environmental sample.
Cross-species bacterial artificial chromosome (BAC) library screening via overgo-based hybridization and BAC-contig mapping of a yield enhancement quantitative trait locus (QTL) yld1.1 in the Malaysian wild rice Oryza rufipogon.

PubMed

Song, Beng-Kah; Nadarajah, Kalaivani; Romanov, Michael N; Ratnam, Wickneswari

2005-01-01

The construction of BAC-contig physical maps is an important step towards a partial or ultimate genome sequence analysis. Here, we describe our initial efforts to apply an overgo approach to screen a BAC library of the Malaysian wild rice species, Oryza rufipogon. Overgo design is based on repetitive element masking and sequence uniqueness, and uses short probes (approximately 40 bp), making this method highly efficient and specific. Pairs of 24-bp oligos that contain an 8-bp overlap were developed from the publicly available genomic sequences of the cultivated rice, O. sativa, to generate 20 overgo probes for a 1-Mb region that encompasses a yield enhancement QTL yld1.1 in O. rufipogon. The advantages of a high similarity in melting temperature, hybridization kinetics and specific activities of overgos further enabled a pooling strategy for library screening by filter hybridization. Two pools of ten overgos each were hybridized to high-density filters representing the O. rufipogon genomic BAC library. These screening tests succeeded in providing 69 PCR-verified positive hits from a total of 23,040 BAC clones of the entire O. rufipogon library. A minimal tiling path of clones was generated to contribute to a fully covered BAC-contig map of the targeted 1-Mb region. The developed protocol for overgo design based on O. sativa sequences as a comparative genomic framework, and the pooled overgo hybridization screening technique are suitable means for high-resolution physical mapping and the identification of BAC candidates for sequencing.
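The overgo geometry described above (two 24-bp oligos sharing an 8-bp overlap, giving a 40-bp probe) can be illustrated with a short Python sketch. It assumes the usual layout in which the forward oligo is the first 24 bases of the target and the reverse oligo is the reverse complement of the last 24 bases, so that the two anneal at their 3' ends. The target sequence is made up for illustration and is not from the study, and the repeat-masking and uniqueness checks mentioned in the abstract are not modelled.

```python
from typing import Tuple

# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_overgo(target: str, oligo_len: int = 24,
                  overlap: int = 8) -> Tuple[str, str]:
    """Split a target into two oligos whose 3' ends overlap by `overlap` bases."""
    probe_len = 2 * oligo_len - overlap  # 24 + 24 - 8 = 40
    target = target.upper()
    if len(target) != probe_len:
        raise ValueError(f"target must be exactly {probe_len} bp")
    forward = target[:oligo_len]                       # top strand, 5'->3'
    reverse = reverse_complement(target[-oligo_len:])  # bottom strand, 5'->3'
    return forward, reverse


# Hypothetical 40-bp target region (illustrative only).
TARGET = "ATGCGTACCTGAAGCTTGGCATCGATTACGGCTAAGTCCA"

forward, reverse = design_overgo(TARGET)
print("forward oligo:", forward)
print("reverse oligo:", reverse)
```

Filling in the single-stranded overhangs of the annealed pair would regenerate the full 40-bp double-stranded probe; the sketch only derives the two oligo sequences.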
Placental Hypomethylation Is More Pronounced in Genomic Loci Devoid of Retroelements

PubMed Central

Chatterjee, Aniruddha; Macaulay, Erin C.; Rodger, Euan J.; Stockwell, Peter A.; Parry, Matthew F.; Roberts, Hester E.; Slatter, Tania L.; Hung, Noelyn A.; Devenish, Celia J.; Morison, Ian M.

2016-01-01

The human placenta is hypomethylated compared to somatic tissues. However, the degree and specificity of placental hypomethylation across the genome is unclear. We assessed genome-wide methylation of the human placenta and compared it to that of the neutrophil, a representative homogeneous somatic cell. We observed global hypomethylation in placenta (relative reduction of 22%) compared to neutrophils. Placental hypomethylation was pronounced in intergenic regions and gene bodies, while the unmethylated state of the promoter remained conserved in both tissues. For every class of repeat elements, the placenta showed lower methylation but the degree of hypomethylation differed substantially between these classes. However, some retroelements, especially the evolutionarily younger Alu elements, retained high levels of placental methylation. Surprisingly, nonretrotransposon-containing sequences showed a greater degree of placental hypomethylation than retrotransposons in every genomic element (intergenic, introns, and exons) except promoters. The differentially methylated fragments (DMFs) in placenta and neutrophils were enriched in gene-poor and CpG-poor regions. The placentally hypomethylated DMFs were enriched in genomic regions that are usually inactive, whereas hypermethylated DMFs were enriched in active regions. Hypomethylation of the human placenta is not specific to retroelements, indicating that the evolutionary advantages of placental hypomethylation go beyond those provided by expression of retrotransposons and retrogenes. PMID:27172225
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation.

PubMed

Mourad, Raphaël; Cuvier, Olivier

2016-05-01

Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1.
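To make the modelling idea concrete, here is a minimal pure-Python sketch of multiple logistic regression with an interaction term, fitted by gradient ascent on grouped counts. It is not the authors' implementation: the two binary features, the counts and the fitting routine are all hypothetical, chosen only to show how one model can estimate a positive effect, a negative effect and an interaction at the same time.

```python
import math

# Hypothetical grouped data: (feature_a present, feature_b present,
# genomic bins that are domain borders, total bins). Not real Hi-C data.
CELLS = [
    (0, 0, 100, 1000),
    (1, 0, 500, 1000),
    (0, 1, 30, 1000),
    (1, 1, 400, 1000),
]


def design_row(a: int, b: int) -> list:
    """Intercept, two main effects and their interaction."""
    return [1.0, float(a), float(b), float(a * b)]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(cells, lr: float = 1.0, n_iter: int = 50_000) -> list:
    """Maximise the binomial log-likelihood by plain gradient ascent."""
    total = sum(n for _, _, _, n in cells)
    beta = [0.0] * 4
    for _ in range(n_iter):
        grad = [0.0] * 4
        for a, b, k, n in cells:
            x = design_row(a, b)
            p = sigmoid(sum(bj * xj for bj, xj in zip(beta, x)))
            resid = k - n * p  # observed minus expected border count
            for j in range(4):
                grad[j] += resid * x[j]
        beta = [bj + lr * gj / total for bj, gj in zip(beta, grad)]
    return beta


beta = fit_logistic(CELLS)
for name, value in zip(["intercept", "a", "b", "a:b"], beta):
    print(f"{name:>9}: {value:+.3f}")
```

In this toy data set a positive coefficient for `a` marks a feature that favours borders and a negative coefficient for `b` marks one that counteracts them, while the `a:b` term captures the statistical interaction between the two features.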
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation

PubMed Central

Mourad, Raphaël; Cuvier, Olivier

2016-01-01

Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1. PMID:27203237
Quantitative screening of yeast surface-displayed polypeptide libraries by magnetic bead capture.

PubMed

Yeung, Yik A; Wittrup, K Dane

2002-01-01

Magnetic bead capture is demonstrated here to be a feasible alternative for quantitative screening of favorable mutants from a cell-displayed polypeptide library. Flow cytometric sorting with fluorescent probes has been employed previously for high throughput screening for either novel binders or improved mutants. However, many laboratories do not have ready access to this technology as a result of the limited availability and high cost of cytometers, restricting the use of cell-displayed libraries. Using streptavidin-coated magnetic beads and biotinylated ligands, an alternative approach to cell-based library screening for improved mutants was developed. Magnetic bead capture probability of labeled cells is shown to be closely correlated with the surface ligand density. A single-pass enrichment ratio of 9400 ± 1800-fold, at the expense of 85 ± 6% binder losses, is achieved from screening a library that contains one antibody-displaying cell (binder) in 1.1 × 10^5 nondisplaying cells. Additionally, in kinetic screening with an initial high affinity to low affinity (7.7-fold lower) mutant ratio of 1:95,000, the magnetic bead capture method attains a single-pass enrichment ratio of 600 ± 200-fold with a 75 ± 24% probability of loss for the higher affinity mutant. The observed high loss probabilities can be straightforwardly compensated for by library oversampling, given the inherently parallel nature of the screen. Overall, these results demonstrate that magnetic beads are capable of quantitatively screening for novel binders and improved mutants. The described methods are directly analogous to procedures in common use for phage display and should lower the barriers to entry for use of cell surface display libraries.
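The remark that high loss probabilities can be offset by oversampling can be made quantitative under one simplifying assumption that the abstract does not state: each copy of a given clone is lost independently with the reported probability. The Python sketch below then estimates how many copies of a clone must enter the screen for at least one to be recovered with a chosen confidence; the 95% target and the function names are illustrative.

```python
import math


def recovery_probability(loss_prob: float, copies: int) -> float:
    """Chance that at least one of `copies` independent cells is captured."""
    return 1.0 - loss_prob ** copies


def copies_needed(loss_prob: float, target_recovery: float) -> int:
    """Smallest number of copies giving at least `target_recovery`."""
    return math.ceil(math.log(1.0 - target_recovery) / math.log(loss_prob))


# Mean loss probabilities quoted in the abstract (85% and 75%).
for loss in (0.85, 0.75):
    k = copies_needed(loss, 0.95)
    p = recovery_probability(loss, k)
    print(f"loss {loss:.2f}: {k} copies give {p:.3f} recovery")
```

Under the independence assumption, on the order of ten to twenty copies per clone would be enough to make loss of a binder unlikely, which is modest given that the capture step handles all cells in parallel.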
Automated recycling of chemistry for virtual screening and library design.

PubMed

Vainio, Mikko J; Kogej, Thierry; Raubacher, Florian

2012-07-23

An early stage drug discovery project needs to identify a number of chemically diverse and attractive compounds. These hit compounds are typically found through high-throughput screening campaigns. The diversity of the chemical libraries used in screening is therefore important. In this study, we describe a virtual high-throughput screening system called Virtual Library. The system automatically "recycles" validated synthetic protocols and available starting materials to generate a large number of virtual compound libraries, and allows for fast searches in the generated libraries using a 2D fingerprint based screening method. Virtual Library links the returned virtual hit compounds back to experimental protocols to quickly assess the synthetic accessibility of the hits. The system can be used as an idea generator for library design to enrich the screening collection and to explore the structure-activity landscape around a specific active compound.
GenomeVista

DOE Office of Scientific and Technical Information (OSTI.GOV)

Poliakov, Alexander; Couronne, Olivier

2002-11-04

Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
Scribl: an HTML5 Canvas-based graphics library for visualizing genomic data over the web.

PubMed

Miller, Chase A; Anthony, Jon; Meyer, Michelle M; Marth, Gabor

2013-02-01

High-throughput biological research requires simultaneous visualization as well as analysis of genomic data, e.g. read alignments, variant calls and genomic annotations. Traditionally, such integrative analysis required desktop applications operating on locally stored data. Many current terabyte-size datasets generated by large public consortia projects, however, are already only feasibly stored at specialist genome analysis centers. As even small laboratories can afford very large datasets, local storage and analysis are becoming increasingly limiting, and it is likely that most such datasets will soon be stored remotely, e.g. in the cloud. These developments will require web-based tools that enable users to access, analyze and view vast remotely stored data with a level of sophistication and interactivity that approximates desktop applications. As rapidly dropping cost enables researchers to collect data intended to answer questions in very specialized contexts, developers must also provide software libraries that empower users to implement customized data analyses and data views for their particular application. Such specialized, yet lightweight, applications would empower scientists to better answer specific biological questions than possible with general-purpose genome browsers currently available. Using recent advances in core web technologies (HTML5), we developed Scribl, a flexible genomic visualization library specifically targeting coordinate-based data such as genomic features, DNA sequence and genetic variants. Scribl simplifies the development of sophisticated web-based graphical tools that approach the dynamism and interactivity of desktop applications. Software is freely available online at http://chmille4.github.com/Scribl/ and is implemented in JavaScript with all modern browsers supported.

Bioprospecting for Genes that Confer Biofuel Tolerance to Escherichia Coli Using a Genomic Library Approach

NASA Astrophysics Data System (ADS)

Tomko, Timothy

Microorganisms are capable of producing advanced biofuels that can be used as 'drop-in' alternatives to conventional liquid fuels. However, vital physiological processes and membrane properties are often disrupted by the presence of biofuel and limit the production yields. In order to make microbial biofuels a competitive fuel source, finding mechanisms for improving resistance to the toxic effects of biofuel production is vital. This investigation aims to identify resistance mechanisms from microorganisms that have evolved to withstand hydrocarbon-rich environments, such as those that thrive near natural oil seeps and in oil-polluted waters. First, using genomic DNA from Marinobacter aquaeolei, we constructed a transgenic library that we expressed in Escherichia coli. We exposed cells to inhibitory levels of pinene, a monoterpene that can serve as a jet fuel precursor with chemical properties similar to existing tactical fuels. Using a sequential strategy of a fosmid library followed by a plasmid library, we were able to isolate a region of DNA from the M. aquaeolei genome that conferred pinene tolerance when expressed in E. coli. We determined that a single gene, yceI, was responsible for the tolerance improvements. Overexpression of this gene placed no additional burden on the host. We also tested tolerance to other monoterpenes and showed that yceI selectively improves tolerance. Additionally, we used genomic DNA from Pseudomonas putida KT2440, which has innate solvent-tolerance properties, to create transgenic libraries in an E. coli host. We exposed cells containing the library to pinene, selecting for genes that improved tolerance. Importantly, we found that expressing the sigma factor RpoD from P. putida greatly expanded the diversity of tolerance genes recovered. With low expression of P. putida rpoD, we isolated a single pinene tolerance gene; with increased expression of the sigma factor our selection experiments returned multiple distinct tolerance mechanisms, including some that have been previously documented and also new mechanisms. Interestingly, high levels of P. putida rpoD induction resulted in decreased diversity. We found that the tolerance levels provided by some genes are highly sensitive to the level of induction of P. putida rpoD, while others provide tolerance across a wide range of P. putida rpoD levels. This method for unlocking diversity in tolerance screening using heterologous sigma factor expression was applicable to both plasmid and fosmid-based transgenic libraries. These results suggest that by controlling the expression of appropriate heterologous sigma factors, we can greatly increase the searchable genomic space within transgenic libraries. This dissertation describes a method of effectively screening genomic DNA from multiple organisms for genes to mitigate biofuel stress and shows how tolerance genes can improve bacterial growth in the presence of toxic biofuel compounds. These identified genes can be targeted in future studies as candidates for use in biofuel production strains to increase biofuel yields.
Inclusion of Population-specific Reference Panel from India to the 1000 Genomes Phase 3 Panel Improves Imputation Accuracy.

PubMed

Ahmad, Meraj; Sinha, Anubhav; Ghosh, Sreya; Kumar, Vikrant; Davila, Sonia; Yajnik, Chittaranjan S; Chandak, Giriraj R

2017-07-27

Imputation is a computational method based on the principle of haplotype sharing that allows enrichment of genome-wide association study datasets. It depends on the haplotype structure of the population and the density of the genotype data. The 1000 Genomes Project led to the generation of imputation reference panels which have been used globally. However, recent studies have shown that population-specific panels provide better enrichment of genome-wide variants. We compared the imputation accuracy using the 1000 Genomes phase 3 reference panel and a panel generated from genome-wide data on 407 individuals from Western India (WIP). The concordance of imputed variants was cross-checked with next-generation re-sequencing data on a subset of genomic regions. Further, using the genome-wide data from 1880 individuals, we demonstrate that WIP works better than the 1000 Genomes phase 3 panel and, when merged with it, significantly improves the imputation accuracy throughout the minor allele frequency range. We also show that imputation using only the South Asian component of the 1000 Genomes phase 3 panel works as well as the merged panel, making it a computationally less intensive job. Thus, our study stresses that imputation accuracy using the 1000 Genomes phase 3 panel can be further improved by including population-specific reference panels from South Asia.
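The concordance check mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' pipeline: the function name, the dictionary-based genotype encoding (alternate-allele counts of 0, 1 or 2) and the toy calls are all assumptions made for illustration.

```python
def genotype_concordance(imputed, sequenced):
    """Fraction of shared variants where imputed and sequenced calls agree.

    Both arguments map a variant ID to an alternate-allele count (0, 1 or 2).
    Variants present in only one call set are ignored.
    """
    shared = sorted(set(imputed) & set(sequenced))
    if not shared:
        raise ValueError("no shared variants to compare")
    matches = sum(imputed[v] == sequenced[v] for v in shared)
    return matches / len(shared)


# Hypothetical toy genotypes, not data from the study.
imputed_calls = {"rs1": 0, "rs2": 1, "rs3": 2, "rs4": 1}
sequenced_calls = {"rs1": 0, "rs2": 1, "rs3": 1, "rs4": 1, "rs5": 2}
concordance = genotype_concordance(imputed_calls, sequenced_calls)
print(f"concordance = {concordance:.2f}")
```

In practice concordance is usually stratified by minor allele frequency, since rare variants tend to be imputed less accurately; this sketch reports a single overall value.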
Genome-wide transposon insertion scanning of environmental survival functions in the polycyclic aromatic hydrocarbon degrading bacterium Sphingomonas wittichii RW1.

PubMed

Roggo, Clémence; Coronado, Edith; Moreno-Forero, Silvia K; Harshman, Keith; Weber, Johann; van der Meer, Jan Roelof

2013-10-01

Sphingomonas wittichii RW1 is a dibenzofuran and dibenzodioxin-degrading bacterium with potentially interesting properties for bioaugmentation of contaminated sites. In order to understand the capacity of the microorganism to survive in the environment we used a genome-wide transposon scanning approach. RW1 transposon libraries were generated with around 22,000 independent insertions. Libraries were grown for an average of 50 generations (five successive passages in batch liquid medium) with salicylate as sole carbon and energy source in the presence or absence of salt stress at -1.5 MPa. Alternatively, libraries were grown in sand with salicylate, at 50% water holding capacity, for 4 and 10 days (equivalent to 7 generations). Library DNA was recovered from the different growth conditions and scanned by ultrahigh throughput sequencing for the positions and numbers of the inserted transposed kanamycin resistance gene. No transposon reads were recovered in 579 genes (10% of all annotated genes in the RW1 genome) in any of the libraries, suggesting those to be essential for survival under the conditions used. Libraries recovered from sand differed strongly from those incubated in liquid batch medium. In particular, important functions for survival of cells in sand in the short term concerned nutrient scavenging, energy metabolism and motility. In contrast, fatty acid metabolism and oxidative stress response were essential for longer term survival of cells in sand. Comparison to transcriptome data suggested important functions in sand for flagellar movement, pili synthesis, trehalose and polysaccharide synthesis and putative cell surface antigen proteins. Interestingly, a variety of genes were also identified whose interruption caused a significant increase in fitness during growth on salicylate. One of these was an Lrp family transcription regulator, and mutants in this gene covered more than 90% of the total library after 50 generations of growth on salicylate. 
Our results demonstrate the power of genome-wide transposon scanning approaches for analysis of complex traits. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Genomics of compositae weeds: EST libraries, microarrays, and evidence of introgression

USDA-ARS's Scientific Manuscript database

• Premise of Study: Weeds cause considerable environmental and economic damage. However, genomic characterization of weeds has lagged behind that of model plants and crop species. Here we report on the development of genomic tools and resources for 11 weeds from the Compositae family that can serve ...
Genome-wide annotation of mutations in a phenotyped mutant library provides an efficient platform for discovery of causal gene mutations

USDA-ARS's Scientific Manuscript database

Ethyl methanesulfonate (EMS) efficiently generates high-density mutations in genomes. Conventionally, these mutations are identified by techniques that can detect single-nucleotide mismatches in heteroduplexes of individual PCR amplicons. We applied whole-genome sequencing to 256-phenotyped mutant l...
78 FR 55752 - National Human Genome Research Institute; Notice of Closed Meetings

Federal Register 2010, 2011, 2012, 2013, 2014

2013-09-11

... applications. Place: National Human Genome Research Institute, 4th Floor Library, 5635 Fishers Lane, Rockville... Research Institute; Notice of Closed Meetings Pursuant to section 10(d) of the Federal Advisory Committee... clearly unwarranted invasion of personal privacy. Name of Committee: National Human Genome Research...
Genome-centric metatranscriptomes and ecological roles of the active microbial populations during cellulosic biomass anaerobic digestion.

PubMed

Jia, Yangyang; Ng, Siu-Kin; Lu, Hongyuan; Cai, Mingwei; Lee, Patrick K H

2018-01-01

Although anaerobic digestion for biogas production is used worldwide in treatment processes to recover energy from carbon-rich waste such as cellulosic biomass, the activities and interactions among the microbial populations that perform anaerobic digestion deserve further investigation, especially at the population genome level. To understand the cellulosic biomass-degrading potentials in two full-scale digesters, this study examined five methanogenic enrichment cultures derived from the digesters that anaerobically digested cellulose or xylan for more than 2 years under 35 or 55 °C conditions. Metagenomics and metatranscriptomics were used to capture the active microbial populations in each enrichment culture and reconstruct their meta-metabolic network and ecological roles. 107 population genomes were reconstructed from the five enrichment cultures using a differential coverage binning approach, of which only a subset was highly transcribed in the metatranscriptomes. Phylogenetic and functional convergence of communities by enrichment condition and phase of fermentation was observed for the highly transcribed populations in the metatranscriptomes. In the 35 °C cultures grown on cellulose, Clostridium cellulolyticum-related and Ruminococcus-related bacteria were identified as major hydrolyzers and primary fermenters in the early growth phase, while Clostridium leptum-related bacteria were major secondary fermenters and potential fatty acid scavengers in the late growth phase. While the meta-metabolism and trophic roles of the cultures were similar, the bacterial populations performing each function were distinct between the enrichment conditions. Overall, a population genome-centric view of the meta-metabolism and functional roles of key active players in anaerobic digestion of cellulosic biomass was obtained. 
This study represents a major step forward towards understanding the microbial functions and interactions at population genome level during the microbial conversion of lignocellulosic biomass to methane. The knowledge of this study can facilitate development of potential biomarkers and rational design of the microbiome in anaerobic digesters.
Modeling genome coverage in single-cell sequencing

PubMed Central

Daley, Timothy; Smith, Andrew D.

2014-01-01

Motivation: Single-cell DNA sequencing is necessary for examining genetic variation at the cellular level, which remains hidden in bulk sequencing experiments. However, because such experiments begin with very small amounts of starting material, the amount of information obtained from a single-cell sequencing experiment is highly sensitive to the choice of protocol employed and to variability in library preparation. In particular, the fraction of the genome represented in single-cell sequencing libraries exhibits extreme variability due to quantitative biases in amplification and loss of genetic material. Results: We propose a method to predict the genome coverage of a deep sequencing experiment using information from an initial shallow sequencing experiment mapped to a reference genome. The observed coverage statistics are used in a non-parametric empirical Bayes Poisson model to estimate the gain in coverage from deeper sequencing. This approach allows researchers to know statistical features of deep sequencing experiments without actually sequencing deeply, providing a basis for optimizing and comparing single-cell sequencing protocols or screening libraries. Availability and implementation: The method is available as part of the preseq software package. Source code is available at http://smithlabresearch.org/preseq. Contact: andrewds@usc.edu Supplementary information: Supplementary material is available at Bioinformatics online. PMID:25107873
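For orientation, the question "how much more of the genome would deeper sequencing cover?" has a simple textbook baseline. The sketch below is not the non-parametric empirical Bayes model described in the abstract, nor the preseq implementation. It is the classical Lander-Waterman approximation, which assumes reads land uniformly at random. Single-cell libraries violate that assumption because of amplification bias and material loss, which is why a more flexible model is needed. The function names and numbers are illustrative assumptions.

```python
import math


def expected_fraction_covered(n_reads, read_length, genome_size):
    """Expected fraction of the genome covered at least once.

    Assumes uniform random read placement, so per-base depth is Poisson
    with mean c = n_reads * read_length / genome_size and the covered
    fraction is 1 - exp(-c).
    """
    depth = n_reads * read_length / genome_size
    return 1.0 - math.exp(-depth)


def expected_coverage_gain(shallow_reads, deep_reads, read_length, genome_size):
    """Extra fraction of the genome expected when sequencing more deeply."""
    shallow = expected_fraction_covered(shallow_reads, read_length, genome_size)
    deep = expected_fraction_covered(deep_reads, read_length, genome_size)
    return deep - shallow


# Hypothetical experiment: 100 bp reads on a 3 Gb genome.
gain = expected_coverage_gain(1_000_000, 30_000_000, 100, 3_000_000_000)
print(f"expected gain in covered fraction: {gain:.3f}")
```

A real single-cell library will typically fall short of this idealized curve, and the size of that shortfall is one way to see how biased the library is.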
Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

PubMed

Aigrain, Louise; Gu, Yong; Quail, Michael A

2016-06-13

The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing, both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications, the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with those combining several steps into a single one exhibiting final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.
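The abstract above does not spell out how droplet counts become yields, so the following is only a sketch of the standard digital PCR calculation. The mean number of template copies per droplet is recovered from the fraction of positive droplets with a Poisson correction, and a step yield is the ratio of copies after a step to copies before it. The function names and droplet counts are hypothetical, and the two measurements are assumed to be taken at the same dilution.

```python
import math


def copies_per_droplet(positive_droplets, total_droplets):
    """Mean template copies per droplet from the positive-droplet fraction.

    Uses the Poisson relation lambda = -ln(1 - p). The estimate is undefined
    when every droplet is positive, so that case is rejected.
    """
    if total_droplets <= 0 or not 0 <= positive_droplets < total_droplets:
        raise ValueError("need 0 <= positive_droplets < total_droplets")
    return -math.log(1.0 - positive_droplets / total_droplets)


def step_yield(copies_before, copies_after):
    """Fraction of molecules retained across one protocol step."""
    if copies_before <= 0:
        raise ValueError("copies_before must be positive")
    return copies_after / copies_before


# Hypothetical droplet counts, not data from the study.
before = copies_per_droplet(10_000, 20_000)  # fragments before ligation
after = copies_per_droplet(2_000, 20_000)    # adaptor-ligated fragments
print(f"ligation yield = {step_yield(before, after):.3f}")
```

Converting copies per droplet to a concentration additionally requires the droplet volume, which is instrument-specific and therefore left out here.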
Development of ORIGEN Libraries for Mixed Oxide (MOX) Fuel Assembly Designs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mertyurek, Ugur; Gauld, Ian C.

In this research, ORIGEN cross section libraries for reactor-grade mixed oxide (MOX) fuel assembly designs have been developed to provide fast and accurate depletion calculations to predict nuclide inventories, radiation sources and thermal decay heat information needed in safety evaluations and safeguards verification measurements of spent nuclear fuel. These ORIGEN libraries are generated using two-dimensional lattice physics assembly models that include enrichment zoning and cross section data based on ENDF/B-VII.0 evaluations. Using the SCALE depletion sequence, burnup-dependent cross sections are created for selected commercial reactor assembly designs and a representative range of reactor operating conditions, fuel enrichments, and fuel burnup. The burnup-dependent cross sections are then interpolated to provide problem-dependent cross sections for ORIGEN, avoiding the need for time-consuming lattice physics calculations. The ORIGEN libraries for MOX assembly designs are validated against destructive radiochemical assay measurements of MOX fuel from the MALIBU international experimental program. This program included measurements of MOX fuel from a 15 × 15 pressurized water reactor assembly and a 9 × 9 boiling water reactor assembly. The ORIGEN MOX libraries are also compared against detailed assembly calculations from the Phase IV-B numerical MOX fuel burnup credit benchmark coordinated by the Nuclear Energy Agency within the Organization for Economic Cooperation and Development. Finally, the nuclide compositions calculated by ORIGEN using the MOX libraries are shown to be in good agreement with other physics codes and with experimental data.
Development of ORIGEN Libraries for Mixed Oxide (MOX) Fuel Assembly Designs

DOE PAGES

Mertyurek, Ugur; Gauld, Ian C.

2015-12-24

In this research, ORIGEN cross section libraries for reactor-grade mixed oxide (MOX) fuel assembly designs have been developed to provide fast and accurate depletion calculations to predict nuclide inventories, radiation sources and thermal decay heat information needed in safety evaluations and safeguards verification measurements of spent nuclear fuel. These ORIGEN libraries are generated using two-dimensional lattice physics assembly models that include enrichment zoning and cross section data based on ENDF/B-VII.0 evaluations. Using the SCALE depletion sequence, burnup-dependent cross sections are created for selected commercial reactor assembly designs and a representative range of reactor operating conditions, fuel enrichments, and fuel burnup. The burnup-dependent cross sections are then interpolated to provide problem-dependent cross sections for ORIGEN, avoiding the need for time-consuming lattice physics calculations. The ORIGEN libraries for MOX assembly designs are validated against destructive radiochemical assay measurements of MOX fuel from the MALIBU international experimental program. This program included measurements of MOX fuel from a 15 × 15 pressurized water reactor assembly and a 9 × 9 boiling water reactor assembly. The ORIGEN MOX libraries are also compared against detailed assembly calculations from the Phase IV-B numerical MOX fuel burnup credit benchmark coordinated by the Nuclear Energy Agency within the Organization for Economic Cooperation and Development. Finally, the nuclide compositions calculated by ORIGEN using the MOX libraries are shown to be in good agreement with other physics codes and with experimental data.
Always Feed the Clowns and Other Tips for Building Better Partnerships between School Librarians and Providers of Educational Programs

ERIC Educational Resources Information Center

Edwards, Jason

2015-01-01

Jason Edwards travels to schools and libraries across the nation performing educational enrichment programs, such as his Monster Hunt Library Skills-Building Adventure Program, for librarians and students. In this article, he shares tips that he has gleaned that may help librarian/programmer partnerships function more smoothly. Three of the…
PhysiomeSpace: digital library service for biomedical data

PubMed Central

Testi, Debora; Quadrani, Paolo; Viceconti, Marco

2010-01-01

Every research laboratory has a wealth of biomedical data locked up, which, if shared with other experts, could dramatically improve biomedical and healthcare research. With the PhysiomeSpace service, it is now possible with a few clicks to share with selected users biomedical data in an easy, controlled and safe way. The digital library service is managed using a client–server approach. The client application is used to import, fuse and enrich the data information according to the PhysiomeSpace resource ontology and upload/download the data to the library. The server services are hosted on the Biomed Town community portal, where through a web interface, the user can complete the metadata curation and share and/or publish the data resources. A search service capitalizes on the domain ontology and on the enrichment of metadata for each resource, providing a powerful discovery environment. Once the users have found the data resources they are interested in, they can add them to their basket, following a metaphor popular in e-commerce web sites. When all the necessary resources have been selected, the user can download the basket contents into the client application. The digital library service is now in beta and open to the biomedical research community. PMID:20478910
PhysiomeSpace: digital library service for biomedical data.

PubMed

Testi, Debora; Quadrani, Paolo; Viceconti, Marco

2010-06-28

Every research laboratory has a wealth of biomedical data locked up, which, if shared with other experts, could dramatically improve biomedical and healthcare research. With the PhysiomeSpace service, it is now possible with a few clicks to share with selected users biomedical data in an easy, controlled and safe way. The digital library service is managed using a client-server approach. The client application is used to import, fuse and enrich the data information according to the PhysiomeSpace resource ontology and upload/download the data to the library. The server services are hosted on the Biomed Town community portal, where through a web interface, the user can complete the metadata curation and share and/or publish the data resources. A search service capitalizes on the domain ontology and on the enrichment of metadata for each resource, providing a powerful discovery environment. Once the users have found the data resources they are interested in, they can add them to their basket, following a metaphor popular in e-commerce web sites. When all the necessary resources have been selected, the user can download the basket contents into the client application. The digital library service is now in beta and open to the biomedical research community.
SAGE analysis of early oogenesis in the silkworm, Bombyx mori.

PubMed

Funaguma, Shunsuke; Hashimoto, Shin-ichi; Suzuki, Yutaka; Omuro, Naoko; Sugano, Sumio; Mita, Kazuei; Katsuma, Susumu; Shimada, Toru

2007-02-01

To identify genes involved in the differentiation of the Bombyx cystoblast, we constructed two 3' long serial analysis of gene expression (Long SAGE) libraries from stage 1-3 or stage 2-3 egg chambers and compared their gene expression profiles. In both libraries, the most frequent tags were derived from the same novel transcript. The transcript does not have any open reading frame capable of encoding a protein of over 100 amino acids in length. RNA blot analysis revealed that this transcript is specifically and abundantly expressed in the Bombyx ovary, mainly in the germ line cells of the ovarioles. These results suggest that Bombyx oogenesis may be regulated by a previously unidentified non-coding RNA. Comparison of the gene expression profiles between the stage 1-3 and stage 2-3 egg chamber libraries revealed that 272 tags were significantly more abundant in stage 1-3 egg chambers than in stage 2-3 egg chambers (p<0.05 and at least two-fold change). Among the differentially expressed transcripts were the sequences that correspond to ATP synthase subunit d (3.1-fold enriched) and ATP synthase coupling factor 6 (9.1-fold enriched), suggesting that they are involved in regulation of the cell cycle of cystocytes.
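Fold enrichment between two SAGE libraries, as quoted in the abstract above, is typically a ratio of tag counts after normalizing each count by its library's total tag number. The sketch below shows only that normalization step. The tag counts and library sizes are invented for illustration, and the significance test behind the p<0.05 threshold is not reproduced because the abstract does not name it.

```python
def fold_enrichment(count_a, total_a, count_b, total_b):
    """Ratio of a tag's relative abundance in library A to that in library B.

    Each count is divided by its library's total tag count first, so that
    libraries of different sizes can be compared.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("library sizes must be positive")
    if count_b == 0:
        raise ValueError("tag absent from library B; fold change undefined")
    return (count_a / total_a) / (count_b / total_b)


# Hypothetical counts: 62 tags out of 20,000 versus 10 tags out of 10,000.
fold = fold_enrichment(62, 20_000, 10, 10_000)
print(f"fold enrichment = {fold:.1f}")
passes_threshold = fold >= 2.0  # the abstract's two-fold criterion
```

Tags with a zero count in one library need separate handling (for example a pseudocount), which this sketch deliberately rejects rather than guesses.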
Covalent antibody display—an in vitro antibody-DNA library selection system

PubMed Central

Reiersen, Herald; Løbersli, Inger; Løset, Geir Å.; Hvattum, Else; Simonsen, Bjørg; Stacy, John E.; McGregor, Duncan; FitzGerald, Kevin; Welschof, Martin; Brekke, Ole H.; Marvik, Ole J.

2005-01-01

The endonuclease P2A initiates the DNA replication of the bacteriophage P2 by making a covalent bond with its own phosphate backbone. This enzyme has now been exploited as a new in vitro display tool for antibody fragments. We have constructed genetic fusions of P2A with single-chain antibodies (scFvs). Linear DNA encoding these fusion proteins was processed in an in vitro coupled transcription–translation mixture of Escherichia coli S30 lysate. Complexes of scFv–P2A fusion proteins covalently bound to their own DNA were isolated after panning on immobilized antigen, and the enriched DNAs were recovered by PCR and prepared for the subsequent cycles of panning. We have demonstrated the enrichment of scFvs from spiked libraries and the specific selection of different anti-tetanus toxoid scFvs from a V-gene library with 50 million different members prepared from human lymphocytes. This covalent antibody display technology offers a complete in vitro selection system based exclusively on DNA–protein complexes. PMID:15653626
The Drosophila gene collection: Identification of putative full-length cDNAs for 70 percent of D. melanogaster genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stapleton, Mark; Liao, Guochun; Brokstein, Peter

2002-08-12

Collections of full-length nonredundant cDNA clones are critical reagents for functional genomics. The first step toward these resources is the generation and single-pass sequencing of cDNA libraries that contain a high proportion of full-length clones. The first release of the Drosophila Gene Collection (DGCr1) was produced from six libraries representing various tissues, developmental stages, and the cultured S2 cell line. Nearly 80,000 random 5' expressed sequence tags (ESTs) from these libraries were collapsed into a nonredundant set of 5849 cDNAs, corresponding to approximately 40 percent of the 13,474 predicted genes in Drosophila. To obtain cDNA clones representing the remaining genes, we have generated an additional 157,835 5' ESTs from two previously existing and three new libraries. One new library is derived from adult testis, a tissue we previously did not exploit for gene discovery; two new cap-trapped normalized libraries are derived from 0-22hr embryos and adult heads. Taking advantage of the annotated D. melanogaster genome sequence, we clustered the ESTs by aligning them to the genome. Clusters that overlap genes not already represented by cDNA clones in the DGCr1 were analyzed further, and putative full-length clones were selected for inclusion in the new DGC. This second release of the DGC (DGCr2) contains 5061 additional clones, extending the collection to 10,910 cDNAs representing >70 percent of the predicted genes in Drosophila.
From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

PubMed

Kwok, Hin; Chiang, Alan Kwok Shing

2016-02-24

Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways in which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert sizes and read lengths in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and the discovery of potentially pathogenic variants in specific diseases.
Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity

PubMed Central

Hurst, Gregory D.D.

2017-01-01

High throughput (or ‘next generation’) sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology and, through comparison of genomes, reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and ‘contaminating’ material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these ‘contaminations’ provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries from honey bee (Apis sp.) DNA sequencing projects for the presence of microbial sequences, and found sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that ‘contamination’ in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note that variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses. PMID:28717593
Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity.

PubMed

Gerth, Michael; Hurst, Gregory D D

2017-01-01

High throughput (or 'next generation') sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology and, through comparison of genomes, reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and 'contaminating' material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these 'contaminations' provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries from honey bee (Apis sp.) DNA sequencing projects for the presence of microbial sequences, and found sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that 'contamination' in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note that variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses.

Identification, characterization and distribution of transposable elements in the flax (Linum usitatissimum L.) genome.

PubMed

González, Leonardo Galindo; Deyholos, Michael K

2012-11-21

Flax (Linum usitatissimum L.) is an important crop for the production of bioproducts derived from its seed and stem fiber. Transposable elements (TEs) are widespread in plant genomes and are a key component of their evolution. The availability of a genome assembly of flax (Linum usitatissimum) affords new opportunities to explore the diversity of TEs and their relationship to genes and gene expression. Four de novo repeat identification algorithms (PILER, RepeatScout, LTR_finder and LTR_STRUC) were applied to the flax genome assembly. The resulting library of flax repeats was combined with the RepBase Viridiplantae division and used with RepeatMasker to identify TE coverage in the genome. LTR retrotransposons were the most abundant TEs (17.2% genome coverage), followed by Long Interspersed Nuclear Element (LINE) retrotransposons (2.10%) and Mutator DNA transposons (1.99%). Comparison of putative flax TEs to flax transcript databases indicated that TEs are not highly expressed in flax. However, the presence of recent insertions, defined by 100% intra-element LTR similarity, provided evidence for recent TE activity. Spatial analysis showed TE-rich regions, gene-rich regions as well as regions with similar gene and TE density. Monte Carlo simulations for the 71 largest scaffolds (≥ 1 Mb each) did not show any regional differences in the frequency of TE overlap with gene coding sequences. However, differences between TE superfamilies were found in their proximity to genes. Genes within TE-rich regions also appeared to have lower transcript expression, based on EST abundance. When LTR elements were compared, Copia showed more diversity, recent insertions and conserved domains than Gypsy, demonstrating their importance in genome evolution.
The calculated 23.06% TE coverage of the flax WGS assembly is at the low end of the range of TE coverages reported in other eudicots, although this estimate does not include TEs likely found in unassembled repetitive regions of the genome. Since enrichment for TEs in genomic regions was associated with reduced expression of neighbouring genes, and many members of the Copia LTR superfamily are inserted close to coding regions, we suggest Copia elements have a greater influence on recent flax genome evolution while Gypsy elements have become residual and highly mutated.
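The abstract above mentions Monte Carlo simulations of TE overlap with gene coding sequences but does not describe the procedure. The following is only a minimal sketch of one generic way such a test could be set up, assuming half-open (start, end) intervals on a single scaffold and uniform random re-placement of TEs; the function names, coordinates, and the placement model are hypothetical and are not taken from the study.

```python
import random

def count_overlaps(tes, genes):
    """Count TEs that overlap at least one gene (half-open intervals)."""
    return sum(
        1 for ts, te in tes
        if any(ts < ge and gs < te for gs, ge in genes)
    )

def monte_carlo_overlap(tes, genes, scaffold_len, n_iter=1000, seed=0):
    """Empirical p-value for observing at least this many TE/gene overlaps.

    Assumption: under the null model each TE keeps its length but is
    placed uniformly at random along the scaffold.
    """
    rng = random.Random(seed)
    observed = count_overlaps(tes, genes)
    at_least = 0
    for _ in range(n_iter):
        shuffled = []
        for ts, te in tes:
            length = te - ts
            start = rng.randint(0, scaffold_len - length)
            shuffled.append((start, start + length))
        if count_overlaps(shuffled, genes) >= observed:
            at_least += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (at_least + 1) / (n_iter + 1)

# Hypothetical coordinates, for illustration only.
tes = [(100, 200), (5000, 5100)]
genes = [(150, 400), (9000, 9500)]
observed, p_value = monte_carlo_overlap(tes, genes, scaffold_len=10_000)
print(observed, p_value)
```

A real analysis would likely repeat this per scaffold and might constrain placements (for example, forbidding TE-on-TE overlap), which this sketch does not attempt.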
Identification, characterization and distribution of transposable elements in the flax (Linum usitatissimum L.) genome

PubMed Central

2012-01-01

Background Flax (Linum usitatissimum L.) is an important crop for the production of bioproducts derived from its seed and stem fiber. Transposable elements (TEs) are widespread in plant genomes and are a key component of their evolution. The availability of a genome assembly of flax (Linum usitatissimum) affords new opportunities to explore the diversity of TEs and their relationship to genes and gene expression. Results Four de novo repeat identification algorithms (PILER, RepeatScout, LTR_finder and LTR_STRUC) were applied to the flax genome assembly. The resulting library of flax repeats was combined with the RepBase Viridiplantae division and used with RepeatMasker to identify TE coverage in the genome. LTR retrotransposons were the most abundant TEs (17.2% genome coverage), followed by Long Interspersed Nuclear Element (LINE) retrotransposons (2.10%) and Mutator DNA transposons (1.99%). Comparison of putative flax TEs to flax transcript databases indicated that TEs are not highly expressed in flax. However, the presence of recent insertions, defined by 100% intra-element LTR similarity, provided evidence for recent TE activity. Spatial analysis showed TE-rich regions, gene-rich regions as well as regions with similar gene and TE density. Monte Carlo simulations for the 71 largest scaffolds (≥ 1 Mb each) did not show any regional differences in the frequency of TE overlap with gene coding sequences. However, differences between TE superfamilies were found in their proximity to genes. Genes within TE-rich regions also appeared to have lower transcript expression, based on EST abundance. When LTR elements were compared, Copia showed more diversity, recent insertions and conserved domains than Gypsy, demonstrating their importance in genome evolution.
Conclusions The calculated 23.06% TE coverage of the flax WGS assembly is at the low end of the range of TE coverages reported in other eudicots, although this estimate does not include TEs likely found in unassembled repetitive regions of the genome. Since enrichment for TEs in genomic regions was associated with reduced expression of neighbouring genes, and many members of the Copia LTR superfamily are inserted close to coding regions, we suggest Copia elements have a greater influence on recent flax genome evolution while Gypsy elements have become residual and highly mutated. PMID:23171245
Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients

DTIC Science & Technology

2014-01-01

patients from our pilot set, we identified genomic variants (SNVs and CNVs to date) that are enriched in metastatic tumors. Aligned reads were used for...patient, illustrating variants that are detected at low VAF frequency in the primary tumor but greatly enriched in the metastatic lesion, or that are...more than one of the 9 samples, several of them, including PIK3CA E545K (not detectable at 25X read depth in patient 1 primary tumor, but present at
Hybrid selection for sequencing pathogen genomes from clinical samples

PubMed Central

2011-01-01

We have adapted a solution hybrid selection protocol to enrich pathogen DNA in clinical samples dominated by human genetic material. Using mock mixtures of human and Plasmodium falciparum malaria parasite DNA as well as clinical samples from infected patients, we demonstrate an average of approximately 40-fold enrichment of parasite DNA after hybrid selection. This approach will enable efficient genome sequencing of pathogens from clinical samples, as well as sequencing of endosymbiotic organisms such as Wolbachia that live inside diverse metazoan phyla. PMID:21835008
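Fold enrichment figures such as the roughly 40-fold value reported above are commonly computed as the ratio of the target-read fraction after selection to the fraction before selection. The sketch below illustrates that arithmetic only; the function name and the read counts are hypothetical and are not data from the study.

```python
def fold_enrichment(target_before, total_before, target_after, total_after):
    """Ratio of the target fraction after selection to the fraction before.

    Computed as (target_after / total_after) / (target_before / total_before),
    rearranged so that integer counts are multiplied before dividing.
    """
    if target_before == 0 or total_after == 0:
        raise ValueError("fold enrichment is undefined for these counts")
    return (target_after * total_before) / (total_after * target_before)

# Hypothetical counts: 1% pathogen reads before selection, 40% after.
print(fold_enrichment(10_000, 1_000_000, 400_000, 1_000_000))
```

Note that the measure saturates: a library that starts at 1% target content can never exceed 100-fold enrichment, so fold values are best compared between samples with similar starting fractions.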
Genome-wide identification and characterization of long non-coding RNAs in developmental skeletal muscle of fetal goat.

PubMed

Zhan, Siyuan; Dong, Yao; Zhao, Wei; Guo, Jiazhong; Zhong, Tao; Wang, Linjie; Li, Li; Zhang, Hongping

2016-08-22

Long non-coding RNAs (lncRNAs) have been studied extensively over the past few years. Large numbers of lncRNAs have been identified in mouse, rat, and human, and some of them have been shown to play important roles in muscle development and myogenesis. However, there are few reports on the characterization of lncRNAs covering all the development stages of skeletal muscle in livestock. RNA libraries constructed from developing longissimus dorsi muscle of fetal (45, 60, and 105 days of gestation) and postnatal (3 days after birth) goat (Capra hircus) were sequenced. A total of 1,034,049,894 clean reads were generated. Among them, 3981 lncRNA transcripts corresponding to 2739 lncRNA genes were identified, including 3515 intergenic lncRNAs and 466 anti-sense lncRNAs. Notably, in pairwise comparisons between the libraries of skeletal muscle at the different development stages, a total of 577 transcripts were differentially expressed (P < 0.05), which was validated by qPCR using six randomly selected lncRNA genes. The identified goat lncRNAs shared some characteristics, such as fewer exons and shorter length, with the lncRNAs in other mammals. We also found that 1153 lncRNA genes neighbored 1455 protein-coding genes (<10 kb upstream and downstream) and were functionally enriched in transcriptional regulation and development-related processes, indicating they may be in cis-regulatory relationships. Additionally, Pearson's correlation coefficients of co-expression levels suggested 1737 lncRNAs and 19,422 mRNAs were possibly in trans-regulatory relationships (r > 0.95 or r < -0.95). These co-expressed mRNAs were enriched in development-related biological processes such as muscle system processes, regulation of cell growth, muscle cell development, regulation of transcription, and embryonic morphogenesis.
This study provides a catalog of goat muscle-related lncRNAs, and will contribute to a fuller understanding of the molecular mechanism underpinning muscle development in mammals.
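The trans-regulatory candidates above were selected by a Pearson correlation cutoff (r > 0.95 or r < -0.95). As a rough illustration of that criterion, the sketch below computes Pearson's r for two expression vectors and applies the same threshold; the helper names and the four-stage expression values are hypothetical, and the study's actual pipeline is not described in the abstract.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def coexpressed(x, y, threshold=0.95):
    """True when |r| exceeds the threshold (strong positive or negative)."""
    return abs(pearson_r(x, y)) > threshold

# Hypothetical expression levels across four developmental stages.
lncrna = [1.0, 2.0, 3.0, 4.0]
mrna_anticorrelated = [8.0, 6.0, 4.0, 2.0]
mrna_loosely_related = [1.0, 3.0, 2.0, 4.0]

print(coexpressed(lncrna, mrna_anticorrelated))
print(coexpressed(lncrna, mrna_loosely_related))
```

With only a handful of stages, very high |r| values can arise by chance, which is one reason such pairs are usually described as candidates rather than confirmed regulatory relationships.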
America After 3PM Special Report on Summer: Missed Opportunities, Unmet Demand

ERIC Educational Resources Information Center

Afterschool Alliance, 2010

2010-01-01

For many children in America, summer vacation means camp, trips to new or familiar destinations, visits to museums, parks and libraries, and a variety of enriching activities--either with families or as part of a summer learning program. But for millions of others, when schools close for the summer, safe and enriching learning environments are out…
Phylogenetic analysis of TCE-dechlorinating consortia enriched on a variety of electron donors.

PubMed

Freeborn, Ryan A; West, Kimberlee A; Bhupathiraju, Vishvesh K; Chauhan, Sadhana; Rahm, Brian G; Richardson, Ruth E; Alvarez-Cohen, Lisa

2005-11-01

Two rapidly fermented electron donors, lactate and methanol, and two slowly fermented electron donors, propionate and butyrate, were selected for enrichment studies to evaluate the characteristics of anaerobic microbial consortia that reductively dechlorinate TCE to ethene. Each electron donor enrichment subculture demonstrated the ability to dechlorinate TCE to ethene through several serial transfers. Microbial community analyses based upon 16S rDNA, including terminal restriction fragment length polymorphism (T-RFLP) and clone library/sequencing, were performed to assess major changes in microbial community structure associated with electron donors capable of stimulating reductive dechlorination. Results demonstrated that five phylogenic subgroups or genera of bacteria were present in all consortia, including Dehalococcoides sp., low G+C Gram-positives (mostly Clostridium and Eubacterium sp.), Bacteroides sp., Citrobacter sp., and delta Proteobacteria (mostly Desulfovibrio sp.). Phylogenetic association indicates that only minor shifts in the microbial community structure occurred between the four alternate electron donor enrichments and the parent consortium. Inconsistent detection of Dehalococcoides spp. in clone libraries and T-RFLP of enrichment subcultures was resolved using quantitative polymerase chain reaction (Q-PCR). Q-PCR with primers specific to Dehalococcoides 16S rDNA resulted in positive detection of this species in all enrichments. Our results suggest that TCE-dechlorinating consortia can be stably maintained on a variety of electron donors and that quantities of Dehalococcoides cells detected with Dehalococcoides specific 16S rDNA primer/probe sets do not necessarily correlate well with solvent degradation rates.
Tissue-Specific Transcriptomic Profiling of Sorghum propinquum using a Rice Genome Array

PubMed Central

Zhang, Ting; Zhao, Xiuqin; Huang, Liyu; Liu, Xiaoyue; Zong, Ying; Zhu, Linghua; Yang, Daichang; Fu, Binying

2013-01-01

Sorghum (Sorghum bicolor) is one of the world's most important cereal crops. S. propinquum is a perennial wild relative of S. bicolor with well-developed rhizomes. Functional genomics analysis of S. propinquum, especially with respect to molecular mechanisms related to rhizome growth and development, can contribute to the development of more sustainable grain, forage, and bioenergy cropping systems. In this study, we used a whole rice genome oligonucleotide microarray to obtain tissue-specific gene expression profiles of S. propinquum with special emphasis on rhizome development. A total of 548 tissue-enriched genes were detected, including 31 and 114 unique genes that were expressed predominantly in the rhizome tips (RT) and internodes (RI), respectively. Further GO analysis indicated that the functions of these tissue-enriched genes corresponded to their characteristic biological processes. A few distinct cis-elements, including ABA-responsive RY repeat CATGCA, sugar-repressive TTATCC, and GA-responsive TAACAA, were found to be prevalent in RT-enriched genes, implying an important role in rhizome growth and development. Comprehensive comparative analysis of these rhizome-enriched genes and rhizome-specific genes previously identified in Oryza longistaminata and S. propinquum indicated that phytohormones, including ABA, GA, and SA, are key regulators of gene expression during rhizome development. Co-localization of rhizome-enriched genes with rhizome-related QTLs in rice and sorghum generated functional candidates for future cloning of genes associated with rhizome growth and development. PMID:23536906
Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus.

PubMed

Condon, David E; Tran, Phu V; Lien, Yu-Chin; Schug, Jonathan; Georgieff, Michael K; Simmons, Rebecca A; Won, Kyoung-Jae

2018-02-05

Identification of differentially methylated regions (DMRs) is the initial step towards the study of DNA methylation-mediated gene regulation. Previous approaches to call DMRs suffer from false prediction, use extreme resources, and/or require library installation and input conversion. We developed a new approach called Defiant to identify DMRs. Employing Weighted Welch Expansion (WWE), Defiant showed superior performance to other predictors in the series of benchmarking tests on artificial and real data. Defiant was subsequently used to investigate DNA methylation changes in iron-deficient rat hippocampus. Defiant identified DMRs close to genes associated with neuronal development and plasticity, which were not identified by its competitor. Importantly, Defiant runs between 5 and 479 times faster than currently available software packages. Also, Defiant accepts 10 different input formats widely used for DNA methylation data. Defiant effectively identifies DMRs for whole-genome bisulfite sequencing (WGBS), reduced-representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-seq), and HpaII tiny fragment enrichment by ligation-mediated PCR-tag (HELP) assays.
De Novo Assembly and Comparative Transcriptome Analysis Provide Insight into Lysine Biosynthesis in Toona sinensis Roem.

PubMed

Zhang, Xia; Song, Zhenqiao; Liu, Tian; Guo, Linlin; Li, Xingfeng

2016-01-01

Toona sinensis Roem is a popular leafy vegetable in Chinese cuisine and is also used as a traditional Chinese medicine. In this study, leaf samples were collected from the same plant at two developmental stages and then used for high-throughput Illumina RNA-sequencing (RNA-Seq). 125,884 transcripts and 54,628 unigenes were obtained through de novo assembly. A total of 25,570 could be annotated with known biological functions, which indicated that the T. sinensis leaves and shoots were undergoing multiple developmental processes, especially active metabolic processes. Analysis of differentially expressed unigenes between the two libraries showed that lysine biosynthesis was an enriched KEGG pathway, and candidate genes involved in the lysine biosynthesis pathway in T. sinensis leaves and shoots were identified. Our results provide a primary analysis of the gene expression profiles of T. sinensis leaf and shoot at different developmental stages and afford a valuable resource for genetic and genomic research on plant lysine biosynthesis.
Development of microsatellite loci in Artocarpus altilis (Moraceae) and cross-amplification in congeneric species

PubMed Central

Witherup, Colby; Ragone, Diane; Wiesner-Hanks, Tyr; Irish, Brian; Scheffler, Brian; Simpson, Sheron; Zee, Francis; Zuberi, M. Iqbal; Zerega, Nyree J. C.

2013-01-01

• Premise of the study: Microsatellite loci were isolated and characterized from enriched genomic libraries of Artocarpus altilis (breadfruit) and tested in four Artocarpus species and one hybrid. The microsatellite markers provide new tools for further studies in Artocarpus. • Methods and Results: A total of 25 microsatellite loci were evaluated across four Artocarpus species and one hybrid. Twenty-one microsatellite loci were evaluated on A. altilis (241), A. camansi (34), A. mariannensis (15), and A. altilis × mariannensis (64) samples. Nine of those loci plus four additional loci were evaluated on A. heterophyllus (jackfruit, 426) samples. All loci are polymorphic for at least one species. The average number of alleles ranges from two to nine within taxa. • Conclusions: These microsatellite primers will facilitate further studies on the genetic structure and evolutionary and domestication history of Artocarpus species. They will aid in cultivar identification and establishing germplasm conservation strategies for breadfruit and jackfruit. PMID:25202565
Isolation and Characterization of Eleven Polymorphic Microsatellite Loci for the Valuable Medicinal Plant Dendrobium huoshanense and Cross-Species Amplification

PubMed Central

Wang, Hui; Chen, Nai-Fu; Zheng, Ji-Yang; Wang, Wen-Cai; Pei, Yun-Yun; Zhu, Guo-Ping

2012-01-01

Dendrobium huoshanense (Orchidaceae) is a perennial herb and a widely used medicinal plant in Traditional Chinese medicine (TCM) endemic to Huoshan County town in Anhui province in Southeast China. A microsatellite-enriched genomic DNA library of D. huoshanense was developed and screened to identify marker loci. Eleven polymorphic loci were isolated and analyzed by screening 25 individuals collected from a natural population. The number of alleles per locus ranged from 2 to 5. The observed and expected heterozygosities ranged from 0.227 to 0.818 and from 0.317 to 0.757, respectively. Two loci showed significant deviations from Hardy-Weinberg equilibrium and four of the pairwise comparisons of loci revealed linkage disequilibrium (p < 0.05). These microsatellite loci were cross-amplified for five congeneric species and seven loci can be amplified in all species. These simple sequence repeats (SSR) markers are useful in genetic studies of D. huoshanense and other related species and in conservation decision-making. PMID:23222682
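Observed and expected heterozygosity, as reported above for each locus, are standard summary statistics for microsatellite data. The sketch below shows the textbook definitions for a single locus: observed heterozygosity as the fraction of heterozygous individuals, and expected heterozygosity under Hardy-Weinberg proportions as one minus the sum of squared allele frequencies. The function name and genotypes are hypothetical, and the abstract does not say which estimator (for example, a sample-size-corrected one) the authors used.

```python
from collections import Counter

def heterozygosities(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Expected heterozygosity is the uncorrected form, 1 - sum(p_i ** 2).
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1.0 - sum(
        (count / total_alleles) ** 2 for count in allele_counts.values()
    )
    return observed, expected

# Hypothetical allele sizes (bp) for four individuals at one locus.
genotypes = [(150, 150), (150, 154), (154, 154), (150, 154)]
ho, he = heterozygosities(genotypes)
print(ho, he)
```

A large gap between the two values at a locus is the kind of signal that a formal Hardy-Weinberg test would then evaluate.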
Characterization of microsatellite loci in Festuca gautieri (Poaceae) and transferability to F. eskia and F. ×picoeuropeana.

PubMed

Segarra-Moragues, José Gabriel; Catalán, Pilar

2011-12-01

Enriched genomic libraries were used to isolate and characterize microsatellite loci in Festuca gautieri, an important plant component of subalpine calcareous grasslands of the eastern Iberian Peninsula, the Pyrenees, and the Cantabrian Mountains. Microsatellites were required to investigate landscape genetics across its distribution range and at a narrower geographical scale within the Ordesa y Monte Perdido, Aigüestortes, and Picos de Europa Spanish national parks. Ten polymorphic microsatellite loci were characterized. They amplified a total of 116 alleles in a sample of 30 individuals of F. gautieri, showing high levels of genetic diversity (expected heterozygosity = 0.821). Cross-species transferability to two other close congeners, F. eskia and F. ×picoeuropeana, increased the total number of alleles to 137. These taxa showed lower numbers of alleles but similar levels of genetic diversity to F. gautieri. These microsatellite primers will be useful in population and landscape genetics and in establishing conservation strategies for these characteristic elements of subalpine pastures.
Isolation of anonymous DNA sequences from within a submicroscopic X chromosomal deletion in a patient with choroideremia, deafness, and mental retardation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nussbaum, R.L.; Lesko, J.G.; Lewis, R.A.

1987-09-01

Choroideremia, an X-chromosome linked retinal dystrophy of unknown pathogenesis, causes progressive nightblindness and eventual central blindness in affected males by the third to fourth decade of life. Choroideremia has been mapped to Xq13-21 by tight linkage to restriction fragment length polymorphism loci. The authors have recently identified two families in which choroideremia is inherited with mental retardation and deafness. In family XL-62, an interstitial deletion Xq21 is visible by cytogenetic analysis and two linked anonymous DNA markers, DXYS1 and DXS72, are deleted. In the second family, XL-45, an interstitial deletion was suspected on phenotypic grounds but could not be confirmed by high-resolution cytogenetic analysis. They used phenol-enhanced reassociation of 48,XXXX DNA in competition with excess XL-45 DNA to generate a library of cloned DNA enriched for sequences that might be deleted in XL-45. Two of the first 83 sequences characterized from the library were found to be deleted in probands from family XL-45 as well as from family XL-62. Isolation of these sequences proves that XL-45 does contain a submicroscopic deletion and provides a starting point for identifying overlapping genomic sequences that span the XL-45 deletion. Each overlapping sequence will be studied to identify exons from the choroideremia locus.
Next-generation AAV vectors for clinical use: an ever-accelerating race.

PubMed

Weinmann, Jonas; Grimm, Dirk

2017-10-01

During the past five decades, it has become evident that Adeno-associated virus (AAV) represents one of the most potent, most versatile, and thus most auspicious platforms available for gene delivery into cells, animals and, ultimately, humans. Particularly attractive is the ease with which the viral capsid (the major determinant of virus-host interaction including cell specificity and antibody recognition) can be modified and optimized at will. This has motivated countless researchers to develop high-throughput technologies in which genetically engineered AAV capsid libraries are subjected to a vastly hastened emulation of natural evolution, with the aim to enrich novel synthetic AAV capsids displaying superior features for clinical application. While the power and potential of these forward genetics approaches is undisputed, they are also inherently challenging as success depends on a combination of library quality, fidelity, and complexity. Here, we will describe and discuss two original, very exciting strategies that have emerged over the last three years and that promise to alleviate at least some of these concerns, namely, (i) a reverse genetics approach termed "ancestral AAV sequence reconstruction," and (ii) AAV genome barcoding as a technology that can advance both forward and reverse genetics stratagems. Notably, despite the conceptual differences of these two technologies, they pursue the same goal, which is tailored acceleration of AAV evolution and thus winning the race for the next-generation AAV vectors for clinical use.
Novel high throughput pooled shRNA screening identifies NQO1 as a potential drug target for host directed therapy for tuberculosis

PubMed Central

Li, Qing; Karim, Ahmad F.; Ding, Xuedong; Das, Biswajit; Dobrowolski, Curtis; Gibson, Richard M.; Quiñones-Mateu, Miguel E.; Karn, Jonathan; Rojas, Roxana E.

2016-01-01

Chemical regulation of macrophage function is one key strategy for developing host-directed adjuvant therapies for tuberculosis (TB). A critical step to develop these therapies is the identification and characterization of specific macrophage molecules and pathways with a high potential to serve as drug targets. Using a barcoded lentivirus-based pooled short-hairpin RNA (shRNA) library combined with next generation sequencing, we identified 205 silenced host genes highly enriched in mycobacteria-resistant macrophages. Twenty-one of these “hits” belonged to the oxidoreductase functional category. NAD(P)H:quinone oxidoreductase 1 (NQO1) was the top oxidoreductase “hit”. NQO1 expression was increased after mycobacterial infection, and NQO1 knockdown increased macrophage differentiation, NF-κB activation, and the secretion of pro-inflammatory cytokines TNF-α and IL-1β in response to infection. This suggests that mycobacteria hijack NQO1 to down-regulate pro-inflammatory and anti-bacterial functions. The competitive inhibitor of NQO1 dicoumarol synergized with rifampin to promote intracellular killing of mycobacteria. Thus, NQO1 is a new host target in mycobacterial infection that could potentially be exploited to increase antibiotic efficacy in vivo. Our findings also suggest that pooled shRNA libraries could be valuable tools for genome-wide screening in the search for novel druggable host targets for adjunctive TB therapies. PMID:27297123
Reconstructing rare soil microbial genomes using in situ enrichments and metagenomics

PubMed Central

Delmont, Tom O.; Eren, A. Murat; Maccario, Lorrie; Prestat, Emmanuel; Esen, Özcan C.; Pelletier, Eric; Le Paslier, Denis; Simonet, Pascal; Vogel, Timothy M.

2015-01-01

Despite extensive direct sequencing efforts and advanced analytical tools, reconstructing microbial genomes from soil using metagenomics has been challenging due to the tremendous diversity and relatively uniform distribution of genomes found in this system. Here we used enrichment techniques in an attempt to decrease the complexity of a soil microbiome prior to sequencing by submitting it to a range of physical and chemical stresses in 23 separate microcosms for 4 months. The metagenomic analysis of these microcosms at the end of the treatment yielded 540 Mb of assembly using standard de novo assembly techniques (a total of 559,555 genes and 29,176 functions), from which we could recover novel bacterial genomes, plasmids and phages. The recovered genomes belonged to Leifsonia (n = 2), Rhodanobacter (n = 5), Acidobacteria (n = 2), Sporolactobacillus (n = 2, novel nitrogen fixing taxon), Ktedonobacter (n = 1, second representative of the family Ktedonobacteraceae), Streptomyces (n = 3, novel polyketide synthase modules), and Burkholderia (n = 2, includes mega-plasmids conferring mercury resistance). Assembled genomes averaged to 5.9 Mb, with relative abundances ranging from rare (<0.0001%) to relatively abundant (>0.01%) in the original soil microbiome. Furthermore, we detected them in samples collected from geographically distant locations, particularly more in temperate soils compared to samples originating from high-latitude soils and deserts. To the best of our knowledge, this study is the first successful attempt to assemble multiple bacterial genomes directly from a soil sample. Our findings demonstrate that developing pertinent enrichment conditions can stimulate environmental genomic discoveries that would have been impossible to achieve with canonical approaches that focus solely upon post-sequencing data treatment. PMID:25983722
Not All Particles Are Equal: The Selective Enrichment of Particle-Associated Bacteria from the Mediterranean Sea.

PubMed

López-Pérez, Mario; Kimes, Nikole E; Haro-Moreno, Jose M; Rodriguez-Valera, Francisco

2016-01-01

We have used two metagenomic approaches, direct sequencing of natural samples and sequencing after enrichment, to characterize communities of prokaryotes associated to particles. In the first approximation, different size filters (0.22 and 5 μm) were used to identify prokaryotic microbes of free-living and particle-attached bacterial communities in the Mediterranean water column. A subtractive metagenomic approach was used to characterize the dominant microbial groups in the large size fraction that were not present in the free-living one. They belonged mainly to Actinobacteria, Planctomycetes, Flavobacteria and Proteobacteria. In addition, marine microbial communities enriched by incubation with different kinds of particulate material have been studied by metagenomic assembly. Different particle kinds (diatomaceous earth, sand, chitin and cellulose) were colonized by very different communities of bacteria belonging to Roseobacter, Vibrio, Bacteriovorax, and Lacinutrix that were distant relatives of genomes already described from marine habitats. Besides, using assembly from deep metagenomic sequencing from the particle-specific enrichments we were able to determine a total of 20 groups of contigs (eight of them with >50% completeness) and reconstruct de novo five new genomes of novel species within marine clades (>79% completeness and <1.8% contamination). We also describe for the first time the genome of a marine Rhizobiales phage that seems to infect a broad range of Alphaproteobacteria and live in habitats as diverse as soil, marine sediment and water column. The metagenomic recruitment of the communities found by direct sequencing of the large size filter and by enrichment had nearly no overlap. These results indicate that these reconstructed genomes are part of the rare biosphere which exists at nominal levels under natural conditions.
Gel-seq: A Method for Simultaneous Sequencing Library Preparation of DNA and RNA Using Hydrogel Matrices.

PubMed

Hoople, Gordon D; Richards, Andrew; Wu, Yan; Pisano, Albert P; Zhang, Kun

2018-03-26

The ability to amplify and sequence either DNA or RNA from small starting samples has only been achieved in the last five years. Unfortunately, the standard protocols for generating genomic or transcriptomic libraries are incompatible and researchers must choose whether to sequence DNA or RNA for a particular sample. Gel-seq solves this problem by enabling researchers to simultaneously prepare libraries for both DNA and RNA starting with 100 - 1000 cells using a simple hydrogel device. This paper presents a detailed approach for the fabrication of the device as well as the biological protocol to generate paired libraries. We designed Gel-seq so that it could be easily implemented by other researchers; many genetics labs already have the necessary equipment to reproduce the Gel-seq device fabrication. Our protocol employs commonly-used kits for both whole-transcript amplification (WTA) and library preparation, which are also likely to be familiar to researchers already versed in generating genomic and transcriptomic libraries. Our approach allows researchers to bring to bear the power of both DNA and RNA sequencing on a single sample without splitting and with negligible added cost.
Gene-enriched draft genome of the cattle tick Rhipicephalus microplus: Assembly by the hybrid Pacific Biosciences/Illumina approach enabled analysis of the highly repetitive genome

USDA-ARS's Scientific Manuscript database

The genome of the cattle tick R. microplus, an ectoparasite with global distribution, is estimated to be 7.1 Gbp and consists of ~70% repetitive DNA. We report the first assembly of a tick genome that utilized a hybrid sequencing and assembly approach to capture the repetitive fractions of the genom...

Detection of Pseudomonas savastanoi pv. savastanoi in olive plants by enrichment and PCR.

PubMed

Penyalver, R; García, A; Ferrer, A; Bertolini, E; López, M M

2000-06-01

The sequence of the gene iaaL of Pseudomonas savastanoi EW2009 was used to design primers for PCR amplification. The iaaL-derived primers directed the amplification of a 454-bp fragment from genomic DNA isolated from 70 strains of P. savastanoi, whereas genomic DNA from 93 non-P. savastanoi isolates did not yield this amplified product. A previous bacterial enrichment in the semiselective liquid medium PVF-1 improved the PCR sensitivity level, allowing detection of 10 to 100 CFU/ml of plant extract. P. savastanoi was detected by the developed enrichment-PCR method in knots from different varieties of inoculated and naturally infected olive trees. Moreover, P. savastanoi was detected in symptomless stem tissues from naturally infected olive plants. This enrichment-PCR method is more sensitive and less cumbersome than the conventional isolation methods for detection of P. savastanoi.
USE OF COMPETITIVE GENOMIC HYBRIDIZATION TO ENRICH FOR GENOME-SPECIFIC DIFFERENCES BETWEEN TWO CLOSELY RELATED HUMAN FECAL INDICATOR BACTERIA

EPA Science Inventory

Enterococci are frequently used as indicators of fecal pollution in surface waters. To accelerate the identification of Enterococcus faecalis-specific DNA sequences, we employed a comparative genomic strategy utilizing a positive selection process to compare E. faec...

Characterization of Transcriptional Complexity during Adipose Tissue Development in Bovines of Different Ages and Sexes

PubMed Central

Zhou, Yang; Sun, Jiajie; Li, Congjun; Wang, Yanhong; Li, Lan; Cai, Hanfang; Lan, Xianyong; Lei, Chuzhao; Zhao, Xin; Chen, Hong

2014-01-01

Background: Adipose tissue has long been recognized to play an extremely important role in development. In bovines, it not only serves a fundamental function but also plays a key role in the quality of beef and, consequently, has drawn much public attention. Age and sex are two key factors that affect the development of adipose tissue, and there has not yet been a global study detailing the effects of these two factors on expressional differences of adipose tissues. Results: In this study, total RNA from the back fat of fetal bovines, adult bulls, adult heifers and adult steers was used to construct libraries for Illumina next-generation sequencing. We detected the expression levels of 12,233 genes, with over 3,000 differentially expressed genes when comparing fetal and adult patterns and an average of 1,000 differentially expressed genes when comparing adult patterns. Multiple Gene Ontology terms and pathways were found to be significantly enriched for these differentially expressed genes. Of the 12,233 detected genes, a total of 4,753 genes (38.85%) underwent alternative splicing events, and over 50% were specifically expressed in each library. Over 4,000 novel transcript units were discovered for one library, whereas only approximately 30% were considered to have coding ability, which supplied a large amount of information for the lncRNA study. Additionally, we detected 56,564 (fetal bovine), 65,154 (adult bull), 78,061 (adult heifer) and 86,965 (adult steer) putative single nucleotide polymorphisms located in coding regions of the four pooled libraries. Conclusion: Here, we present, for the first time, a complete dataset involving the spatial and temporal transcriptome of bovine adipose tissue using RNA-seq. These data will facilitate the understanding of the effects of age and sex on the development of adipose tissue and supply essential information towards further studies on the genomes of beef cattle and other related mammals. PMID:24983926

Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling

PubMed Central

Morin, Ryan D.; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Kirkpatrick, Robert; Butterfield, Yaron S.; Young, Alice C.; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C.; Matsuo, Corey; Wong, David; Yang, George S.; Smailus, Duane E.; Wetherby, Keith D.; Kwong, Peggy N.; Grimwood, Jane; Brinkley, Charles P.; Brown-John, Mabel; Reddix-Dugue, Natalie D.; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G.; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S.; Jang, Wonhee; Lee, Ed; Klein, Steven L.; Blakesley, Robert W.; Zeeberg, Barry R.; Narasimhan, Sudarshan; Weinstein, John N.; Pennacchio, Christa Prange; Myers, Richard M.; Green, Eric D.; Wagner, Lukas; Gerhard, Daniela S.; Marra, Marco A.; Jones, Steven J.M.; Holt, Robert A.

2006-01-01

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization. PMID:16672307

Global Genomic Analysis of Prostate, Breast and Pancreatic Cancer

DTIC Science & Technology

2012-10-01

fever virus (Lauck et al. 2011). The success of transposon-based genomic library construction for genomic analyses suggests that it should be possible...2011. Novel, divergent simian hemorrhagic Fever viruses in a wild ugandan red colobus Gertz et al. 140 Genome Research www.genome.org Cold Spring...2009. A strand-specific RNA-Seq analysis of the transcriptome of the typhoid bacillus Salmonella typhi. PLoS Genet 5: e1000569. doi: 10.1371

The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

PubMed

Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

2016-10-11

Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

Cpf1-Database: web-based genome-wide guide RNA library design for gene knockout screens using CRISPR-Cpf1.

PubMed

Park, Jeongbin; Bae, Sangsu

2018-03-15

Following the type II CRISPR-Cas9 system, type V CRISPR-Cpf1 endonucleases have been found to be applicable for genome editing in various organisms in vivo. However, there are as yet no web-based tools capable of optimally selecting guide RNAs (gRNAs) among all possible genome-wide target sites. Here, we present Cpf1-Database, a genome-wide gRNA library design tool for LbCpf1 and AsCpf1, which have DNA recognition sequences of 5'-TTTN-3' at the 5' ends of target sites. Cpf1-Database provides a sophisticated but simple way to design gRNAs for AsCpf1 nucleases on the genome scale. One can easily access the data using a straightforward web interface, and using the powerful collections feature one can easily design gRNAs for thousands of genes in a short time. Freely accessible at http://www.rgenome.net/cpf1-database/. Contact: sangsubae@hanyang.ac.kr.
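
The target-site rule described above (a 5'-TTTN-3' recognition sequence at the 5' end of the target) can be sketched as a simple scan over both strands. This is a minimal illustration only, not the Cpf1-Database implementation: the function name find_cpf1_targets, the 23-nt spacer length, and the example sequence are assumptions made for the sketch, and no off-target or efficiency scoring is attempted.

```python
import re

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_cpf1_targets(seq, spacer_len=23):
    """List candidate sites of the form TTTN + spacer on both strands.

    spacer_len=23 is an assumed, illustrative value.
    Returns (strand, start, pam, spacer) tuples; for the "-" strand the
    start index refers to the reverse-complemented sequence.
    """
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % spacer_len)
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits

if __name__ == "__main__":
    example = "GG" + "TTTA" + "ACGT" * 5 + "ACG" + "CC"  # made-up sequence
    for hit in find_cpf1_targets(example):
        print(hit)
```

A genome-scale tool would additionally need to filter candidates by genomic uniqueness and position within the gene, which this sketch leaves out.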

Genome-scale CRISPR-Cas9 knockout screening in human cells.

PubMed

Shalem, Ophir; Sanjana, Neville E; Hartenian, Ella; Shi, Xi; Scott, David A; Mikkelson, Tarjei; Heckl, Dirk; Ebert, Benjamin L; Root, David E; Doench, John G; Zhang, Feng

2014-01-03

The simplicity of programming the CRISPR (clustered regularly interspaced short palindromic repeats)-associated nuclease Cas9 to modify specific genomic loci suggests a new way to interrogate gene function on a genome-wide scale. We show that lentiviral delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library targeting 18,080 genes with 64,751 unique guide sequences enables both negative and positive selection screening in human cells. First, we used the GeCKO library to identify genes essential for cell viability in cancer and pluripotent stem cells. Next, in a melanoma model, we screened for genes whose loss is involved in resistance to vemurafenib, a therapeutic RAF inhibitor. Our highest-ranking candidates include previously validated genes NF1 and MED12, as well as novel hits NF2, CUL3, TADA2B, and TADA1. We observe a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation, demonstrating the promise of genome-scale screening with Cas9.

Genome-wide mapping of autonomous promoter activity in human cells

PubMed Central

van Arensbergen, Joris; FitzPatrick, Vincent D.; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J.; van Steensel, Bas

2017-01-01

Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of sequences that could be tested. Here we present Survey of Regulatory Elements (SuRE), a method to assay more than 10^8 DNA fragments, each 0.2–2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library is constructed of random genomic fragments upstream of a 20 bp barcode and decoded by paired-end sequencing. This library is then transfected into cells and transcribed barcodes are quantified in the RNA by high throughput sequencing. When applied to the human genome, we achieved a 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide. By computational modeling we delineated subregions within promoters that are relevant for their activity. For instance, we show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites. PMID:28024146

Characterization and comparative analysis of the genome of Puccinia sorghi Schwein, the causal agent of maize common rust.

PubMed

Rochi, Lucia; Diéguez, María José; Burguener, Germán; Darino, Martín Alejandro; Pergolesi, María Fernanda; Ingala, Lorena Romina; Cuyeu, Alba Romina; Turjanski, Adrián; Kreff, Enrique Domingo; Sacco, Francisco

2018-03-01

Rust fungi are among the most devastating pathogens of crop plants. The biotrophic fungus Puccinia sorghi Schwein (Ps) is responsible for maize common rust, an endemic disease of maize (Zea mays L.) in Argentina that causes significant yield losses in corn production. In spite of this, the Ps genomic sequence was not available. We used Illumina sequencing to rapidly produce the 99.6 Mb draft genome sequence of Ps race RO10H11247, derived from a single-uredinial isolate from infected maize leaves collected in the Argentine Corn Belt Region during 2010. High quality reads were obtained from 200 bp paired-end and 5000 bp mate-paired libraries and assembled in 15,722 scaffolds. A pipeline which combined an ab initio program with homology-based models and homology to in planta enriched ESTs from four cereal pathogenic fungi (the three sequenced wheat rusts and Ustilago maydis) was used to identify 21,087 putative coding sequences, of which 1599 might be part of the Ps RO10H11247 secretome. Among the 458 highly conserved protein families from the euKaryotic Orthologous Groups (KOG) that occur in a wide range of eukaryotic organisms, 97.5% have at least one member with high homology in the Ps assembly (TBlastN, E-value ≤ 1e-10) covering more than 50% of the length of the KOG protein. Comparative studies with the three sequenced wheat rust fungi, and microsynteny analysis involving Puccinia striiformis f. sp. tritici (Pst, wheat stripe rust fungus), support the quality achieved. The results presented here show the effectiveness of the Illumina strategy for sequencing dikaryotic genomes of non-model organisms and provide reliable DNA sequence information for genomic studies, including pathogenic mechanisms of this maize fungus and molecular marker design. Copyright © 2016 Elsevier Inc. All rights reserved.

Genome Sequence of Candidatus Nitrososphaera evergladensis from Group I.1b Enriched from Everglades Soil Reveals Novel Genomic Features of the Ammonia-Oxidizing Archaea

PubMed Central

Zhalnina, Kateryna V.; Dias, Raquel; Leonard, Michael T.; Dorr de Quadros, Patricia; Camargo, Flavio A. O.; Drew, Jennifer C.; Farmerie, William G.; Daroub, Samira H.; Triplett, Eric W.

2014-01-01

The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases; seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and a versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group. PMID:24999826

Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

PubMed

Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

2017-11-13

A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

Automated three-component synthesis of a library of γ-lactams

PubMed Central

Fenster, Erik; Hill, David; Reiser, Oliver

2012-01-01

Summary A three-component method for the synthesis of γ-lactams from commercially available maleimides, aldehydes, and amines was adapted to parallel library synthesis. Improvements to the chemistry over previous efforts include the optimization of the method to a one-pot process, the management of by-products and excess reagents, the development of an automated parallel sequence, and the adaption of the method to permit the preparation of enantiomerically enriched products. These efforts culminated in the preparation of a library of 169 γ-lactams. PMID:23209515

Improving microbial fitness in the mammalian gut by in vivo temporal functional metagenomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yaung, Stephanie J.; Deng, Luxue; Li, Ning

Elucidating functions of commensal microbial genes in the mammalian gut is challenging because many commensals are recalcitrant to laboratory cultivation and genetic manipulation. We present Temporal FUnctional Metagenomics sequencing (TFUMseq), a platform to functionally mine bacterial genomes for genes that contribute to fitness of commensal bacteria in vivo. Our approach uses metagenomic DNA to construct large-scale heterologous expression libraries that are tracked over time in vivo by deep sequencing and computational methods. To demonstrate our approach, we built a TFUMseq plasmid library using the gut commensal Bacteroides thetaiotaomicron (Bt) and introduced Escherichia coli carrying this library into germfree mice. Population dynamics of library clones revealed Bt genes conferring significant fitness advantages in E. coli over time, including carbohydrate utilization genes, with a Bt galactokinase central to early colonization, and subsequent dominance by a Bt glycoside hydrolase enabling sucrose metabolism coupled with co-evolution of the plasmid library and E. coli genome driving increased galactose utilization. Here, our findings highlight the utility of functional metagenomics for engineering commensal bacteria with improved properties, including expanded colonization capabilities in vivo.

Improving microbial fitness in the mammalian gut by in vivo temporal functional metagenomics

DOE PAGES

Yaung, Stephanie J.; Deng, Luxue; Li, Ning; ...

2015-03-11

Elucidating functions of commensal microbial genes in the mammalian gut is challenging because many commensals are recalcitrant to laboratory cultivation and genetic manipulation. We present Temporal FUnctional Metagenomics sequencing (TFUMseq), a platform to functionally mine bacterial genomes for genes that contribute to fitness of commensal bacteria in vivo. Our approach uses metagenomic DNA to construct large-scale heterologous expression libraries that are tracked over time in vivo by deep sequencing and computational methods. To demonstrate our approach, we built a TFUMseq plasmid library using the gut commensal Bacteroides thetaiotaomicron (Bt) and introduced Escherichia coli carrying this library into germfree mice. Population dynamics of library clones revealed Bt genes conferring significant fitness advantages in E. coli over time, including carbohydrate utilization genes, with a Bt galactokinase central to early colonization, and subsequent dominance by a Bt glycoside hydrolase enabling sucrose metabolism coupled with co-evolution of the plasmid library and E. coli genome driving increased galactose utilization. Here, our findings highlight the utility of functional metagenomics for engineering commensal bacteria with improved properties, including expanded colonization capabilities in vivo.

Overview of hybridization and detection techniques.

PubMed

Hilario, Elena

2007-01-01

A misconception regarding the sensitivity of nonradioactive methods for screening genomic DNA libraries often hinders the establishment of these environmentally friendly techniques in molecular biology laboratories. Nonradioactive probes, properly prepared and quantified, can detect DNA target molecules to the femtomole range. However, appropriate hybridization techniques and detection methods should also be adopted for an efficient use of nonradioactive techniques. Detailed descriptions of genomic library handling before and during the nonradioactive hybridization and detection are often omitted from publications. This chapter aims to fill this void by providing a collection of technical tips on hybridization and detection techniques.

School Libraries Addressing the Needs of ELL Students: Enhancing Language Acquisition, Confidence, and Cultural Fluency in ELL Students by Developing a Targeted Collection and Enriching Your Makerspace

ERIC Educational Resources Information Center

Murphy, Peggy Henderson

2018-01-01

English Language Learner (ELL) students are sometimes a small constituency. Many resources already in the library can be used to enhance their language acquisition, confidence, and cultural fluency--resources such as graphic novels, hi-lo books, and makerspace materials. This article discusses enhancing language acquisition, confidence, and…

The distribution and impact of common copy-number variation in the genome of the domesticated apple, Malus x domestica Borkh.

PubMed

Boocock, James; Chagné, David; Merriman, Tony R; Black, Michael A

2015-10-23

Copy number variation (CNV) is a common feature of eukaryotic genomes, and a growing body of evidence suggests that genes affected by CNV are enriched in processes that are associated with environmental responses. Here we use next generation sequence (NGS) data to detect copy-number variable regions (CNVRs) within the Malus x domestica genome, as well as to examine their distribution and impact. CNVRs were detected using NGS data derived from 30 accessions of M. x domestica analyzed using the read-depth method, as implemented in the CNVrd2 software. To improve the reliability of our results, we developed a quality control and analysis procedure that involved checking for organelle DNA, not repeat masking, and the determination of CNVR identity using a permutation testing procedure. Overall, we identified 876 CNVRs, which spanned 3.5% of the apple genome. To verify that detected CNVRs were not artifacts, we analyzed the B-allele frequencies (BAF) within a single nucleotide polymorphism (SNP) array dataset derived from a screening of 185 individual apple accessions and found the CNVRs were enriched for SNPs having aberrant BAFs (P < 1e-13, Fisher's exact test). Putative CNVRs overlapped 845 gene models and were enriched for resistance (R) gene models (P < 1e-22, Fisher's exact test). Of note was a cluster of resistance gene models on chromosome 2 near a region containing multiple major gene loci conferring resistance to apple scab. We present the first analysis and catalogue of CNVRs in the M. x domestica genome. The enrichment of the CNVRs with R gene models and their overlap with gene loci of agricultural significance draw attention to a form of unexplored genetic variation in apple. This research will underpin further investigation of the role that CNV plays within the apple genome.
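
The enrichment claims above rest on Fisher's exact test. As a rough sketch of how such a test works, the one-sided p-value is the hypergeometric probability of seeing at least the observed overlap by chance. The function name fisher_enrichment_p and every count in the example are hypothetical; the abstract does not report the underlying contingency tables or state whether the test was one- or two-sided.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for enrichment.

    N: total gene models, K: gene models in the category of interest,
    n: gene models overlapping CNVRs, k: overlapping models in the category.
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the function.
    p = fisher_enrichment_p(k=40, n=845, K=1000, N=60000)
    print(f"one-sided p = {p:.3g}")
```

Exact integer arithmetic via math.comb avoids the underflow that floating-point factorials would cause at genome-scale counts.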

Scribl: an HTML5 Canvas-based graphics library for visualizing genomic data over the web

PubMed Central

Miller, Chase A.; Anthony, Jon; Meyer, Michelle M.; Marth, Gabor

2013-01-01

Motivation: High-throughput biological research requires simultaneous visualization as well as analysis of genomic data, e.g. read alignments, variant calls and genomic annotations. Traditionally, such integrative analysis required desktop applications operating on locally stored data. Many current terabyte-size datasets generated by large public consortia projects, however, are already only feasibly stored at specialist genome analysis centers. As even small laboratories can afford very large datasets, local storage and analysis are becoming increasingly limiting, and it is likely that most such datasets will soon be stored remotely, e.g. in the cloud. These developments will require web-based tools that enable users to access, analyze and view vast remotely stored data with a level of sophistication and interactivity that approximates desktop applications. As rapidly dropping cost enables researchers to collect data intended to answer questions in very specialized contexts, developers must also provide software libraries that empower users to implement customized data analyses and data views for their particular application. Such specialized, yet lightweight, applications would empower scientists to better answer specific biological questions than possible with general-purpose genome browsers currently available. Results: Using recent advances in core web technologies (HTML5), we developed Scribl, a flexible genomic visualization library specifically targeting coordinate-based data such as genomic features, DNA sequence and genetic variants. Scribl simplifies the development of sophisticated web-based graphical tools that approach the dynamism and interactivity of desktop applications. Availability and implementation: Software is freely available online at http://chmille4.github.com/Scribl/ and is implemented in JavaScript with all modern browsers supported. 
Contact: gabor.marth@bc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23172864

The Total Library

PubMed Central

Annan, Gertrude L.

1968-01-01

Changing functions and techniques of today's libraries have led to questioning the very substance of the library of the future. Ralph Shaw points out that the total library must be “a living force for the enrichment of mankind.” The Medical Research Library of Brooklyn at its very inauguration is uniquely prepared and equipped to work toward that goal. It imaginatively serves the needs of the immediate area by generous sharing of resources, use of its computerized program, and participation in a state-wide system. The large collection of the Academy of Medicine of Brooklyn offers thousands of volumes for the historian, both great works as highlights of medical achievements and more modest contributions of both early and recent date. The total library must serve as an intellectual resource, as well as a mechanism for the rapid transfer of current information. PMID:5644795

Euchromatic Transposon Insertions Trigger Production of Novel Pi- and Endo-siRNAs at the Target Sites in the Drosophila Germline

PubMed Central

Olovnikov, Ivan; Abramov, Yuri; Kalmykova, Alla

2014-01-01

The control of transposable element (TE) activity in germ cells provides genome integrity over generations. A distinct small RNA–mediated pathway utilizing Piwi-interacting RNAs (piRNAs) suppresses TE expression in gonads of metazoans. In the fly, primary piRNAs derive from so-called piRNA clusters, which are enriched in damaged repeated sequences. These piRNAs launch a cycle of TE and piRNA cluster transcript cleavages resulting in the amplification of piRNA and TE silencing. Using genome-wide comparison of TE insertions and ovarian small RNA libraries from two Drosophila strains, we found that individual TEs inserted into euchromatic loci form novel dual-stranded piRNA clusters. Formation of the piRNA-generating loci by active individual TEs provides a more potent silencing response to the TE expansion. Like all piRNA clusters, individual TEs are also capable of triggering the production of endogenous small interfering (endo-si) RNAs. Small RNA production by individual TEs spreads into the flanking genomic regions including coding cellular genes. We show that formation of TE-associated small RNA clusters can down-regulate expression of nearby genes in ovaries. Integration of TEs into the 3′ untranslated region of actively transcribed genes induces piRNA production towards the 3′-end of transcripts, causing the appearance of genic piRNA clusters, a phenomenon that has been reported in different organisms. These data suggest a significant role of TE-associated small RNAs in the evolution of regulatory networks in the germline. PMID:24516406

Sequence-based novel genomic microsatellite markers for robust genotyping purposes in foxtail millet [Setaria italica (L.) P. Beauv].

PubMed

Gupta, Sarika; Kumari, Kajal; Sahu, Pranav Pankaj; Vidapu, Sudhakar; Prasad, Manoj

2012-02-01

The unavailability of microsatellite markers and a saturated genetic linkage map has restricted the genetic improvement of foxtail millet [Setaria italica (L.) P. Beauv.], despite the fact that in recent times it has been documented as a new model species for biofuel grasses. With the objective of generating a good number of microsatellite markers in foxtail millet cultivar 'Prasad', 690 clones were sequenced, which generated 112.95 kb of high quality sequence obtained from three genomic libraries, each enriched with different microsatellite repeat motifs. Microsatellites were identified in 512 (74.2%) of the 690 positive clones and 172 primer pairs (pp) were successfully designed from 249 (48.6%) unique SSR-containing clones. The efficacies of the microsatellite containing genomic sequences were established by superior primer designing ability (69%), PCR amplification efficiency (85.5%) and polymorphic potential (52%) in the parents of the F2 mapping population. Out of 172 pp, 147 functional markers showed a high level of cross-species amplification (~74%) in six grass species. The higher polymorphism rate and broad range of genetic diversity (0.30-0.69, averaging 0.58) obtained in the phylogenetic tree constructed using 52 microsatellite markers demonstrated the utility of the markers in germplasm characterization. In silico comparative mapping of 147 foxtail millet microsatellite containing sequences against the mapping data of sorghum (~18%), maize (~16%) and rice (~5%) indicated the presence of orthologous sequences of the foxtail millet in the respective species. The result thus demonstrates the applicability of microsatellite markers in various genotyping applications, determining phylogenetic relationships and comparative mapping in several important grass species.
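
Identifying microsatellites in sequenced clones amounts to searching for short motifs repeated in tandem. The sketch below shows one common way to do this with a regular expression. The function name find_ssrs and the thresholds (motifs of 2 to 6 nt, at least 5 repeats) are illustrative assumptions; the abstract does not state which tool or criteria the authors used.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative defaults, not the study's criteria.
    """
    seq = seq.upper()
    # Lazy quantifier prefers the shortest motif; \2 repeats that motif.
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(2)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs reported as longer motifs
        hits.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return hits

if __name__ == "__main__":
    clone = "GGG" + "AC" * 6 + "TTGCA"  # made-up clone sequence
    print(find_ssrs(clone))
```

Primer design would then target the unique flanking sequence on either side of each reported repeat.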

A cluster-based strategy for assessing the overlap between large chemical libraries and its application to a recent acquisition.

PubMed

Engels, Michael F M; Gibbs, Alan C; Jaeger, Edward P; Verbinnen, Danny; Lobanov, Victor S; Agrafiotis, Dimitris K

2006-01-01

We report on the structural comparison of the corporate collections of Johnson & Johnson Pharmaceutical Research & Development (JNJPRD) and 3-Dimensional Pharmaceuticals (3DP), performed in the context of the recent acquisition of 3DP by JNJPRD. The main objective of the study was to assess the druglikeness of the 3DP library and the extent to which it enriched the chemical diversity of the JNJPRD corporate collection. The two databases, at the time of acquisition, collectively contained more than 1.1 million compounds with a clearly defined structural description. The analysis was based on a clustering approach and aimed at providing an intuitive quantitative estimate and visual representation of this enrichment. A novel hierarchical clustering algorithm called divisive k-means was employed in combination with Kelley's cluster-level selection method to partition the combined data set into clusters, and the diversity contribution of each library was evaluated as a function of the relative occupancy of these clusters. Typical 3DP chemotypes enriching the diversity of the JNJPRD collection were catalogued and visualized using a modified maximum common substructure algorithm. The joint collection of JNJPRD and 3DP compounds was also compared to other databases of known medicinally active or druglike compounds. The potential of the methodology for the analysis of very large chemical databases is discussed.
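
The idea of judging each library's diversity contribution by the relative occupancy of clusters can be sketched as follows. This assumes cluster assignments are already available; it does not implement divisive k-means or Kelley's selection method, and the function name cluster_occupancy, the library labels, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def cluster_occupancy(assignments):
    """Fraction of each cluster's members contributed by each library.

    assignments: iterable of (cluster_id, library_name) pairs, one per compound.
    """
    counts = defaultdict(Counter)
    for cluster_id, library in assignments:
        counts[cluster_id][library] += 1
    return {
        cid: {lib: n / sum(c.values()) for lib, n in c.items()}
        for cid, c in counts.items()
    }

def exclusive_clusters(occupancy, library):
    """Clusters populated only by the given library (its unique chemotypes)."""
    return sorted(cid for cid, frac in occupancy.items() if set(frac) == {library})

if __name__ == "__main__":
    toy = [(0, "JNJPRD"), (0, "JNJPRD"), (0, "3DP"), (1, "3DP"), (2, "JNJPRD")]
    occ = cluster_occupancy(toy)
    print(occ)
    print(exclusive_clusters(occ, "3DP"))
```

Clusters occupied exclusively by the acquired library are the ones that add chemotypes absent from the existing collection.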

Broad-Enrich: functional interpretation of large sets of broad genomic regions.

PubMed

Cavalcante, Raymond G; Lee, Chee; Welch, Ryan P; Patil, Snehal; Weymouth, Terry; Scott, Laura J; Sartor, Maureen A

2014-09-01

Functional enrichment testing facilitates the interpretation of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) data in terms of pathways and other biological contexts. Previous methods developed and used to test for key gene sets affected in ChIP-seq experiments treat peaks as points, and are based on the number of peaks associated with a gene or a binary score for each gene. These approaches work well for transcription factors, but histone modifications often occur over broad domains, and across multiple genes. To incorporate the unique properties of broad domains into functional enrichment testing, we developed Broad-Enrich, a method that uses the proportion of each gene's locus covered by a peak. We show that our method has a well-calibrated false-positive rate, performing well with ChIP-seq data having broad domains compared with alternative approaches. We illustrate Broad-Enrich with 55 ENCODE ChIP-seq datasets using different methods to define gene loci. Broad-Enrich can also be applied to other datasets consisting of broad genomic domains such as copy number variations. A Web version and R package are available at http://broad-enrich.med.umich.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
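
The quantity named in the abstract, the proportion of a gene's locus covered by peaks, can be illustrated with a short interval computation. This is a minimal sketch and not the Broad-Enrich implementation: the function name `covered_fraction`, the half-open coordinate convention, and the example coordinates are assumptions made for illustration.

```python
def covered_fraction(locus, peaks):
    """Return the fraction of a locus (start, end) covered by peaks.

    Coordinates are half-open [start, end). Overlapping peaks are merged
    so that no base is counted twice.
    """
    locus_start, locus_end = locus
    if locus_end <= locus_start:
        raise ValueError("locus must have positive length")

    # Clip each peak to the locus and drop peaks that fall outside it.
    clipped = sorted(
        (max(s, locus_start), min(e, locus_end))
        for s, e in peaks
        if s < locus_end and e > locus_start
    )

    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Close the previous merged interval and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent peak: extend the current interval.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / (locus_end - locus_start)


# Hypothetical 1,000-bp locus with two overlapping peaks and one partial peak.
example = covered_fraction((1000, 2000), [(900, 1200), (1100, 1400), (1900, 2500)])
print(example)
```

A point-based method would score this locus as "three peaks"; the coverage proportion instead reflects how much of the locus the broad domains actually span.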

Feature co-localization landscape of the human genome

PubMed Central

Ng, Siu-Kin; Hu, Taobo; Long, Xi; Chan, Cheuk-Hin; Tsang, Shui-Ying; Xue, Hong

2016-01-01

Although feature co-localizations could serve as useful guide-posts to genome architecture, a comprehensive and quantitative feature co-localization map of the human genome has been lacking. Herein we show that, in contrast to the conventional bipartite division of genomic sequences into genic and inter-genic regions, pairwise co-localizations of forty-two genomic features in the twenty-two autosomes based on 50-kb to 2,000-kb sequence windows indicate a tripartite zonal architecture comprising Genic zones enriched with gene-related features and Alu-elements; Proximal zones enriched with MIR- and L2-elements, transcription-factor-binding-sites (TFBSs), and conserved-indels (CIDs); and Distal zones enriched with L1-elements. Co-localizations between single-nucleotide-polymorphisms (SNPs) and copy-number-variations (CNVs) reveal a fraction of sequence windows displaying steeply enhanced levels of SNPs, CNVs and recombination rates that point to active adaptive evolution in such pathways as immune response, sensory perceptions, and cognition. The strongest positive co-localization observed between TFBSs and CIDs suggests a regulatory role of CIDs in cooperation with TFBSs. The positive co-localizations of cancer somatic CNVs (CNVT) with all Proximal zone and most Genic zone features, in contrast to the distinctly more restricted co-localizations exhibited by germline CNVs (CNVG), reveal disparate distributions of CNVTs and CNVGs indicative of dissimilarity in their underlying mechanisms. PMID:26854351

Enrichment of individual KIR2DL4 sequences from genomic DNA using long-template PCR and allele-specific hybridization to magnetic bead-bound oligonucleotide probes.

PubMed

Roberts, C H; Turino, C; Madrigal, J A; Marsh, S G E

2007-06-01

DNA enrichment by allele-specific hybridization (DEASH) was used as a means to isolate individual alleles of the killer cell immunoglobulin-like receptor (KIR2DL4) gene from heterozygous genomic DNA. Using long-template polymerase chain reaction (LT-PCR), the complete KIR2DL4 gene was amplified from a cell line that had previously been characterized for its KIR gene content by PCR using sequence-specific primers (PCR-SSP). The whole gene amplicons were sequenced and we identified two heterozygous positions in accordance with the predictions of the PCR-SSP. The amplicons were then hybridized to allele-specific, biotinylated oligonucleotide probes and through binding to streptavidin-coated beads, the targeted alleles were enriched. A second PCR amplified only the exonic regions of the enriched allele, and these were then sequenced in full. We show DEASH to be capable of enriching single alleles from a heterozygous PCR product, and through sequencing the enriched DNA, we are able to produce complete coding sequences of the KIR2DL4 alleles in accordance with the typing predicted by PCR-SSP.

Functional assessment of human enhancer activities using whole-genome STARR-sequencing.

PubMed

Liu, Yuwen; Yu, Shan; Dhiman, Vineet K; Brunetti, Tonya; Eckart, Heather; White, Kevin P

2017-11-20

Genome-wide quantification of enhancer activity in the human genome has proven to be a challenging problem. Recent efforts have led to the development of powerful tools for enhancer quantification. However, because of genome size and complexity, these tools have yet to be applied to the whole human genome. In the current study, we use a human prostate cancer cell line, LNCaP as a model to perform whole human genome STARR-seq (WHG-STARR-seq) to reliably obtain an assessment of enhancer activity. This approach builds upon previously developed STARR-seq in the fly genome and CapSTARR-seq techniques in targeted human genomic regions. With an improved library preparation strategy, our approach greatly increases the library complexity per unit of starting material, which makes it feasible and cost-effective to explore the landscape of regulatory activity in the much larger human genome. In addition to our ability to identify active, accessible enhancers located in open chromatin regions, we can also detect sequences with the potential for enhancer activity that are located in inaccessible, closed chromatin regions. When treated with the histone deacetylase inhibitor, Trichostatin A, genes nearby this latter class of enhancers are up-regulated, demonstrating the potential for endogenous functionality of these regulatory elements. WHG-STARR-seq provides an improved approach to current pipelines for analysis of high complexity genomes to gain a better understanding of the intricacies of transcriptional regulation.

Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma

PubMed Central

Zhao, Ling-Hao; Liu, Xiao; Yan, He-Xin; Li, Wei-Yang; Zeng, Xi; Yang, Yuan; Zhao, Jie; Liu, Shi-Ping; Zhuang, Xue-Han; Lin, Chuan; Qin, Chen-Jie; Zhao, Yi; Pan, Ze-Ya; Huang, Gang; Liu, Hui; Zhang, Jin; Wang, Ruo-Yu; Yang, Yun; Wen, Wen; Lv, Gui-Shuai; Zhang, Hui-Lu; Wu, Han; Huang, Shuai; Wang, Ming-Da; Tang, Liang; Cao, Hong-Zhi; Wang, Ling; Lee, Tin-Lap; Jiang, Hui; Tan, Ye-Xiong; Yuan, Sheng-Xian; Hou, Guo-Jun; Tao, Qi-Fei; Xu, Qin-Guo; Zhang, Xiu-Qing; Wu, Meng-Chao; Xu, Xun; Wang, Jun; Yang, Huan-Ming; Zhou, Wei-Ping; Wang, Hong-Yang

2016-01-01

Hepatitis B virus (HBV) can integrate into the human genome, contributing to genomic instability and hepatocarcinogenesis. Here by conducting high-throughput viral integration detection and RNA sequencing, we identify 4,225 HBV integration events in tumour and adjacent non-tumour samples from 426 patients with HCC. We show that HBV is prone to integrate into rare fragile sites and functional genomic regions including CpG islands. We observe a distinct pattern in the preferential sites of HBV integration between tumour and non-tumour tissues. HBV insertional sites are significantly enriched in the proximity of telomeres in tumours. Recurrent HBV target genes are identified with few that overlap. The overall HBV integration frequency is much higher in tumour genomes of males than in females, with a significant enrichment of integration into chromosome 17. Furthermore, a cirrhosis-dependent HBV integration pattern is observed, affecting distinct targeted genes. Our data suggest that HBV integration has a high potential to drive oncogenic transformation. PMID:27703150

Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

PubMed

Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

2016-11-01

Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.

Genome Sequences of Streptomyces Phages Amela and Verse

PubMed Central

Layton, Sonya R.; Hemenway, Ryan M.; Munyoki, Christine M.; Barnes, Emory B.; Barnett, Sierra E.; Bond, Alec M.; Narvaez, Jessi M.; Sirisakd, Christie D.; Smith, Brandt R.; Swain, Justin; Syed, Orooj; Bowman, Charles A.; Russell, Daniel A.; Bhuiyan, Swapan; Donegan-Quick, Richard; Benjamin, Robert C.

2016-01-01

Amela and Verse are two Streptomyces phages isolated by enrichment on Streptomyces venezuelae (ATCC 10712) from two different soil samples. Amela has a genome length of 49,452, with 75 genes. Verse has a genome length of 49,483, with 75 genes. Both belong to the BD3 subcluster of Actinobacteriophage. PMID:26893416

Optimization and comparative analysis of plant organellar DNA enrichment methods suitable for next generation sequencing

USDA-ARS's Scientific Manuscript database

Plant organellar genomes contain large repetitive elements that may undergo pairing or recombination to form complex structures and/or sub-genomic fragments. Organellar genomes also exist in admixtures within a given cell or tissue type (heteroplasmy) and abundance of sub-types may change through de...

Population-based rare variant detection via pooled exome or custom hybridization capture with or without individual indexing.

PubMed

Ramos, Enrique; Levinson, Benjamin T; Chasnoff, Sara; Hughes, Andrew; Young, Andrew L; Thornton, Katherine; Li, Allie; Vallania, Francesco L M; Province, Michael; Druley, Todd E

2012-12-06

Rare genetic variation in the human population is a major source of pathophysiological variability and has been implicated in a host of complex phenotypes and diseases. Finding disease-related genes harboring disparate functional rare variants requires sequencing of many individuals across many genomic regions and comparing against unaffected cohorts. However, despite persistent declines in sequencing costs, population-based rare variant detection across large genomic target regions remains cost prohibitive for most investigators. In addition, DNA samples are often precious and hybridization methods typically require large amounts of input DNA. Pooled sample DNA sequencing is a cost and time-efficient strategy for surveying populations of individuals for rare variants. We set out to 1) create a scalable, multiplexing method for custom capture with or without individual DNA indexing that was amenable to low amounts of input DNA and 2) expand the functionality of the SPLINTER algorithm for calling substitutions, insertions and deletions across either candidate genes or the entire exome by integrating the variant calling algorithm with the dynamic programming aligner, Novoalign. We report methodology for pooled hybridization capture with pre-enrichment, indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person. Modified solid phase reversible immobilization bead purification strategies enable no sample transfers from sonication in 96-well plates through adapter ligation, resulting in 50% less library preparation reagent consumption. Custom Y-shaped adapters containing novel 7 base pair index sequences with a Hamming distance of ≥2 were directly ligated onto fragmented source DNA eliminating the need for PCR to incorporate indexes, and was followed by a custom blocking strategy using a single oligonucleotide regardless of index sequence. 
These results were obtained aligning raw reads against the entire genome using Novoalign followed by variant calling of non-indexed pools using SPLINTER or SAMtools for indexed samples. With these pipelines, we find sensitivity and specificity of 99.4% and 99.7% for pooled exome sequencing. Sensitivity, and to a lesser degree specificity, proved to be a function of coverage. For rare variants (≤2% minor allele frequency), we achieved sensitivity and specificity of ≥94.9% and ≥99.99% for custom capture of 2.5 Mb in multiplexed libraries of 22-48 individuals with only ≥5-fold coverage/chromosome, but these parameters improved to ≥98.7 and 100% with 20-fold coverage/chromosome. This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
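
A minimum pairwise Hamming distance of 2 means that no single base substitution can turn one index into another. The check below is a minimal sketch of that property; the four 7-base indexes are invented for illustration and are not the index sequences used in the study.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_hamming(indexes):
    """Return the smallest Hamming distance over all pairs of indexes."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


# Hypothetical 7-base indexes, invented for illustration only.
indexes = ["ACGTACG", "ACGTGTG", "TTGTACA", "GACTACG"]
print(min_pairwise_hamming(indexes) >= 2)
```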

ParallABEL: an R library for generalized parallelization of genome-wide association studies.

PubMed

Sangket, Unitsa; Mahasirimongkol, Surakameth; Chantratita, Wasun; Tandayya, Pichaya; Aulchenko, Yurii S

2010-04-29

Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. Acquiring the programming skills necessary to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous. Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. A data set from the North American Rheumatoid Arthritis Consortium (NARAC), comprising genotypes of 2,062 individuals at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses.
For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance, and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
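
The partitioning and speed-up arithmetic described above can be made concrete with a small sketch. It is written in Python for illustration and does not reproduce ParallABEL or Rmpi code: the helper names `split_evenly` and `speedup` are hypothetical, and the timings are the approximate figures quoted in the abstract.

```python
def split_evenly(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if w < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def speedup(serial_hours, parallel_hours):
    """Speed-up is the serial run time divided by the parallel run time."""
    return serial_hours / parallel_hours


# 545,080 SNPs divided among 8 processors (SNP-wise statistics).
chunks = split_evenly(545_080, 8)
# Approximate timings from the abstract: about 8 h serial, about 1 h on 8 processors.
s = speedup(8.0, 1.0)
efficiency = s / 8
print([len(c) for c in chunks], s, efficiency)
```

A speed-up equal to the number of processors, that is, an efficiency of 1.0, is what "linearly reduced" amounts to here.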

The Essential Genome of Escherichia coli K-12.

PubMed

Goodall, Emily C A; Robinson, Ashley; Johnston, Iain G; Jabbari, Sara; Turner, Keith A; Cunningham, Adam F; Lund, Peter A; Cole, Jeffrey A; Henderson, Ian R

2018-02-20

Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. IMPORTANCE Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in Escherichia coli, we constructed a transposon mutant library of unprecedented density. Initial automated analysis of the resulting data revealed many discrepancies compared to the literature.
We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the E. coli model laboratory organism. This paper is important because it provides a better understanding of the essential genes of E. coli, reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data. Copyright © 2018 Goodall et al.

Novel Approaches to Breast Cancer Prevention and Inhibition of Metastases

DTIC Science & Technology

2013-10-01

allow a functional characterization of human candidate breast cancer genes. The transgenic RNAi library is covering the whole Drosophila genome ...W81XWH-12-1-0093 / Penninger 15. SUBJECT TERMS Genome wide functional genetics, haploid stem cells, Drosophila cancer modeling...With the advent of modern genomics hundreds of candidate genes have been associated with breast cancer both in GWAS studies as well as by cancer genome

Adventures in the enormous: a 1.8 million clone BAC library for the 21.7 Gb genome of loblolly pine

Treesearch

Zenaida V. Magbanua; Seval Ozkan; Benjamin D. Bartlett; Philippe Chouvarine; Christopher A. Saski; Aaron Liston; Richard C. Cronn; C. Dana Nelson; Daniel G. Peterson

2011-01-01

Loblolly pine (LP; Pinus taeda L.) is the most economically important tree in the U.S. and a cornerstone species in southeastern forests. However, genomics research on LP and other conifers has lagged behind studies on flowering plants due, in part, to the large size of conifer genomes. As a means to accelerate conifer genome research, we...

Combining yeast display and competitive FACS to select rare hapten-specific clones from recombinant antibody libraries

DOE PAGES

Sun, Yue; Ban, Bhupal; Bradbury, Andrew; ...

2016-08-29

The development of antibodies to low molecular weight haptens remains challenging due to both the low immunogenicity of many haptens and the cross-reactivity of the protein carriers used to generate the immune response. Recombinant antibodies and novel display technologies have greatly advanced antibody development; however, new techniques are still required to select rare hapten-specific antibodies from large recombinant libraries. In the present study, we used a combination of phage and yeast display to screen an immune antibody library (size, 4.4 × 10^6) against hapten markers for petroleum contamination (phenanthrene and methylphenanthrenes). Selection via phage display was used first to enrich the library between 20- and 100-fold for clones that bound to phenanthrene-protein conjugates. The enriched libraries were subsequently transferred to a yeast display system and a newly developed competitive FACS procedure was employed to select rare hapten-specific clones. Competitive FACS increased the frequency of hapten-specific scFvs in our yeast-displayed scFvs from 0.025 to 0.005% in the original library to between 13 and 35% in selected pools. The presence of hapten-specific scFvs was confirmed by competitive ELISA using periplasmic protein. Three distinct antibody clones that recognize phenanthrene and methylphenanthrenes were selected, and their distinctive binding properties were characterized. To our knowledge, these are the first antibodies that can distinguish between methylated (petrogenic) versus unmethylated (pyrogenic) phenanthrenes; such antibodies will be useful in detecting the sources of environmental contamination. Furthermore, this selection method could be generally adopted in the selection of other hapten-specific recombinant antibodies.

Combining yeast display and competitive FACS to select rare hapten-specific clones from recombinant antibody libraries

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sun, Yue; Ban, Bhupal; Bradbury, Andrew

The development of antibodies to low molecular weight haptens remains challenging due to both the low immunogenicity of many haptens and the cross-reactivity of the protein carriers used to generate the immune response. Recombinant antibodies and novel display technologies have greatly advanced antibody development; however, new techniques are still required to select rare hapten-specific antibodies from large recombinant libraries. In the present study, we used a combination of phage and yeast display to screen an immune antibody library (size, 4.4 × 10^6) against hapten markers for petroleum contamination (phenanthrene and methylphenanthrenes). Selection via phage display was used first to enrich the library between 20- and 100-fold for clones that bound to phenanthrene-protein conjugates. The enriched libraries were subsequently transferred to a yeast display system and a newly developed competitive FACS procedure was employed to select rare hapten-specific clones. Competitive FACS increased the frequency of hapten-specific scFvs in our yeast-displayed scFvs from 0.025 to 0.005% in the original library to between 13 and 35% in selected pools. The presence of hapten-specific scFvs was confirmed by competitive ELISA using periplasmic protein. Three distinct antibody clones that recognize phenanthrene and methylphenanthrenes were selected, and their distinctive binding properties were characterized. To our knowledge, these are the first antibodies that can distinguish between methylated (petrogenic) versus unmethylated (pyrogenic) phenanthrenes; such antibodies will be useful in detecting the sources of environmental contamination. Furthermore, this selection method could be generally adopted in the selection of other hapten-specific recombinant antibodies.

Genome re-annotation of the wild strawberry Fragaria vesca using extensive Illumina- and SMRT-based RNA-seq datasets

PubMed Central

Li, Yongping; Wei, Wei; Feng, Jia; Luo, Huifeng; Pi, Mengting; Liu, Zhongchi; Kang, Chunying

2018-01-01

The genome of the wild diploid strawberry species Fragaria vesca, an ideal model system of cultivated strawberry (Fragaria × ananassa, octoploid) and other Rosaceae family crops, was first published in 2011 and followed by a new assembly (Fvb). However, the annotation for Fvb mainly relied on ab initio predictions and included only predicted coding sequences; an improved annotation is therefore highly desirable. Here, a new annotation version named v2.0.a2 was created for the Fvb genome by a pipeline utilizing one PacBio library, 90 Illumina RNA-seq libraries, and 9 small RNA-seq libraries. Altogether, 18,641 genes (55.6% out of 33,538 genes) were augmented with information on the 5′ and/or 3′ UTRs, 13,168 (39.3%) protein-coding genes were modified or newly identified, and 7,370 genes were found to possess alternative isoforms. In addition, 1,938 long non-coding RNAs, 171 miRNAs, and 51,714 small RNA clusters were integrated into the annotation. This new annotation of F. vesca is substantially improved in both accuracy and integrity of gene predictions, which benefits gene functional studies in strawberry and comparative genomic analysis of other horticultural crops in the Rosaceae family. PMID:29036429

Clone DB: an integrated NCBI resource for clone-associated data

PubMed Central

Schneider, Valerie A.; Chen, Hsiu-Chuan; Clausen, Cliff; Meric, Peter A.; Zhou, Zhigang; Bouk, Nathan; Husain, Nora; Maglott, Donna R.; Church, Deanna M.

2013-01-01

The National Center for Biotechnology Information (NCBI) Clone DB (http://www.ncbi.nlm.nih.gov/clone/) is an integrated resource providing information about and facilitating access to clones, which serve as valuable research reagents in many fields, including genome sequencing and variation analysis. Clone DB represents an expansion and replacement of the former NCBI Clone Registry and has records for genomic and cell-based libraries and clones representing more than 100 different eukaryotic taxa. Records provide details of library construction, associated sequences, map positions and information about resource distribution. Clone DB is indexed in the NCBI Entrez system and can be queried by fields that include organism, clone name, gene name and sequence identifier. Whenever possible, genomic clones are mapped to reference assemblies and their map positions provided in clone records. Clones mapping to specific genomic regions can also be searched for using the NCBI Clone Finder tool, which accepts queries based on sequence coordinates or features such as gene or transcript names. Clone DB makes reports of library, clone and placement data on its FTP site available for download. With Clone DB, users now have available to them a centralized resource that provides them with the tools they will need to make use of these important research reagents. PMID:23193260

Clone and genomic repositories at the American Type Culture Collection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maglott, D.R.; Nierman, W.C.

1990-01-01

The American Type Culture Collection (ATCC) has a long history of characterizing, preserving, and distributing biological resource materials for the scientific community. Starting in 1925 as a repository for standard bacterial and fungal strains, its collections have diversified with technologic advances and in response to the requirements of its users. To serve the needs of the human genetics community, the National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), established an international Repository of Human DNA Probes and Libraries at the ATCC in 1985. This repository expanded the existing collections of recombinant clones and libraries at the ATCC, with the specific purposes of (1) obtaining, amplifying, and distributing probes detecting restriction fragment length polymorphisms (RFLPs); (2) obtaining, amplifying, and distributing genomic and cDNA clones from known genes independent of RFLP detection; (3) distributing the chromosome-specific libraries generated by the National Laboratory Gene Library Project at the Lawrence Livermore and Los Alamos National Laboratories; and (4) maintaining a public, online database describing the repository materials. Because it was recognized that animal models and comparative mapping can be crucial to genomic characterization, the scope of the repository was broadened in February 1989 to include probes from the mouse genome.

Reference quality assembly of the 3.5 Gb genome of Capsicum annuum from a single linked-read library

USDA-ARS's Scientific Manuscript database

Linked-Read sequencing technology has recently been employed successfully for de novo assembly of multiple human genomes, however the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5 gigabase (Gb) diploid pepper (Cap...

A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of synteny with model fish genomes

USDA-ARS's Scientific Manuscript database

In this paper we generated DNA fingerprints and end sequences from bacterial artificial chromosomes (BACs) from two new libraries to improve the first generation integrated physical and genetic map of the rainbow trout (Oncorhynchus mykiss) genome. The current version of the physical map is compose...

Transcriptome analysis reveals the genetic basis underlying the biosynthesis of volatile oil, gingerols, and diarylheptanoids in ginger (Zingiber officinale Rosc.).

PubMed

Jiang, Yusong; Liao, Qinhong; Zou, Yong; Liu, Yiqing; Lan, Jianbin

2017-10-23

Ginger (Zingiber officinale Rosc.) is a popular flavoring that is widely used in Asia, and the volatile oil in ginger rhizomes adds a special fragrance and taste to foods. The bioactive compounds in ginger, such as gingerols, diarylheptanoids, and flavonoids, are of significant value to human health because of their anticancer, anti-oxidant, and anti-inflammatory properties. However, because ginger is a non-model plant, knowledge about its genome sequences is extremely limited, and this limits molecular studies on this plant. In this study, de novo transcriptome sequencing was performed to investigate the expression of genes associated with the biosynthesis of major bioactive compounds in matured ginger rhizome (MG), young ginger rhizome (YG), and fibrous roots of ginger (FR). A total of 361,876 unigenes were generated by de novo assembly. The expression of genes involved in the pathways responsible for the biosynthesis of major bioactive compounds differed between tissues (MG, YG, and FR). Two pathways that give rise to volatile oil, gingerols, and diarylheptanoids, the "terpenoid backbone biosynthesis" and "stilbenoid, diarylheptanoid and gingerol biosynthesis" pathways, were significantly enriched (adjusted P value < 0.05) for differentially expressed genes (DEGs) (FDR < 0.005) both between the FR and YG libraries, and the FR and MG libraries. Most of the unigenes mapped in these two pathways, including curcumin synthase, phenylpropanoylacetyl-CoA synthase, trans-cinnamate 4-monooxygenase, and 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase, were expressed to a significantly higher level (log2(fold-change) ≥ 1) in FR than in YG or MG. This study provides the first insight into the biosynthesis of bioactive compounds in ginger at a molecular level and provides valuable genome resources for future molecular studies on ginger.
Moreover, our results establish that bioactive compounds in ginger may predominantly synthesized in the root and then transported to rhizomes, where they accumulate.

Sheep genome functional annotation reveals proximal regulatory elements contributed to the evolution of modern breeds.

PubMed

Naval-Sanchez, Marina; Nguyen, Quan; McWilliam, Sean; Porto-Neto, Laercio R; Tellam, Ross; Vuocolo, Tony; Reverter, Antonio; Perez-Enciso, Miguel; Brauning, Rudiger; Clarke, Shannon; McCulloch, Alan; Zamani, Wahid; Naderi, Saeid; Rezaei, Hamid Reza; Pompanon, Francois; Taberlet, Pierre; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Jhangiani, Shalini N; Cockett, Noelle; Daetwyler, Hans; Kijas, James

2018-02-28

Domestication fundamentally reshaped animal morphology, physiology and behaviour, offering the opportunity to investigate the molecular processes driving evolutionary change. Here we assess sheep domestication and artificial selection by comparing genome sequence from 43 modern breeds (Ovis aries) and their Asian mouflon ancestor (O. orientalis) to identify selection sweeps. Next, we provide a comparative functional annotation of the sheep genome, validated using experimental ChIP-Seq of sheep tissue. Using these annotations, we evaluate the impact of selection and domestication on regulatory sequences and find that sweeps are significantly enriched for protein coding genes, proximal regulatory elements of genes and genome features associated with active transcription. Finally, we find individual sites displaying strong allele frequency divergence are enriched for the same regulatory features. Our data demonstrate that remodelling of gene expression is likely to have been one of the evolutionary forces that drove phenotypic diversification of this common livestock species.

blend4php: a PHP API for galaxy

PubMed Central

Wytko, Connor; Soto, Brian; Ficklin, Stephen P.

2017-01-01

Galaxy is a popular framework for execution of complex analytical pipelines, typically for large data sets, and is commonly used for (but not limited to) genomic, genetic and related biological analysis. It provides a web front-end and integrates with high performance computing resources. Here we report the development of the blend4php library that wraps Galaxy’s RESTful API into a PHP-based library. PHP-based web applications can use blend4php to automate execution, monitoring and management of a remote Galaxy server, including its users, workflows, jobs and more. The blend4php library was specifically developed for the integration of Galaxy with Tripal, the open-source toolkit for the creation of online genomic and genetic web sites. However, it was designed as an independent library for use by any application, and is freely available under version 3 of the GNU Lesser General Public License (LGPL v3.0) at https://github.com/galaxyproject/blend4php. Database URL: https://github.com/galaxyproject/blend4php PMID:28077564

Reprogramming cell fate with a genome-scale library of artificial transcription factors.

PubMed

Eguchi, Asuka; Wleklinski, Matthew J; Spurgat, Mackenzie C; Heiderscheit, Evan A; Kropornicka, Anna S; Vu, Catherine K; Bhimsaria, Devesh; Swanson, Scott A; Stewart, Ron; Ramanathan, Parameswaran; Kamp, Timothy J; Slukvin, Igor; Thomson, James A; Dutton, James R; Ansari, Aseem Z

2016-12-20

Artificial transcription factors (ATFs) are precision-tailored molecules designed to bind DNA and regulate transcription in a preprogrammed manner. Libraries of ATFs enable the high-throughput screening of gene networks that trigger cell fate decisions or phenotypic changes. We developed a genome-scale library of ATFs that display an engineered interaction domain (ID) to enable cooperative assembly and synergistic gene expression at targeted sites. We used this ATF library to screen for key regulators of the pluripotency network and discovered three combinations of ATFs capable of inducing pluripotency without exogenous expression of Oct4 (POU domain, class 5, TF 1). Cognate site identification, global transcriptional profiling, and identification of ATF binding sites reveal that the ATFs do not directly target Oct4; instead, they target distinct nodes that converge to stimulate the endogenous pluripotency network. This forward genetic approach enables cell type conversions without a priori knowledge of potential key regulators and reveals unanticipated gene network dynamics that drive cell fate choices.

Construction, Characterization, and Preliminary BAC-End Sequence Analysis of a Bacterial Artificial Chromosome Library of the Tea Plant (Camellia sinensis)

PubMed Central

Lin, Jinke; Kudrna, Dave; Wing, Rod A.

2011-01-01

We describe the construction and characterization of a publicly available BAC library for the tea plant, Camellia sinensis. Using modified methods, the library was constructed with the aim of developing public molecular resources to advance tea plant genomics research. The library consists of a total of 401,280 clones with an average insert size of 135 kb, providing an approximate coverage of 13.5 haploid genome equivalents. No empty vector clones were observed in a random sampling of 576 BAC clones. Further analysis of 182 BAC-end sequences from randomly selected clones revealed a GC content of 40.35% and low chloroplast and mitochondrial contamination. Repetitive sequence analyses indicated that LTR retrotransposons were the most predominant sequence class (86.93%–87.24%), followed by DNA retrotransposons (11.16%–11.69%). Additionally, we found 25 simple sequence repeats (SSRs) that could potentially be used as genetic markers. PMID:21234344
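The coverage figure in the record above follows from simple arithmetic on the clone count and average insert size. The sketch below reworks that arithmetic; the haploid genome size is not stated in the abstract, so it is back-calculated here from the three quoted numbers, and the Clarke-Carbon probability is a standard textbook estimate added for illustration rather than a result reported by the authors. Function names are hypothetical.

```python
# Figures quoted in the abstract above.
NUM_CLONES = 401_280      # total BAC clones in the library
AVG_INSERT_KB = 135       # average insert size, in kb
COVERAGE = 13.5           # reported haploid genome equivalents


def implied_genome_size_gb(num_clones, avg_insert_kb, coverage):
    """Back-calculate the haploid genome size (in Gb) implied by the
    relation coverage = (clones * insert size) / genome size."""
    total_insert_kb = num_clones * avg_insert_kb
    return total_insert_kb / coverage / 1e6  # kb -> Gb


def prob_sequence_present(num_clones, avg_insert_kb, genome_size_gb):
    """Clarke-Carbon estimate: probability that a given sequence is
    present in at least one clone, P = 1 - (1 - f)^N, where f is the
    fraction of the genome carried by one insert."""
    f = avg_insert_kb / (genome_size_gb * 1e6)
    return 1.0 - (1.0 - f) ** num_clones


if __name__ == "__main__":
    size = implied_genome_size_gb(NUM_CLONES, AVG_INSERT_KB, COVERAGE)
    print(f"Implied haploid genome size: {size:.2f} Gb")
    p = prob_sequence_present(NUM_CLONES, AVG_INSERT_KB, size)
    print(f"Probability a given locus is represented: {p:.6f}")
```

Under these assumptions the quoted numbers imply a haploid genome of roughly 4 Gb, and at 13.5-fold coverage nearly every locus would be expected to appear in at least one clone.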

Assembling short reads from jumping libraries with large insert sizes.

PubMed

Vasilinetc, Irina; Prjibelski, Andrey D; Gurevich, Alexey; Korobeynikov, Anton; Pevzner, Pavel A

2015-10-15

Advances in Next-Generation Sequencing technologies and sample preparation recently enabled generation of high-quality jumping libraries that have the potential to significantly improve short read assemblies. However, assembly algorithms have to catch up with experimental innovations to benefit from them and to produce high-quality assemblies. We present a new algorithm that extends the recently described exSPAnder universal repeat resolution approach to enable its application to several challenging data types, including jumping libraries generated by the recently developed Illumina Nextera Mate Pair protocol. We demonstrate that, with these improvements, bacterial genomes often can be assembled in a few contigs using only a single Nextera Mate Pair library of short reads. Described algorithms are implemented in C++ as a part of SPAdes genome assembler, which is freely available at bioinf.spbau.ru/en/spades. Contact: ap@bioinf.spbau.ru. Supplementary data are available at Bioinformatics online.

Reprogramming cell fate with a genome-scale library of artificial transcription factors

PubMed Central

Eguchi, Asuka; Wleklinski, Matthew J.; Spurgat, Mackenzie C.; Heiderscheit, Evan A.; Kropornicka, Anna S.; Vu, Catherine K.; Bhimsaria, Devesh; Swanson, Scott A.; Stewart, Ron; Ramanathan, Parameswaran; Kamp, Timothy J.; Slukvin, Igor; Thomson, James A.; Dutton, James R.; Ansari, Aseem Z.

2016-01-01

Artificial transcription factors (ATFs) are precision-tailored molecules designed to bind DNA and regulate transcription in a preprogrammed manner. Libraries of ATFs enable the high-throughput screening of gene networks that trigger cell fate decisions or phenotypic changes. We developed a genome-scale library of ATFs that display an engineered interaction domain (ID) to enable cooperative assembly and synergistic gene expression at targeted sites. We used this ATF library to screen for key regulators of the pluripotency network and discovered three combinations of ATFs capable of inducing pluripotency without exogenous expression of Oct4 (POU domain, class 5, TF 1). Cognate site identification, global transcriptional profiling, and identification of ATF binding sites reveal that the ATFs do not directly target Oct4; instead, they target distinct nodes that converge to stimulate the endogenous pluripotency network. This forward genetic approach enables cell type conversions without a priori knowledge of potential key regulators and reveals unanticipated gene network dynamics that drive cell fate choices. PMID:27930301

Divergent genome evolution caused by regional variation in DNA gain and loss between human and mouse

PubMed Central

Kortschak, R. Daniel

2018-01-01

The forces driving the accumulation and removal of non-coding DNA and ultimately the evolution of genome size in complex organisms are intimately linked to genome structure and organisation. Our analysis provides a novel method for capturing the regional variation of lineage-specific DNA gain and loss events in their respective genomic contexts. To further understand this connection we used comparative genomics to identify genome-wide individual DNA gain and loss events in the human and mouse genomes. Focusing on the distribution of DNA gains and losses, relationships to important structural features and potential impact on biological processes, we found that in autosomes, DNA gains and losses both followed separate lineage-specific accumulation patterns. However, in both species chromosome X was particularly enriched for DNA gain, consistent with its high L1 retrotransposon content required for X inactivation. We found that DNA loss was associated with gene-rich open chromatin regions and DNA gain events with gene-poor closed chromatin regions. Additionally, we found that DNA loss events tended to be smaller than DNA gain events suggesting that they were able to accumulate in gene-rich open chromatin regions due to their reduced capacity to interrupt gene regulatory architecture. GO term enrichment showed that mouse loss hotspots were strongly enriched for terms related to developmental processes. However, these genes were also located in regions with a high density of conserved elements, suggesting that despite high levels of DNA loss, gene regulatory architecture remained conserved. This is consistent with a model in which DNA gain and loss results in turnover or “churning” in regulatory element dense regions of open chromatin, where interruption of regulatory elements is selected against. PMID:29677183

Divergence of Mammalian Higher Order Chromatin Structure Is Associated with Developmental Loci

PubMed Central

Chambers, Emily V.; Bickmore, Wendy A.; Semple, Colin A.

2013-01-01

Several recent studies have examined different aspects of mammalian higher order chromatin structure – replication timing, lamina association and Hi-C inter-locus interactions — and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. 
These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution. PMID:23592965

Enrichment analysis in high-throughput genomics - accounting for dependency in the NULL.

PubMed

Gold, David L; Coombes, Kevin R; Wang, Jing; Mallick, Bani

2007-03-01

Translating the overwhelming amount of data generated in high-throughput genomics experiments into biologically meaningful evidence, which may for example point to a series of biomarkers or hint at a relevant pathway, is a matter of great interest in bioinformatics these days. Genes showing similar experimental profiles, it is hypothesized, share biological mechanisms that if understood could provide clues to the molecular processes leading to pathological events. It is the topic of further study to learn if or how a priori information about the known genes may serve to explain coexpression. One popular method of knowledge discovery in high-throughput genomics experiments, enrichment analysis (EA), seeks to infer if an interesting collection of genes is 'enriched' for a particular set of a priori Gene Ontology Consortium (GO) classes. For the purposes of statistical testing, the conventional methods offered in EA software implicitly assume independence between the GO classes. Genes may be annotated for more than one biological classification, and therefore the resulting test statistics of enrichment between GO classes can be highly dependent if the overlapping gene sets are relatively large. There is a need to formally determine if conventional EA results are robust to the independence assumption. We derive the exact null distribution for testing enrichment of GO classes by relaxing the independence assumption using well-known statistical theory. In applications with publicly available data sets, our test results are similar to the conventional approach which assumes independence. We argue that the independence assumption is not detrimental.
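For orientation, the conventional per-class enrichment test that the record above takes as its baseline is commonly a one-sided hypergeometric (Fisher-type) test applied to each GO class separately. The sketch below illustrates only that conventional test under this assumption; it is not the authors' exact null distribution with the independence assumption relaxed, and the function name and example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, class_genes, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    total_genes: number of annotated genes in the background
    class_genes: number of background genes annotated to the GO class
    selected:    size of the 'interesting' gene list
    overlap:     number of selected genes that fall in the GO class
    """
    upper = min(class_genes, selected)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(class_genes, k) * comb(total_genes - class_genes, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, selected)


if __name__ == "__main__":
    # Hypothetical toy counts: 20 background genes, 5 in the class,
    # 5 selected, all 5 of which are in the class.
    print(hypergeom_enrichment_p(20, 5, 5, 5))
```

Running this test once per GO class treats the classes as independent, which is exactly the assumption the abstract examines: genes annotated to several overlapping classes make the per-class statistics correlated.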

Metagenomic and Metatranscriptomic Analyses Reveal the Structure and Dynamics of a Dechlorinating Community Containing Dehalococcoides mccartyi and Corrinoid-Providing Microorganisms under Cobalamin-Limited Conditions

DOE PAGES

Men, Yujie; Yu, Ke; Bælum, Jacob; ...

2017-02-10

The aim of this paper is to obtain a systems-level understanding of the interactions between Dehalococcoides and corrinoid-supplying microorganisms by analyzing community structures and functional compositions, activities, and dynamics in trichloroethene (TCE)-dechlorinating enrichments. Metagenomes and metatranscriptomes of the dechlorinating enrichments with and without exogenous cobalamin were compared. Seven putative draft genomes were binned from the metagenomes. At an early stage (2 days), more transcripts of genes in the Veillonellaceae bin-genome were detected in the metatranscriptome of the enrichment without exogenous cobalamin than in the one with the addition of cobalamin. Among these genes, sporulation-related genes exhibited the highest differential expression when cobalamin was not added, suggesting a possible release route of corrinoids from corrinoid producers. Other differentially expressed genes include those involved in energy conservation and nutrient transport (including cobalt transport). The most highly expressed corrinoid de novo biosynthesis pathway was also assigned to the Veillonellaceae bin-genome. Targeted quantitative PCR (qPCR) analyses confirmed higher transcript abundances of those corrinoid biosynthesis genes in the enrichment without exogenous cobalamin than in the enrichment with cobalamin. Furthermore, the corrinoid salvaging and modification pathway of Dehalococcoides was upregulated in response to the cobalamin stress. Finally, this study provides important insights into the microbial interactions and roles played by members of dechlorinating communities under cobalamin-limited conditions.

Metagenomic and Metatranscriptomic Analyses Reveal the Structure and Dynamics of a Dechlorinating Community Containing Dehalococcoides mccartyi and Corrinoid-Providing Microorganisms under Cobalamin-Limited Conditions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Men, Yujie; Yu, Ke; Bælum, Jacob

The aim of this paper is to obtain a systems-level understanding of the interactions between Dehalococcoides and corrinoid-supplying microorganisms by analyzing community structures and functional compositions, activities, and dynamics in trichloroethene (TCE)-dechlorinating enrichments. Metagenomes and metatranscriptomes of the dechlorinating enrichments with and without exogenous cobalamin were compared. Seven putative draft genomes were binned from the metagenomes. At an early stage (2 days), more transcripts of genes in the Veillonellaceae bin-genome were detected in the metatranscriptome of the enrichment without exogenous cobalamin than in the one with the addition of cobalamin. Among these genes, sporulation-related genes exhibited the highest differential expression when cobalamin was not added, suggesting a possible release route of corrinoids from corrinoid producers. Other differentially expressed genes include those involved in energy conservation and nutrient transport (including cobalt transport). The most highly expressed corrinoid de novo biosynthesis pathway was also assigned to the Veillonellaceae bin-genome. Targeted quantitative PCR (qPCR) analyses confirmed higher transcript abundances of those corrinoid biosynthesis genes in the enrichment without exogenous cobalamin than in the enrichment with cobalamin. Furthermore, the corrinoid salvaging and modification pathway of Dehalococcoides was upregulated in response to the cobalamin stress. Finally, this study provides important insights into the microbial interactions and roles played by members of dechlorinating communities under cobalamin-limited conditions.

Combining Quantitative Genetic Footprinting and Trait Enrichment Analysis to Identify Fitness Determinants of a Bacterial Pathogen

PubMed Central

Wiles, Travis J.; Norton, J. Paul; Russell, Colin W.; Dalley, Brian K.; Fischer, Kael F.; Mulvey, Matthew A.

2013-01-01

Strains of Extraintestinal Pathogenic Escherichia coli (ExPEC) exhibit an array of virulence strategies and are a major cause of urinary tract infections, sepsis and meningitis. Efforts to understand ExPEC pathogenesis are challenged by the high degree of genetic and phenotypic variation that exists among isolates. Determining which virulence traits are widespread and which are strain-specific will greatly benefit the design of more effective therapies. Towards this goal, we utilized a quantitative genetic footprinting technique known as transposon insertion sequencing (Tn-seq) in conjunction with comparative pathogenomics to functionally dissect the genetic repertoire of a reference ExPEC isolate. Using Tn-seq and high-throughput zebrafish infection models, we tracked changes in the abundance of ExPEC variants within saturated transposon mutant libraries following selection within distinct host niches. Nine hundred and seventy bacterial genes (18% of the genome) were found to promote pathogen fitness in either a niche-dependent or independent manner. To identify genes with the highest therapeutic and diagnostic potential, a novel Trait Enrichment Analysis (TEA) algorithm was developed to ascertain the phylogenetic distribution of candidate genes. TEA revealed that a significant portion of the 970 genes identified by Tn-seq have homologues more often contained within the genomes of ExPEC and other known pathogens, which, as suggested by the first axiom of molecular Koch's postulates, is considered to be a key feature of true virulence determinants. Three of these Tn-seq-derived pathogen-associated genes—a transcriptional repressor, a putative metalloendopeptidase toxin and a hypothetical DNA binding protein—were deleted and shown to independently affect ExPEC fitness in zebrafish and mouse models of infection. 
Together, the approaches and observations reported herein provide a resource for future pathogenomics-based research and highlight the diversity of factors required by a single ExPEC isolate to survive within varying host environments. PMID:23990803

A comprehensive proteomics and genomics analysis reveals novel transmembrane proteins in human platelets and mouse megakaryocytes including G6b-B, a novel ITIM protein

PubMed Central

Senis, Yotis A.; Tomlinson, Michael G.; García, Ángel; Dumon, Stephanie; Heath, Victoria L.; Herbert, John; Cobbold, Stephen P.; Spalton, Jennifer C.; Ayman, Sinem; Antrobus, Robin; Zitzmann, Nicole; Bicknell, Roy; Frampton, Jon; Authi, Kalwant; Martin, Ashley; Wakelam, Michael J.O.; Watson, Stephen P.

2007-01-01

Summary The platelet surface is poorly characterized due to the low abundance of many membrane proteins and the lack of specialist tools for their investigation. In this study we have identified novel human platelet and mouse megakaryocyte membrane proteins using specialist proteomic and genomic approaches. Three separate methods were used to enrich platelet surface proteins prior to identification by liquid chromatography and tandem mass spectrometry: lectin affinity chromatography; biotin/NeutrAvidin affinity chromatography; and free flow electrophoresis. Many known, abundant platelet surface transmembrane proteins and several novel proteins were identified using each receptor enrichment strategy. In total, two or more unique peptides were identified for 46, 68 and 22 surface membrane, intracellular membrane and membrane proteins of unknown sub-cellular localization, respectively. The majority of these were single transmembrane proteins. To complement the proteomic studies, we analysed the transcriptome of a highly purified preparation of mature primary mouse megakaryocytes using serial analysis of gene expression in view of the increasing importance of mutant mouse models in establishing protein function in platelets. This approach identified all of the major classes of platelet transmembrane receptors, including multi-transmembrane proteins. Strikingly, 17 of the 25 most megakaryocyte-specific genes (relative to 30 other SAGE libraries) were transmembrane proteins, illustrating the unique nature of the megakaryocyte/platelet surface. The list of novel plasma membrane proteins identified using proteomics includes the immunoglobulin superfamily member G6b, which undergoes extensive alternate splicing. Specific antibodies were used to demonstrate expression of the G6b-B isoform, which contains an immunoreceptor tyrosine-based inhibition motif. 
G6b-B undergoes tyrosine phosphorylation and association with the SH2-containing phosphatase, SHP-1, in stimulated platelets suggesting that it may play a novel role in limiting platelet activation. PMID:17186946

Genomic Hypomethylation in the Human Germline Associates with Selective Structural Mutability in the Human Genome

PubMed Central

Li, Jian; Harris, R. Alan; Cheung, Sau Wai; Coarfa, Cristian; Jeong, Mira; Goodell, Margaret A.; White, Lisa D.; Patel, Ankita; Kang, Sung-Hae; Shaw, Chad; Chinault, A. Craig; Gambin, Tomasz; Gambin, Anna; Lupski, James R.; Milosavljevic, Aleksandar

2012-01-01

The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR–mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease. PMID:22615578

Expanding Horizons and Encouraging New Perspectives through Myths: Experiments in Interactive Storytelling in an Elementary School Library

ERIC Educational Resources Information Center

Giffard, Sue

2016-01-01

The scenario that the author encountered when she began working in her present position was not unusual for an elementary school library: the students study a culture, and the librarian reads the stories of that culture to them to enrich the study and to make the culture come alive. The fourth-graders studied the Maya in the fall and the ancient…

Host-associated bacterial taxa from Chlorobi, Chloroflexi, GN02, Synergistetes, SR1, TM7, and WPS-2 Phyla/candidate divisions

PubMed Central

Camanocha, Anuj; Dewhirst, Floyd E.

2014-01-01

Background and objective: In addition to the well-known phyla Firmicutes, Proteobacteria, Bacteroidetes, Actinobacteria, Spirochaetes, Fusobacteria, Tenericutes, and Chlamydiae, the oral microbiomes of mammals contain species from the lesser-known phyla or candidate divisions, including Synergistetes, TM7, Chlorobi, Chloroflexi, GN02, SR1, and WPS-2. The objectives of this study were to create phyla-selective 16S rDNA PCR primer pairs, create selective 16S rDNA clone libraries, identify novel oral taxa, and update canine and human oral microbiome databases. Design: 16S rRNA gene sequences for members of the lesser-known phyla were downloaded from GenBank and Greengenes databases and aligned with sequences in our RNA databases. Primers with potential phylum level selectivity were designed heuristically with the goal of producing nearly full-length 16S rDNA amplicons. The specificity of primer pairs was examined by making clone libraries from PCR amplicons and determining phyla identity by BLASTN analysis. Results: Phylum-selective primer pairs were identified that allowed construction of clone libraries with 96–100% specificity for each of the lesser-known phyla. From these clone libraries, seven human and two canine novel oral taxa were identified and added to their respective taxonomic databases. For each phylum, genome sequences closest to human oral taxa were identified and added to the Human Oral Microbiome Database to facilitate metagenomic, transcriptomic, and proteomic studies that involve tiling sequences to the most closely related taxon. While examining ribosomal operons in lesser-known phyla from single-cell genomes and metagenomes, we identified a novel rRNA operon order (23S-5S-16S) in three SR1 genomes and the splitting of the 23S rRNA gene by an I-CeuI-like homing endonuclease in a WPS-2 genome. Conclusions: This study developed useful primer pairs for making phylum-selective 16S rRNA clone libraries. 
Phylum-specific libraries were shown to be useful for identifying previously unrecognized taxa in lesser-known phyla and would be useful for future environmental and host-associated studies. PMID:25317252
