Science.gov

Sample records for enriched genomic libraries

  1. Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries

    PubMed Central

    Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.

    2013-01-01

    Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
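
    The enrichment figures quoted in this abstract are ratios of mapped-read fractions before and after capture. A minimal Python sketch of that arithmetic follows, assuming per-library read counts are already in hand. The function name `fold_enrichment` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def fold_enrichment(pre_mapped, pre_total, post_mapped, post_total):
    """Return (pre_fraction, post_fraction, fold) for a single library.

    pre_*  : mapped and total read counts from precapture shotgun sequencing
    post_* : mapped and total read counts after capture
    """
    if pre_total <= 0 or post_total <= 0:
        raise ValueError("total read counts must be positive")
    pre_fraction = pre_mapped / pre_total
    post_fraction = post_mapped / post_total
    if pre_fraction == 0:
        raise ValueError("fold enrichment is undefined when no precapture reads map")
    return pre_fraction, post_fraction, post_fraction / pre_fraction


# Hypothetical counts, chosen only to illustrate the calculation.
pre, post, fold = fold_enrichment(12_000, 1_000_000, 300_000, 1_000_000)
print(f"{pre:.1%} mapped before capture, {post:.1%} after ({fold:.0f}-fold)")
```

    Whether duplicates are counted among the mapped reads changes both fractions, so the same convention should be applied to the precapture and postcapture libraries.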

  2. Development of microsatellite markers from an enriched genomic library for genetic analysis of melon (Cucumis melo L.)

    PubMed Central

    Ritschel, Patricia Silva; Lins, Tulio Cesar de Lima; Tristan, Rodrigo Lourenço; Buso, Gláucia Salles Cortopassi; Buso, José Amauri; Ferreira, Márcio Elias

    2004-01-01

    Background Despite the great advances in genomic technology observed in several crop species, the availability of molecular tools such as microsatellite markers has been limited in melon (Cucumis melo L.) and cucurbit species. The development of microsatellite markers will have a major impact on genetic analysis and breeding of melon, especially on the generation of marker saturated genetic maps and implementation of marker assisted breeding programs. Genomic microsatellite enriched libraries can be an efficient alternative for marker development in such species. Results Seven hundred clones containing microsatellite sequences from a Tsp-AG/TC microsatellite enriched library were identified and one-hundred and forty-four primer pairs designed and synthesized. When 67 microsatellite markers were tested on a panel of melon and other cucurbit accessions, 65 revealed DNA polymorphisms among the melon accessions. For some cucurbit species, such as Cucumis sativus, up to 50% of the melon microsatellite markers could be readily used for DNA polymorphism assessment, representing a significant reduction of marker development costs. A random sample of 25 microsatellite markers was extracted from the new microsatellite marker set and characterized on 40 accessions of melon, generating an allelic frequency database for the species. The average expected heterozygosity was 0.52, varying from 0.45 to 0.70, indicating that a small set of selected markers should be sufficient to solve questions regarding genotype identity and variety protection. Genetic distances based on microsatellite polymorphism were congruent with data obtained from RAPD marker analysis. Mapping analysis was initiated with 55 newly developed markers and most primers showed segregation according to Mendelian expectations. Linkage analysis detected linkage between 56% of the markers, distributed in nine linkage groups. Conclusions Genomic library microsatellite enrichment is an efficient procedure for marker
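
    The expected heterozygosity reported in this abstract is usually computed per locus from allele frequencies as He = 1 - sum(p_i^2) and then averaged over loci. The sketch below assumes that plain (uncorrected) estimator; the abstract does not say which variant the authors used, and the locus names and allele calls are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """He = 1 - sum(p_i^2) for one locus, given a list of observed allele calls."""
    counts = Counter(alleles)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("at least one allele call is required")
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def mean_expected_heterozygosity(loci):
    """Average He over a dict mapping locus name -> list of allele calls."""
    values = [expected_heterozygosity(calls) for calls in loci.values()]
    return sum(values) / len(values)


# Hypothetical allele calls (fragment sizes in bp) for two made-up loci.
example_loci = {
    "ssr_a": [150, 150, 154, 154],
    "ssr_b": [200, 200, 200, 204],
}
print(mean_expected_heterozygosity(example_loci))
```

    For small samples, an unbiased variant multiplies each per-locus value by n / (n - 1), where n is the number of allele copies sampled.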

  3. Development of microsatellite markers for common bean (Phaseolus vulgaris L.) based on screening of non-enriched, small-insert genomic libraries.

    PubMed

    Blair, Matthew W; Torres, Monica Muñoz; Pedraza, Fabio; Giraldo, Martha C; Buendía, Hector F; Hurtado, Natalia

    2009-09-01

    Microsatellite markers are useful genetic tools for a wide array of genomic analyses although their development is time-consuming and requires the identification of simple sequence repeats (SSRs) from genomic sequences. Screening of non-enriched, small-insert libraries is an effective method of SSR isolation that can give an unbiased picture of motif frequency. Here we adapt high-throughput protocols for the screening of plasmid-based libraries using robotic colony picking and filter preparation. Seven non-enriched genomic libraries from common bean genomic DNA were made by digestion with four frequently cutting restriction enzymes, double digestion with a frequently cutting restriction enzyme and a less frequently cutting restriction enzyme, or sonication. Library quality was compared and three of the small-insert libraries were selected for further analysis. Each library was plated and picked into 384-well plates that were used to create high-density filter arrays of over 18 000 clones each, which were screened with oligonucleotide probes for various SSR motifs. Positive clones were found to have low redundancy. One hundred SSR markers were developed and 80 were tested for polymorphism in a standard parental survey. These microsatellite markers derived from non-SSR-enriched libraries should be useful additions to previous markers developed from enriched libraries. PMID:19935925

  4. Optimized construction of microsatellite-enriched libraries

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The construction of simple sequence repeat (SSR) libraries is an indispensable tool to search for molecular markers as complete genome sequences are still not available for the majority of species of interest. Numerous protocols are available in the literature for the construction of SSR-enriched l...

  5. Work Enrichment for Academic Libraries.

    ERIC Educational Resources Information Center

    Martell, Charles; Untawale, Mercedes

    1983-01-01

    Explores important quality of work life strategy--job redesign--and discusses job enlargement and job enrichment. A case study of academic library personnel demonstrates how introduction of automated systems at University of California, Berkeley led to restructuring and enrichment of jobs. References and list of selected resources are appended.…

  6. Libraries for genomic SELEX.

    PubMed Central

    Singer, B S; Shtatland, T; Brown, D; Gold, L

    1997-01-01

    An increasing number of proteins are being identified that regulate gene expression by binding specific nucleic acids in vivo. A method termed genomic SELEX facilitates the rapid identification of networks of protein-nucleic acid interactions by identifying within the genomic sequences of an organism the highest affinity sites for any protein of the organism. As with its progenitor, SELEX of random-sequence nucleic acids, genomic SELEX involves iterative binding, partitioning, and amplification of nucleic acids. The two methods differ in that the variable region of the nucleic acid library for genomic SELEX is derived from the genome of an organism. We have used a quick and simple method to construct Escherichia coli, Saccharomyces cerevisiae, and human genomic DNA PCR libraries that can be transcribed with T7 RNA polymerase. We present evidence that the libraries contain overlapping inserts starting at most of the positions within the genome, making these libraries suitable for genomic SELEX. PMID:9016629

  7. Large-scale sequencing based on full-length-enriched cDNA libraries in pigs: contribution to annotation of the pig genome draft sequence

    PubMed Central

    2012-01-01

    Background Along with the draft sequencing of the pig genome, which has been completed by an international consortium, collection of the nucleotide sequences of genes expressed in various tissues and determination of entire cDNA sequences are necessary for investigations of gene function. The sequences of expressed genes are also useful for genome annotation, which is important for isolating the genes responsible for particular traits. Results We performed a large-scale expressed sequence tag (EST) analysis in pigs by using 32 full-length-enriched cDNA libraries derived from 28 kinds of tissues and cells, including seven tissues (brain, cerebellum, colon, hypothalamus, inguinal lymph node, ovary, and spleen) derived from pigs that were cloned from a sow subjected to genome sequencing. We obtained more than 330,000 EST reads from the 5′-ends of the cDNA clones. Comparison with human and bovine gene catalogs revealed that the ESTs corresponded to at least 15,000 genes. cDNA clones representing contigs and singlets generated by assembly of the EST reads were subjected to full-length determination of inserts. We have finished sequencing 31,079 cDNA clones corresponding to more than 12,000 genes. Mapping of the sequences of these cDNA clones on the draft sequence of the pig genome has indicated that the clones are derived from about 15,000 independent loci on the pig genome. Conclusions ESTs and cDNA sequences derived from full-length-enriched libraries are valuable for annotation of the draft sequence of the pig genome. This information will also contribute to the exploration of promoter sequences on the genome and to molecular biology-based analyses in pigs. PMID:23150988

  8. Enriching screening libraries with bioactive fragment space.

    PubMed

    Zhang, Na; Zhao, Hongtao

    2016-08-01

    By deconvoluting 238,073 bioactive molecules in the ChEMBL library into extended Murcko ring systems, we identified a set of 2245 ring systems present in at least 10 molecules. These ring systems belong to 2221 clusters by ECFP4 fingerprints with a minimum intracluster similarity of 0.8. Their overlap with ring systems in commercial libraries was further quantified. Our findings suggest that success of a small fragment library is driven by the convergence of effective coverage of bioactive ring systems (e.g., 10% coverage by 1000 fragments vs. 40% by 2 million HTS compounds), high enrichment of bioactive ring systems, and low molecular complexity enhancing the probability of a match with the protein targets. Reconciling with the previous studies, bioactive ring systems are underrepresented in screening libraries. As such, we propose a library of virtual fragments with key functionalities via fragmentation of bioactive molecules. Its utility is exemplified by a prospective application on protein kinase CK2, resulting in the discovery of a series of novel inhibitors with the most potent compound having an IC50 of 0.5 μM and a ligand efficiency of 0.41 kcal/mol per heavy atom. PMID:27311891

  9. REEF: searching REgionally Enriched Features in genomes

    PubMed Central

    Coppe, Alessandro; Danieli, Gian Antonio; Bortoluzzi, Stefania

    2006-01-01

    Background In Eukaryotic genomes, different features including genes are not uniformly distributed. The integration of annotation information and genomic position of functional DNA elements in the Eukaryotic genomes opened the way to test novel hypotheses of higher order genome organization and regulation of expression. Results REEF is a new tool, aimed at identifying genomic regions enriched in specific features, such as a class or group of genes homogeneous for expression and/or functional characteristics. The method for the calculation of local feature enrichment uses a test statistic based on the Hypergeometric Distribution, applied genome-wide by using a sliding window approach and adopting the False Discovery Rate for controlling multiplicity. REEF software, source code and documentation are freely available at . Conclusion REEF can help shed light on the role of organization of specific genomic regions in the determination of their functional role. PMID:17042935
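
    The approach summarized in this abstract (a hypergeometric test in sliding windows, with False Discovery Rate control) can be sketched in a few lines of standard-library Python. This is an illustration of the general technique, not REEF's implementation: it assumes windows are defined over an ordered list of genes rather than over base-pair coordinates, uses the Benjamini-Hochberg procedure as the FDR method, and all function names and the example flags are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K flagged items, n draws)."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def sliding_window_enrichment(flags, window, step=1):
    """Test every window of `window` consecutive genes for excess flagged genes.

    flags: sequence of 0/1 values in genomic order (1 = gene has the feature).
    Returns a list of (start_index, flagged_in_window, p_value) tuples.
    """
    N, K = len(flags), sum(flags)
    results = []
    for start in range(0, N - window + 1, step):
        k = sum(flags[start:start + window])
        results.append((start, k, hypergeom_sf(k, N, K, window)))
    return results


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: ten genes in order, three flagged genes clustered at the start.
flags = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
windows = sliding_window_enrichment(flags, window=3)
qvalues = benjamini_hochberg([p for _, _, p in windows])
for (start, k, p), q in zip(windows, qvalues):
    print(start, k, round(p, 4), round(q, 4))
```

    Overlapping windows are not independent tests, so the adjusted values here should be read as a screening aid rather than exact error rates.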

  10. Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability.

    PubMed

    Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B

    2015-01-01

    Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81 %, and 93.7 % of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4 %). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20 % of the loci were polymorphic with the polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicative of the usefulness of the developed markers. Markers at 40 loci (78.4 %) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioica, Momordica cochinchinensis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance. PMID:25240849
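
    Polymorphic information content (PIC) values such as those quoted in this abstract are commonly computed from allele frequencies with the formula of Botstein and colleagues, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below assumes that formula; the abstract does not state which definition the authors applied, and the frequencies in the usage line are hypothetical.

```python
from itertools import combinations


def polymorphic_information_content(freqs):
    """PIC for one locus, given allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with two equally frequent alleles.
print(polymorphic_information_content([0.5, 0.5]))
```

    A monomorphic locus gives a PIC of 0, and the value approaches 1 as the number of equally frequent alleles grows.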

  11. Hybridization Capture Using Short PCR Products Enriches Small Genomes by Capturing Flanking Sequences (CapFlank)

    PubMed Central

    Tsangaras, Kyriakos; Wales, Nathan; Sicheritz-Pontén, Thomas; Rasmussen, Simon; Michaux, Johan; Ishida, Yasuko; Morand, Serge; Kampmann, Marie-Louise; Gilbert, M. Thomas P.; Greenwood, Alex D.

    2014-01-01

    Solution hybridization capture methods utilize biotinylated oligonucleotides as baits to enrich homologous sequences from next generation sequencing (NGS) libraries. Coupled with NGS, the method generates kilo to gigabases of high confidence consensus targeted sequence. However, in many experiments, a non-negligible fraction of the resulting sequence reads are not homologous to the bait. We demonstrate that during capture, the bait-hybridized library molecules add additional flanking library sequences iteratively, such that baits limited to targeting relatively short regions (e.g. few hundred nucleotides) can result in enrichment across entire mitochondrial and bacterial genomes. Our findings suggest that some of the off-target sequences derived in capture experiments are non-randomly enriched, and that CapFlank will facilitate targeted enrichment of large contiguous sequences with minimal prior target sequence information. PMID:25275614

  12. Selective enrichment of damaged DNA molecules for ancient genome sequencing

    PubMed Central

    2014-01-01

    Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA—the presence of deoxyuracils—for selective enrichment of endogenous DNA against a complex background of contamination during DNA library preparation. By applying the method to Neanderthal DNA extracts that are heavily contaminated with present-day human DNA, we show that the fraction of useful sequence information increases ∼10-fold and that the resulting sequences are more efficiently depleted of human contamination than when using purely computational approaches. Furthermore, we show that U selection can lead to a four- to fivefold increase in the proportion of endogenous DNA sequences relative to those of microbial contaminants in some samples. U selection may thus help to lower the costs for ancient genome sequencing of nonhuman samples also. PMID:25081630

  13. Ancient whole genome enrichment using baits built from modern DNA.

    PubMed

    Enk, Jacob M; Devault, Alison M; Kuch, Melanie; Murgha, Yusuf E; Rouillard, Jean-Marie; Poinar, Hendrik N

    2014-05-01

    We report metrics from complete genome capture of nuclear DNA from extinct mammoths using biotinylated RNAs transcribed from an Asian elephant DNA extract. Enrichment of the nuclear genome ranged from 1.06- to 18.65-fold, to an apparent maximum threshold of ∼80% on-target. This projects an order of magnitude less costly complete genome sequencing from long-dead organisms, even when a reference genome is unavailable for bait design. PMID:24531081

  14. Consequences of Normalizing Transcriptomic and Genomic Libraries of Plant Genomes Using a Duplex-Specific Nuclease and Tetramethylammonium Chloride

    PubMed Central

    Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088

  15. Enriched domain detector: a program for detection of wide genomic enrichment domains robust against local variations

    PubMed Central

    Lund, Eivind; Oldenburg, Anja R.; Collas, Philippe

    2014-01-01

    Nuclear lamins contact the genome at the nuclear periphery through large domains and are involved in chromatin organization. Among broad peak calling algorithms available to date, none are suited for mapping lamin–genome interactions genome wide. We disclose a novel algorithm, enriched domain detector (EDD), for analysis of broad enrichment domains from chromatin immunoprecipitation (ChIP)-seq data. EDD enables discovery of genomic domains interacting with broadly distributed proteins, such as A- and B-type lamins affinity isolated by ChIP. The advantages of EDD over existing broad peak callers are sensitivity to domain width rather than enrichment strength at a particular site, and robustness against local variations. PMID:24782521

  16. Mimicking nature: Phosphopeptide enrichment using combinatorial libraries of affinity ligands.

    PubMed

    Batalha, Iris L; Zhou, Houjiang; Lilley, Kathryn; Lowe, Christopher R; Roque, Ana C A

    2016-07-29

    Phosphorylation is a reversible post-translational modification of proteins that controls a plethora of cellular processes and triggers specific physiological responses, for which there is a need to develop tools to characterize phosphorylated targets efficiently. Here, a combinatorial library of triazine-based synthetic ligands comprising 64 small molecules has been rationally designed, synthesized and screened for the enrichment of phosphorylated peptides. The lead candidate (coined A8A3), composed of histidine and phenylalanine mimetic components, showed high binding capacity and selectivity for binding mono- and multi-phosphorylated peptides at pH 3. Ligand A8A3 was coupled onto both cross-linked agarose and magnetic nanoparticles, presenting higher binding capacities (100-fold higher) when immobilized on the magnetic support. The magnetic adsorbent was further screened against a tryptic digest of two phosphorylated proteins (α- and β-caseins) and one non-phosphorylated protein (bovine serum albumin, BSA). The MALDI-TOF mass spectra of the eluted peptides allowed the identification of nine phosphopeptides, comprising both mono- and multi-phosphorylated peptides. PMID:27345211

  17. cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing

    PubMed Central

    Hartwig, Benjamin; Reinhardt, Richard; Schneeberger, Korbinian

    2016-01-01

    The utility of genome assemblies does not only rely on the quality of the assembled genome sequence, but also on the quality of the gene annotations. The Pacific Biosciences Iso-Seq technology is a powerful support for accurate eukaryotic gene model annotation as it allows for direct readout of full-length cDNA sequences without the need for noisy short read-based transcript assembly. We propose the implementation of the TeloPrime Full Length cDNA Amplification kit to the Pacific Biosciences Iso-Seq technology in order to enrich for genuine full-length transcripts in the cDNA libraries. We provide evidence that TeloPrime outperforms the commonly used SMARTer PCR cDNA Synthesis Kit in identifying transcription start and end sites in Arabidopsis thaliana. Furthermore, we show that TeloPrime-based Pacific Biosciences Iso-Seq can be successfully applied to the polyploid genome of bread wheat (Triticum aestivum) not only to efficiently annotate gene models, but also to identify novel transcription sites, gene homeologs, splicing isoforms and previously unidentified gene loci. PMID:27327613

  18. [An optimized method for construction of genomic library].

    PubMed

    Xu, Song; Zhang, Juan; Ma, Li-Xin

    2006-06-01

    Construction of genomic libraries is basic and important. Because of the laboriousness and high background of traditional methods for constructing genomic libraries, we improved them by overcoming these disadvantages. Two Ear I sites were chosen as the cloning sites, which can produce variable 3-base cohesive ends. Therefore the two overhangs could be devised to prevent a match and to avoid self-ligation of vector. Genomic DNA is cleaved partially with Sau3A I and subsequently incubated with dGTP and Klenow fragment of DNA polymerase I so the self-ligation of fragments and ligation between them are blocked. In this study, the ARS probe vector (pHBM803/Trp) based on the improved method was constructed and then we constructed the Oryza sativa genomic library separately with the traditional method and improved method and compared them. The result of experiment indicated that the improved method could optimize the quality of library. PMID:16818436

  19. TECHNIQUE FOR SCREENING AND MAINTAINING SMALLER GENOMIC LIBRARIES

    EPA Science Inventory

    A technique for screening and simultaneously maintaining individual clones of the gene library for long-term storage is described. This method is particularly useful for identification and cloning of genes from cosmid-based genomic libraries of prokaryotes that constitute a smalle...

  20. Adaptation of a commercial robot for genome library replication

    SciTech Connect

    Uber, D.C.; Searles, W.L.

    1994-01-01

    This report describes tools and fixtures developed at the Human Genome Center at Lawrence Berkeley Laboratory for the Hewlett-Packard ORCA™ (Optimized Robot for Chemical Analysis) to replicate large genome libraries. Photographs and engineering drawings of the various custom-designed components are included.

  1. Enriching User-Oriented Class Associations for Library Classification Schemes.

    ERIC Educational Resources Information Center

    Pu, Hsiao-Tieh; Yang, Chyan

    2003-01-01

    Explores the possibility of adding user-oriented class associations to hierarchical library classification schemes. Analyses a log of book circulation records from a university library in Taiwan and shows that classification schemes can be made more adaptable by analyzing circulation patterns of similar users. (Author/LRW)

  2. Enriching Critical Thinking and Language Learning with Educational Digital Libraries

    ERIC Educational Resources Information Center

    Lu, Hsin-lin

    2012-01-01

    As the amount of information available in online digital libraries increases exponentially, questions arise concerning the most productive way to use that information to advance learning. Applying the earlier information seeking theories advocated by Kelly (1963), Taylor (1968), and Belkin (1980) to the digital libraries experience, Carol Kuhlthau…

  3. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits.

    PubMed

    Karamitros, Timokratis; Magiorkinis, Gkikas

    2015-12-15

    The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage over the targets and their flanking regions. Oxford Nanopore Technologies recently released a USB 3.0-interfaced sequencer, the MinION. To date no particular method for enriching MinION libraries has been standardized. Here, using biotinylated PCR-generated baits in a novel approach, we describe a simple and efficient way for multiplexed enrichment of MinION libraries, overcoming technical limitations related to the chemistry of the sequencing-adapters and the length of the DNA fragments. Using Phage Lambda and Escherichia coli as models we selectively enrich for specific targets, significantly increasing the corresponding read-coverage and eliminating unwanted regions. We show that by capturing genomic fragments, which contain the target sequences, we recover reads that extend beyond the targeted regions and can thus be used for the determination of potentially unknown flanking sequences. By pooling enriched libraries derived from two distinct E. coli strains and analyzing them in parallel, we demonstrate the efficiency of this method in multiplexed format. Crucially we evaluated the optimal bait size for large fragment libraries and we describe for the first time a standardized method for target enrichment on the MinION platform. PMID:26240383

  4. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits

    PubMed Central

    Karamitros, Timokratis; Magiorkinis, Gkikas

    2015-01-01

    The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage over the targets and their flanking regions. Oxford Nanopore Technologies recently released a USB 3.0-interfaced sequencer, the MinION. To date no particular method for enriching MinION libraries has been standardized. Here, using biotinylated PCR-generated baits in a novel approach, we describe a simple and efficient way for multiplexed enrichment of MinION libraries, overcoming technical limitations related to the chemistry of the sequencing-adapters and the length of the DNA fragments. Using Phage Lambda and Escherichia coli as models we selectively enrich for specific targets, significantly increasing the corresponding read-coverage and eliminating unwanted regions. We show that by capturing genomic fragments, which contain the target sequences, we recover reads that extend beyond the targeted regions and can thus be used for the determination of potentially unknown flanking sequences. By pooling enriched libraries derived from two distinct E. coli strains and analyzing them in parallel, we demonstrate the efficiency of this method in multiplexed format. Crucially we evaluated the optimal bait size for large fragment libraries and we describe for the first time a standardized method for target enrichment on the MinION platform. PMID:26240383

  5. Deep Subsurface Life from North Pond: Enrichment, Isolation, Characterization and Genomes of Heterotrophic Bacteria

    PubMed Central

    Russell, Joseph A.; León-Zayas, Rosa; Wrighton, Kelly; Biddle, Jennifer F.

    2016-01-01

    Studies of subsurface microorganisms have yielded few environmentally relevant isolates for laboratory studies. In order to address this lack of cultivated microorganisms, we initiated several enrichments on sediment and underlying basalt samples from North Pond, a sediment basin ringed by basalt outcrops underlying an oligotrophic water-column west of the Mid-Atlantic Ridge at 22°N. In contrast to anoxic enrichments, growth was observed in aerobic, heterotrophic enrichments from sediment of IODP Hole U1382B at 4 and 68 m below seafloor (mbsf). These sediment depths, respectively, correspond to the fringes of oxygen penetration from overlying seawater in the top of the sediment column and upward migration of oxygen from oxic seawater from the basalt aquifer below the sediment. Here we report the enrichment, isolation, initial characterization and genomes of three isolated aerobic heterotrophs from North Pond sediments; an Arthrobacter species from 4 mbsf, and Paracoccus and Pseudomonas species from 68 mbsf. These cultivated bacteria are represented in the amplicon 16S rRNA gene libraries created from whole sediments, albeit at low (up to 2%) relative abundance. We provide genomic evidence from our isolates demonstrating that the Arthrobacter and Pseudomonas isolates have the potential to respire nitrate and oxygen, though dissimilatory nitrate reduction could not be confirmed in laboratory cultures. The cultures from this study represent members of abundant phyla, as determined by amplicon sequencing of environmental DNA extracts, and allow for further studies into geochemical factors impacting life in the deep subsurface. PMID:27242705

  6. Deep subsurface life from North Pond: Enrichment, isolation, characterization and genomes of heterotrophic bacteria

    DOE PAGESBeta

    Russell, Joseph A.; Leon-Zayas, Rosa; Wrighton, Kelly; Biddle, Jennifer F.

    2016-05-10

    Studies of subsurface microorganisms have yielded few environmentally relevant isolates for laboratory studies. In order to address this lack of cultivated microorganisms, we initiated several enrichments on sediment and underlying basalt samples from North Pond, a sediment basin ringed by basalt outcrops underlying an oligotrophic water column west of the Mid-Atlantic Ridge at 22° N. In contrast to anoxic enrichments, growth was observed in aerobic, heterotrophic enrichments from sediment of IODP Hole U1382B at 4 and 68 m below seafloor (mbsf). These sediment depths, respectively, correspond to the fringes of oxygen penetration from overlying seawater in the top of the sediment column and upward migration of oxygen from oxic seawater from the basalt aquifer below the sediment. Here we report the enrichment, isolation, initial characterization and genomes of three isolated aerobic heterotrophs from North Pond sediments; an Arthrobacter species from 4 mbsf, and Paracoccus and Pseudomonas species from 68 mbsf. These cultivated bacteria are represented in the amplicon 16S rRNA gene libraries created from whole sediments, albeit at low (up to 2%) relative abundance. We provide genomic evidence from our isolates demonstrating that the Arthrobacter and Pseudomonas isolates have the potential to respire nitrate and oxygen, though dissimilatory nitrate reduction could not be confirmed in laboratory cultures. Furthermore, the cultures from this study represent members of abundant phyla, as determined by amplicon sequencing of environmental DNA extracts, and allow for further studies into geochemical factors impacting life in the deep subsurface.

  7. Deep Subsurface Life from North Pond: Enrichment, Isolation, Characterization and Genomes of Heterotrophic Bacteria.

    PubMed

    Russell, Joseph A; León-Zayas, Rosa; Wrighton, Kelly; Biddle, Jennifer F

    2016-01-01

    Studies of subsurface microorganisms have yielded few environmentally relevant isolates for laboratory studies. In order to address this lack of cultivated microorganisms, we initiated several enrichments on sediment and underlying basalt samples from North Pond, a sediment basin ringed by basalt outcrops underlying an oligotrophic water-column west of the Mid-Atlantic Ridge at 22°N. In contrast to anoxic enrichments, growth was observed in aerobic, heterotrophic enrichments from sediment of IODP Hole U1382B at 4 and 68 m below seafloor (mbsf). These sediment depths, respectively, correspond to the fringes of oxygen penetration from overlying seawater in the top of the sediment column and upward migration of oxygen from oxic seawater from the basalt aquifer below the sediment. Here we report the enrichment, isolation, initial characterization and genomes of three isolated aerobic heterotrophs from North Pond sediments; an Arthrobacter species from 4 mbsf, and Paracoccus and Pseudomonas species from 68 mbsf. These cultivated bacteria are represented in the amplicon 16S rRNA gene libraries created from whole sediments, albeit at low (up to 2%) relative abundance. We provide genomic evidence from our isolates demonstrating that the Arthrobacter and Pseudomonas isolates have the potential to respire nitrate and oxygen, though dissimilatory nitrate reduction could not be confirmed in laboratory cultures. The cultures from this study represent members of abundant phyla, as determined by amplicon sequencing of environmental DNA extracts, and allow for further studies into geochemical factors impacting life in the deep subsurface. PMID:27242705

  8. Semantically Enriching the Search System of a Music Digital Library

    NASA Astrophysics Data System (ADS)

    de Juan, Paloma; Iglesias, Carlos

    Traditional search systems are usually based on keywords, a very simple and convenient mechanism to express a need for information. This is the most popular way of searching the Web, although it is not always an easy task to accurately summarize a natural language query in a few keywords. Working with keywords means losing the context, which is the only thing that can help us deal with ambiguity. This is the biggest problem of keyword-based systems. Semantic Web technologies seem a perfect solution to this problem, since they make it possible to represent the semantics of a given domain. In this chapter, we present three projects, Harmos, Semusici and Cantiga, whose aim is to provide access to a music digital library. We will describe two search systems, a traditional one and a semantic one, developed in the context of these projects and compare them in terms of usability and effectiveness.

  9. Raman spectroscopy detects phenotypic differences among Escherichia coli enriched for 1-butanol tolerance using a metagenomic DNA library.

    PubMed

    Freedman, Benjamin G; Zu, Theresah N K; Wallace, Robert S; Senger, Ryan S

    2016-07-01

    Advances in Raman spectroscopy are enabling more comprehensive measurement of microbial cell chemical composition. Advantages include results returned in near real-time and minimal sample preparation. In this research, Raman spectroscopy is used to analyze E. coli with engineered solvent tolerance, which is a multi-genic trait associated with complex and uncharacterized phenotypes that are of value to industrial microbiology. To generate solvent tolerant phenotypes, E. coli transformed with DNA libraries are serially enriched in the presence of 0.9% (v/v) and 1.1% (v/v) 1-butanol. DNA libraries are created using degenerate oligonucleotide primed PCR (DOP-PCR) from the genomic DNA of E. coli, Clostridium acetobutylicum ATCC 824, and the metagenome of a stream bank soil sample, which contained DNA from 72 different phyla. DOP-PCR enabled high efficiency library cloning (with no DNA shearing or end-polishing) and the inclusion of un-culturable organisms. Nine strains with improved tolerance are analyzed by Raman spectroscopy and vastly different solvent-tolerant phenotypes are characterized. Common among these are improved membrane rigidity from increasing the fraction of unsaturated fatty acids at the expense of cyclopropane fatty acids. Raman spectroscopy offers the ability to monitor cell phenotype changes in near real-time and is adaptable to high-throughput screening, making it relevant to metabolic engineering. PMID:26814030

  10. Full-Length Enriched cDNA Libraries and ORFeome Analysis of Sugarcane Hybrid and Ancestor Genotypes

    PubMed Central

    Becker, Scott; Pörtner-Taliana, Antje; Souza, Glaucia Mendes

    2014-01-01

    Sugarcane is a major crop used for food and bioenergy production. Modern cultivars are hybrids derived from crosses between Saccharum officinarum and Saccharum spontaneum. Hybrid cultivars combine favorable characteristics from ancestral species and contain a genome that is highly polyploid and aneuploid, containing 100–130 chromosomes. These complex genomes represent a huge challenge for molecular studies and for the development of biotechnological tools that can facilitate sugarcane improvement. Here, we describe full-length enriched cDNA libraries for Saccharum officinarum, Saccharum spontaneum, and one hybrid genotype (SP803280) and analyze the set of open reading frames (ORFs) in their genomes (i.e., their ORFeomes). We found 38,195 (19%) sugarcane-specific transcripts that did not match transcripts from other databases. Less than 1.6% of all transcripts were ancestor-specific (i.e., not expressed in SP803280). We also found 78,008 putative new sugarcane transcripts that were absent in the largest sugarcane expressed sequence tag database (SUCEST). Functional annotation showed a high frequency of protein kinases and stress-related proteins. We also detected natural antisense transcript expression, which mapped to 94% of all plant KEGG pathways; however, each genotype showed different pathways enriched in antisense transcripts. Our data appeared to cover 53.2% (17,563 genes) and 46.8% (937 transcription factors) of all sugarcane full-length genes and transcription factors, respectively. This work represents a significant advancement in defining the sugarcane ORFeome and will be useful for protein characterization, single nucleotide polymorphism and splicing variant identification, evolutionary and comparative studies, and sugarcane genome assembly and annotation. PMID:25222706

  11. Inexpensive Multiplexed Library Preparation for Megabase-Sized Genomes

    PubMed Central

    Desai, Michael M.; Kishony, Roy

    2015-01-01

    Whole-genome sequencing has become an indispensable tool of modern biology. However, the cost of sample preparation relative to the cost of sequencing remains high, especially for small genomes where the former is dominant. Here we present a protocol for rapid and inexpensive preparation of hundreds of multiplexed genomic libraries for Illumina sequencing. By carrying out the Nextera tagmentation reaction in small volumes, replacing costly reagents with cheaper equivalents, and omitting unnecessary steps, we achieve a cost of library preparation of $8 per sample, approximately 6 times cheaper than the standard Nextera XT protocol. Furthermore, our procedure takes less than 5 hours for 96 samples. Several hundred samples can then be pooled on the same HiSeq lane via custom barcodes. Our method will be useful for re-sequencing of microbial or viral genomes, including those from evolution experiments, genetic screens, and environmental samples, as well as for other sequencing applications including large amplicon, open chromosome, artificial chromosomes, and RNA sequencing. PMID:26000737

  12. Selective enrichment of environmental DNA libraries for genes encoding nonribosomal peptides and polyketides by phosphopantetheine transferase-dependent complementation of siderophore biosynthesis

    PubMed Central

    Charlop-Powers, Zachary; Banik, Jacob J.; Owen, Jeremy G.; Craig, Jeffrey W.; Brady, Sean F.

    2012-01-01

    The cloning of DNA directly from environmental samples provides a means to functionally access biosynthetic gene clusters present in the genomes of the large fraction of bacteria that remains recalcitrant to growth in the laboratory. Herein we demonstrate a method by which complementation of phosphopantetheine transferase deletion mutants can be used to restore siderophore biosynthesis and to therefore selectively enrich eDNA libraries for nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) gene sequences to unprecedented levels. The common use of NRPS/PKS-derived siderophores across bacterial taxa makes this method generalizable and should allow for the facile selective enrichment of NRPS/PKS-containing biosynthetic gene clusters from large environmental DNA libraries using a wide variety of phylogenetically diverse bacterial hosts. PMID:23072412

  13. Chromosome region-specific libraries for human genome analysis

    SciTech Connect

    Kao, Fa-Ten.

    1991-01-01

    We have made important progress since the beginning of the current grant year. We have further developed the microdissection and PCR-assisted microcloning techniques using the linker-adaptor method. We have critically evaluated the microdissection libraries constructed by this microtechnology and proved that they are of high quality. We further demonstrated that these microdissection clones are useful in identifying corresponding YAC clones for a thousand-fold expansion of the genomic coverage and for contig construction. We are also improving the technique of cloning the dissected fragments in a test tube by the TDT method. We are applying both of these PCR cloning techniques to human chromosomes 2 and 5 to construct region-specific libraries for physical mapping purposes of LLNL and LANL. Finally, we are exploring efficient procedures to use unique sequence microclones to isolate cDNA clones from defined chromosomal regions as valuable resources for identifying expressed gene sequences in the human genome. We believe that we are making important progress under the auspices of this DOE human genome program grant and we will continue to make significant contributions in the coming year. 4 refs., 4 figs.

  14. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    SciTech Connect

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  15. Library construction for ancient genomics: single strand or double strand?

    PubMed

    Bennett, E Andrew; Massilani, Diyendo; Lizzo, Giulia; Daligault, Julien; Geigl, Eva-Maria; Grange, Thierry

    2014-06-01

    A novel method of library construction that takes advantage of a single-stranded DNA ligase has been recently described and used to generate high-resolution genomes from ancient DNA samples. While this method is effective and appears to recover a greater fraction of endogenous ancient material, there has been no direct comparison of results from different library construction methods on a diversity of ancient DNA samples. In addition, the single-stranded method is limited by high cost and lengthy preparation time and is restricted to the Illumina sequencing platform. Here we present in-depth comparisons of the different available library construction methods for DNA purified from 16 ancient and modern faunal and human remains, covering a range of different taphonomic and climatic conditions. We further present a DNA purification method for ancient samples that permits the concentration of a large volume of dissolved extract with minimal manipulation and methodological improvements to the single-stranded method to render it more economical and versatile, in particular to expand its use to both the Illumina and the Ion Torrent sequencing platforms. We show that the single-stranded library construction method improves the relative recovery of endogenous to exogenous DNA for most, but not all, of our ancient extracts. PMID:24924389

  16. GenomeTools: a comprehensive software library for efficient processing of structured genome annotations.

    PubMed

    Gremme, Gordon; Steinbiss, Sascha; Kurtz, Stefan

    2013-01-01

    Genome annotations are often published as plain text files describing genomic features and their subcomponents by an implicit annotation graph. In this paper, we present the GenomeTools, a convenient and efficient software library and associated software tools for developing bioinformatics software intended to create, process or convert annotation graphs. The GenomeTools strictly follow the annotation graph approach, offering a unified graph-based representation. This gives the developer intuitive and immediate access to genomic features and tools for their manipulation. To process large annotation sets with low memory overhead, we have designed and implemented an efficient pull-based approach for sequential processing of annotations. This makes it possible to handle even the largest annotation sets, such as a complete catalogue of human variations. Our object-oriented C-based software library enables a developer to conveniently implement their own functionality on annotation graphs and to integrate it into larger workflows, simultaneously accessing compressed sequence data if required. The careful C implementation of the GenomeTools not only ensures a light-weight memory footprint while allowing full sequential as well as random access to the annotation graph, but also facilitates the creation of bindings to a variety of script programming languages (like Python and Ruby) sharing the same interface. PMID:24091398
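
    The pull-based, sequential style of processing described in this abstract can be illustrated with a small sketch. This is not the GenomeTools API: the names Feature, pull_features and count_types are hypothetical, the input is a tiny made-up GFF3-like sample, and a Python generator stands in for the library's stream. The point is only that the consumer pulls one record at a time, so memory use does not grow with file size.

```python
import io
from collections import Counter
from typing import Iterable, Iterator, NamedTuple, TextIO


class Feature(NamedTuple):
    """Minimal feature record (hypothetical; not a GenomeTools type)."""
    seqid: str
    type: str
    start: int
    end: int


def pull_features(stream: TextIO) -> Iterator[Feature]:
    """Yield one feature per data line; nothing is buffered beyond the current line."""
    for line in stream:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # ignore malformed rows in this sketch
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]))


def count_types(features: Iterable[Feature]) -> Counter:
    """Consume the stream lazily and tally feature types."""
    return Counter(f.type for f in features)


# Made-up sample data, built with explicit tab separators.
rows = [
    ("chr1", "demo", "gene", "100", "900", ".", "+", ".", "ID=g1"),
    ("chr1", "demo", "exon", "100", "300", ".", "+", ".", "Parent=g1"),
    ("chr1", "demo", "exon", "600", "900", ".", "+", ".", "Parent=g1"),
]
SAMPLE = "##gff-version 3\n" + "\n".join("\t".join(r) for r in rows) + "\n"

counts = count_types(pull_features(io.StringIO(SAMPLE)))
print(dict(counts))
```

    Replacing io.StringIO with an open file handle would process a large annotation file the same way, one line at a time.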

  17. The infectious BAC genomic DNA expression library: a high capacity vector system for functional genomics.

    PubMed

    Lufino, Michele M P; Edser, Pauline A H; Quail, Michael A; Rice, Stephen; Adams, David J; Wade-Martins, Richard

    2016-01-01

    Gene dosage plays a critical role in a range of cellular phenotypes, yet most cellular expression systems use heterologous cDNA-based vectors which express proteins well above physiological levels. In contrast, genomic DNA expression vectors generate physiologically-relevant levels of gene expression by carrying the whole genomic DNA locus of a gene including its regulatory elements. Here we describe the first genomic DNA expression library generated using the high-capacity herpes simplex virus-1 amplicon technology to deliver bacterial artificial chromosomes (BACs) into cells by viral transduction. The infectious BAC (iBAC) library contains 184,320 clones with an average insert size of 134.5 kb. We show in a Chinese hamster ovary (CHO) disease model cell line and mouse embryonic stem (ES) cells that this library can be used for genetic rescue studies in a range of contexts including the physiological restoration of Ldlr deficiency, and viral receptor expression. The iBAC library represents an important new genetic analysis tool openly available to the research community. PMID:27353647

  18. The infectious BAC genomic DNA expression library: a high capacity vector system for functional genomics

    PubMed Central

    Lufino, Michele M. P.; Edser, Pauline A. H.; Quail, Michael A.; Rice, Stephen; Adams, David J.; Wade-Martins, Richard

    2016-01-01

    Gene dosage plays a critical role in a range of cellular phenotypes, yet most cellular expression systems use heterologous cDNA-based vectors which express proteins well above physiological levels. In contrast, genomic DNA expression vectors generate physiologically-relevant levels of gene expression by carrying the whole genomic DNA locus of a gene including its regulatory elements. Here we describe the first genomic DNA expression library generated using the high-capacity herpes simplex virus-1 amplicon technology to deliver bacterial artificial chromosomes (BACs) into cells by viral transduction. The infectious BAC (iBAC) library contains 184,320 clones with an average insert size of 134.5 kb. We show in a Chinese hamster ovary (CHO) disease model cell line and mouse embryonic stem (ES) cells that this library can be used for genetic rescue studies in a range of contexts including the physiological restoration of Ldlr deficiency, and viral receptor expression. The iBAC library represents an important new genetic analysis tool openly available to the research community. PMID:27353647

  19. Optimizing restriction fragment fingerprinting methods for ordering large genomic libraries

    SciTech Connect

    Branscomb, E.; Slezak, T.; Pae, R.; Carrano, A.V.; Galas, D.; Waterman, M.

    1990-01-01

    The authors present a statistical analysis of the problem of ordering large genomic cloned libraries through overlap detection based on restriction fingerprinting. Such ordering projects involve a large investment of effort involving many repetitious experiments. Their primary purpose here is to provide methods of maximizing the efficiency of such efforts. To this end, they adopt a statistical approach that uses the likelihood ratio as a statistic to detect overlap. The main advantages of this approach are that (1) it allows the relatively straightforward incorporation of the observed statistical properties of the data; (2) it permits the efficiency of a particular experimental method for detecting overlap to be quantitatively defined so that alternative experimental designs may be compared and optimized; and (3) it yields a direct estimate of the probability that any two library members overlap. This estimate is a critical tool for the accurate, automatic assembly of overlapping sets of fragments into islands called "contigs." These contigs must subsequently be connected by other methods to provide an ordered set of overlapping fragments covering the entire genome.

  20. Selection of chromosome 22-specific clones from human genomic BAC library using a chromosome-specific cosmid library pool

    SciTech Connect

    Kim, U.J.; Shizuya, H.; Birren, B.

    1994-07-15

    A new approach to rapidly identify chromosome-specific subsets of clones from a total human genomic library is described. The authors report here the results of screening a human bacterial artificial chromosome (BAC) library using the total pool of clones from a chromosome 22-specific cosmid library as a composite probe. The human BAC library was gridded on filters at high density and hybridized with DNA from the pooled chromosome 22-specific Lawrist library under suppressive conditions. In a single hybridization, they picked 280 candidates from the BAC library representing over 30,000 clones (or 1.2× coverage of the human genome). This subset contained more than 60% of the chromosome 22-specific BAC clones that were previously found to be present in the original BAC library. In principle, this approach can be applied to select a subset of clones from other global libraries with relatively large inserts using a pool from a regional library as a composite probe. It is important to note that the target and probe libraries must be based on vectors that share no homology with each other. 8 refs., 2 figs., 2 tabs.

  1. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    PubMed

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. PMID:27237775

  2. LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor

    PubMed Central

    Sheffield, Nathan C.; Bock, Christoph

    2016-01-01

    Summary: Genomic datasets are often interpreted in the context of large-scale reference databases. One approach is to identify significantly overlapping gene sets, which works well for gene-centric data. However, many types of high-throughput data are based on genomic regions. Locus Overlap Analysis (LOLA) provides easy and automatable enrichment analysis for genomic region sets, thus facilitating the interpretation of functional genomics and epigenomics data. Availability and Implementation: R package available in Bioconductor and on the following website: http://lola.computational-epigenetics.org. Contact: nsheffield@cemm.oeaw.ac.at or cbock@cemm.oeaw.ac.at PMID:26508757
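
    To make the idea of enrichment analysis on region sets concrete, here is a minimal sketch. The abstract does not describe the statistic used, so the choice of a one-sided Fisher's exact test on a 2x2 overlap table is an assumption, as are the helper names (overlaps, n_hits, fisher_greater) and the toy coordinates. It is not the LOLA interface, which is an R package.

```python
from math import comb
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open interval


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def n_hits(regions: List[Region], reference: List[Region]) -> int:
    """Number of regions that overlap any reference region (naive O(n*m) scan)."""
    return sum(any(overlaps(r, ref) for ref in reference) for r in regions)


def fisher_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher p-value: P(X >= a) for the table [[a, b], [c, d]]."""
    total, with_hit, in_query = a + b + c + d, a + c, a + b
    upper = min(in_query, with_hit)
    tail = sum(
        comb(with_hit, k) * comb(total - with_hit, in_query - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, in_query)


# Toy data (made up): ten universe regions, the first four form the query set.
universe = [("chr1", i * 100, i * 100 + 50) for i in range(10)]
query = universe[:4]
background = universe[4:]
reference_set = [("chr1", 0, 350)]  # one reference region spanning the query

a = n_hits(query, reference_set)       # query regions that overlap
b = len(query) - a                     # query regions that do not
c = n_hits(background, reference_set)  # background regions that overlap
d = len(background) - c                # background regions that do not

p_value = fisher_greater(a, b, c, d)
print(a, b, c, d, p_value)
```

    In this toy case every query region and no background region overlaps the reference set, so the table is [[4, 0], [0, 6]] and the p-value is 1/210.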

  3. Construction and characterization of a bovine BAC library with four genome-equivalent coverage.

    PubMed

    Eggen, A; Gautier, M; Billaut, A; Petit, E; Hayes, H; Laurent, P; Urban, C; Pfister-Genskow, M; Eilertsen, K; Bishop, M D

    2001-01-01

    A bovine bacterial artificial chromosome (BAC) library of 105 984 clones has been constructed in the vector pBeloBAC11 and organized in 3-dimension pools and high density membranes for screening by PCR and hybridization. The average insert size, determined after analysis of 388 clones, was estimated at 120 kb corresponding to a four genome coverage. Given the fact that a male was used to construct the library, the probability of finding any given autosomal and X or Y locus is respectively 0.98 and 0.86. The library was screened for 164 microsatellite markers and an average of 3.9 superpools was positive for each PCR system. None of the 50 or so BAC clones analysed by FISH was chimeric. This BAC library increases the international genome coverage for cattle to around 28 genome equivalents and extends the coverage of the ruminant genomes available at the Inra resource center to 15 genome equivalents. PMID:11712974
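
    The coverage and probability figures in this abstract can be checked with a short calculation. The abstract does not say how the probabilities were derived; the sketch below assumes the usual Poisson approximation, P = 1 - exp(-c) for c genome equivalents, and assumes a haploid bovine genome of roughly 3 Gb (a value not given in the abstract). Function and constant names are illustrative only.

```python
import math


def genome_equivalents(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Library coverage expressed in haploid genome equivalents."""
    return n_clones * insert_kb / genome_kb


def p_locus_present(coverage: float) -> float:
    """Poisson approximation: chance that at least one clone contains a given locus."""
    return 1.0 - math.exp(-coverage)


# Clone count and insert size come from the abstract.
N_CLONES = 105_984
INSERT_KB = 120.0
# Assumption: haploid genome size of about 3 Gb (3.0e6 kb).
ASSUMED_GENOME_KB = 3.0e6

coverage = genome_equivalents(N_CLONES, INSERT_KB, ASSUMED_GENOME_KB)

# Autosomal loci: the stated four genome equivalents.
p_autosomal = p_locus_present(4.0)
# X and Y loci: one copy each in a male donor, so half the autosomal coverage.
p_sex_linked = p_locus_present(2.0)

print(f"coverage = {coverage:.2f} genome equivalents")
print(f"P(autosomal locus) = {p_autosomal:.2f}")
print(f"P(X or Y locus)    = {p_sex_linked:.2f}")
```

    Under these assumptions the computed coverage is a little above four genome equivalents, and the two probabilities round to the 0.98 and 0.86 quoted in the abstract.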

  4. Targeted enrichment of genomic DNA regions for next-generation sequencing

    PubMed Central

    ElSharawy, Abdou; Sauer, Sascha; van Helvoort, Joop M.L.M.; van der Zaag, P.J.; Franke, Andre; Nilsson, Mats; Lehrach, Hans; Brookes, Anthony J.

    2011-01-01

    In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings. PMID:22121152

  5. In-Solution Hybridization for the Targeted Enrichment of the Whole Mitochondrial Genome.

    PubMed

    Bekaert, B; Ellerington, R; Van den Abbeele, L; Decorte, R

    2016-01-01

    A detailed protocol is presented for the targeted enrichment of whole mitochondrial genomes based on an in-solution hybridization strategy. Bait is produced in-house by sonication of two long-range PCR amplicons and ligation of biotinylated double-stranded adapters. Indexed target DNA is hybridized with the bait in a multiplex enrichment reaction and pulled down using magnetic streptavidin beads followed by subsequent post-enrichment PCR and sequencing on an Illumina MiSeq. This strategy removes the need for expensive commercial bait probes while allowing enrichment of multiple samples in a single hybridization reaction. The method is particularly suitable for degraded DNA as it is able to enrich short DNA fragments and is not susceptible to polymerase artifacts introduced during PCR-based assays. PMID:27259740

  6. Gap Closing/Finishing by Targeted Genomic Region Enrichment and Sequencing

    SciTech Connect

    Singh, Kanwar; Froula, Jeff; Trice, Hope; Pennacchio, Len A.; Chen, Feng

    2010-05-27

    Gap Closing/Finishing of draft genome assemblies is a labor and cost intensive process where several rounds of repetitious amplification and sequencing are required. Here we demonstrate a high throughput procedure where custom primers flanking gaps in draft genomes are designed. Primer libraries containing up to 4,000 unique pairs in independent droplets are merged with a fragmented genomic template. From this millions of picoliter scale droplets are formed, each one being the functional equivalent of an individual PCR reaction. The PCR products are concatenated and sequenced by Illumina which is then assembled and used for gap closure. Here we present an overall experimental strategy, primer design algorithm and initial results.

  7. Construction of bacterial artificial chromosome libraries from the parasitic nematode Brugia malayi and physical mapping of the genome of its Wolbachia endosymbiont.

    PubMed

    Foster, Jeremy M; Kumar, Sanjay; Ganatra, Mehul B; Kamal, Ibrahim H; Ware, Jennifer; Ingram, Jessica; Pope-Chappell, Jesse; Guiliano, David; Whitton, Claire; Daub, Jennifer; Blaxter, Mark L; Slatko, Barton E

    2004-05-01

    The parasitic nematode, Brugia malayi, causes lymphatic filariasis in humans, which in severe cases leads to the condition known as elephantiasis. The parasite contains an endosymbiotic alpha-proteobacterium of the genus Wolbachia that is required for normal worm development and fecundity and is also implicated in the pathology associated with infections by these filarial nematodes. Bacterial artificial chromosome libraries were constructed from B. malayi DNA and provide over 11-fold coverage of the nematode genome. Wolbachia genomic fragments were simultaneously cloned into the libraries giving over 5-fold coverage of the 1.1 Mb bacterial genome. A physical framework for the Wolbachia genome was developed by construction of a plasmid library enriched for Wolbachia DNA as a source of sequences to hybridise to high-density bacterial artificial chromosome colony filters. Bacterial artificial chromosome end sequencing provided additional Wolbachia probe sequences to facilitate assembly of a contig that spanned the entire genome. The Wolbachia sequences provided a marker approximately every 10 kb. Four rare-cutting restriction endonucleases were used to restriction map the genome to a resolution of approximately 60 kb and demonstrate concordance between the bacterial artificial chromosome clones and native Wolbachia genomic DNA. Comparison of Wolbachia sequences to public databases using BLAST algorithms under stringent conditions allowed confident prediction of 69 Wolbachia peptide functions and two rRNA genes. Comparison to closely related complete genomes revealed that while most sequences had orthologs in the genome of the Wolbachia endosymbiont from Drosophila melanogaster, there was no evidence for long-range synteny. Rather, there were a few cases of short-range conservation of gene order extending over regions of less than 10 kb. The molecular scaffold produced for the genome of the Wolbachia from B. malayi forms the basis of a genomic sequencing effort for

  8. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105

  9. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2%, and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to various kinds of metabolism, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed; it codes for the TAT-Ferritin fusion protein with two 6× His-tags, one at the N terminus and one at the C terminus. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for functional genome and transcriptome research on Bengal tigers. PMID:23708105

  10. Human genome libraries. Final progress report, February 1, 1994--August 31, 1997

    SciTech Connect

    Kao, Fa-Ten

    1998-01-01

    The goal of this program is to use a novel technology of chromosome microdissection and microcloning to construct chromosome region-specific libraries as resources for various human genome program studies. Region-specific libraries have been constructed for the entire human chromosomes 2 and 18.

  11. Democratizing Human Genome Project Information: A Model Program for Education, Information and Debate in Public Libraries.

    ERIC Educational Resources Information Center

    Pollack, Miriam

    The "Mapping the Human Genome" project demonstrated that librarians can help whomever they serve in accessing information resources in the areas of biological and health information, whether it is the scientists who are developing the information or a member of the public who is using the information. Public libraries can guide library users…

  12. Expression of heterologous sigma factors enables functional screening of metagenomic and heterologous genomic libraries

    PubMed Central

    Gaida, Stefan M.; Sandoval, Nicholas R.; Nicolaou, Sergios A.; Chen, Yili; Venkataramanan, Keerthi P.; Papoutsakis, Eleftherios T.

    2015-01-01

    A key limitation in using heterologous genomic or metagenomic libraries in functional genomics and genome engineering is the low expression of heterologous genes in screening hosts, such as Escherichia coli. To overcome this limitation, here we generate E. coli strains capable of recognizing heterologous promoters by expressing heterologous sigma factors. Among seven sigma factors tested, RpoD from Lactobacillus plantarum (Lpl) appears to be capable of initiating transcription from all sources of DNA. Using the promoter GFP-trap concept, we successfully screen several heterologous and metagenomic DNA libraries, thus enlarging the genomic space that can be functionally sampled in E. coli. For an application, we show that screening fosmid-based Lpl genomic libraries in an E. coli strain with a chromosomally integrated Lpl rpoD enables the identification of Lpl genetic determinants imparting strong ethanol tolerance in E. coli. Transcriptome analysis confirms increased expression of heterologous genes in the engineered strain. PMID:25944046

  13. Application of targeted enrichment to next-generation sequencing of retroviruses integrated into the host human genome.

    PubMed

    Miyazato, Paola; Katsuya, Hiroo; Fukuda, Asami; Uchiyama, Yoshikazu; Matsuo, Misaki; Tokunaga, Michiyo; Hino, Shinjiro; Nakao, Mitsuyoshi; Satou, Yorifumi

    2016-01-01

    The recent development and advancement of next-generation sequencing (NGS) technologies have enabled the characterization of the human genome at extremely high resolution. In the retrovirology field, NGS technologies have been applied to integration-site analysis and deep sequencing of viral genomes in combination with PCR amplification using virus-specific primers. However, virus-specific primers are not available for some epigenetic analyses, like chromatin immunoprecipitation sequencing (ChIP-seq) assays. Viral sequences are poorly detected without specific PCR amplification because proviral DNA is very scarce compared to human genomic DNA. Here, we have developed and evaluated the use of biotinylated DNA probes for the capture of viral genetic fragments from a library prepared for NGS. Our results demonstrated that viral sequence detection was hundreds or thousands of times more sensitive after enrichment, enabling us to reduce the economic burden that arises when attempting to analyze the epigenetic landscape of proviruses by NGS. In addition, the method is versatile enough to analyze proviruses that have mismatches compared to the DNA probes. Taken together, we propose that this approach is a powerful tool to clarify the mechanisms of transcriptional and epigenetic regulation of retroviral proviruses that have, until now, remained elusive. PMID:27321866

  14. Application of targeted enrichment to next-generation sequencing of retroviruses integrated into the host human genome

    PubMed Central

    Miyazato, Paola; Katsuya, Hiroo; Fukuda, Asami; Uchiyama, Yoshikazu; Matsuo, Misaki; Tokunaga, Michiyo; Hino, Shinjiro; Nakao, Mitsuyoshi; Satou, Yorifumi

    2016-01-01

    The recent development and advancement of next-generation sequencing (NGS) technologies have enabled the characterization of the human genome at extremely high resolution. In the retrovirology field, NGS technologies have been applied to integration-site analysis and deep sequencing of viral genomes in combination with PCR amplification using virus-specific primers. However, virus-specific primers are not available for some epigenetic analyses, like chromatin immunoprecipitation sequencing (ChIP-seq) assays. Viral sequences are poorly detected without specific PCR amplification because proviral DNA is very scarce compared to human genomic DNA. Here, we have developed and evaluated the use of biotinylated DNA probes for the capture of viral genetic fragments from a library prepared for NGS. Our results demonstrated that viral sequence detection was hundreds or thousands of times more sensitive after enrichment, enabling us to reduce the economic burden that arises when attempting to analyze the epigenetic landscape of proviruses by NGS. In addition, the method is versatile enough to analyze proviruses that have mismatches compared to the DNA probes. Taken together, we propose that this approach is a powerful tool to clarify the mechanisms of transcriptional and epigenetic regulation of retroviral proviruses that have, until now, remained elusive. PMID:27321866

  15. Genomic Library Screens for Genes Involved in n-Butanol Tolerance in Escherichia coli

    PubMed Central

    Reyes, Luis H.; Almario, Maria P.; Kao, Katy C.

    2011-01-01

    Background: n-Butanol is a promising emerging biofuel, and recent metabolic engineering efforts have demonstrated the use of several microbial hosts for its production. However, most organisms have very low tolerance to n-butanol (up to 2% (v/v)), limiting the economic viability of this biofuel. The rational engineering of more robust n-butanol production hosts relies upon understanding the mechanisms involved in tolerance. However, the existing knowledge of genes involved in n-butanol tolerance is limited. The goal of this study is therefore to identify E. coli genes that are involved in n-butanol tolerance. Methodology/Principal Findings: Using a genomic library enrichment strategy, we identified approximately 270 genes that were enriched or depleted in n-butanol challenge. The effects of these candidate genes on n-butanol tolerance were experimentally determined using overexpression or deletion libraries. Among the 55 enriched genes tested, 11 were experimentally shown to confer enhanced tolerance to n-butanol when overexpressed compared to the wild-type. Among the 84 depleted genes tested, three conferred increased n-butanol resistance when deleted. The overexpressed genes that conferred the largest increase in n-butanol tolerance were related to iron transport and metabolism, entC and feoA, which increased the n-butanol tolerance by 32.8±4.0% and 49.1±3.3%, respectively. The deleted gene that resulted in the largest increase in resistance to n-butanol was astE, which enhanced n-butanol tolerance by 48.7±6.3%. Conclusions/Significance: We identified and experimentally verified 14 genes that decreased the inhibitory effect of n-butanol on E. coli. From the data, we were able to expand the current knowledge on the genes involved in n-butanol tolerance; the results suggest that increased iron transport and metabolism and decreased acid resistance may enhance n-butanol tolerance. The genes and mechanisms identified in this study will be helpful in the

  16. Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for Illumina genome sequencing

    SciTech Connect

    Angelova, Angelina; Park, Sang-Hycuk; Kyndt, John; Fitzsimmons, Kevin; Brown, Judith K

    2013-09-01

    With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint to chloroplast transformation and for studying gene expression in TAG pathways. In this study, the intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (approximately 84.6 kb) in size, with a GC content of 30.8%. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.

  17. Comparison of surrogate reporter systems for enrichment of cells with mutations induced by genome editors.

    PubMed

    He, Zuyong; Shi, Xuan; Liu, Meirui; Sun, Guangjie; Proudfoot, Chris; Whitelaw, C Bruce A; Lillico, Simon G; Chen, Yaosheng

    2016-03-10

    Genome editors are powerful tools that allow modification of the nuclear DNA in eukaryotic cells both in vitro and in vivo. In vitro modified cells are often phenotypically indistinguishable from unmodified cells, hampering their isolation for analysis. Episomal reporters encoding fluorescent proteins can be used for enrichment of modified cells by flow cytometry. Here we compare two surrogate reporters, RGS and SSA, for the enrichment of porcine embryonic fibroblasts containing mutations induced by ZFNs or CRISPR/Cas9. Both systems were effective for enrichment of edited porcine cells with the RGS reporter proving more effective than the SSA reporter. We noted a higher-fold enrichment when editing events were induced by Cas9 compared to those induced by ZFNs, allowing selection at frequencies as high as 70%. PMID:26778541

  18. Construction and characterization of a Lipotes vexillifer genomic DNA BAC library.

    PubMed

    Du, Bo; Zhang, Xian-Feng; Fang, Sheng-Guo; Wang, Ding

    2007-04-01

    We constructed a genomic DNA library for Lipotes vexillifer (L. vexillifer), the Baiji or Yangtze River dolphin, one of the most endangered mammals in the world. The library consists of 149,000 BAC clones, with an average insert size of 83 kb, representing approximately 3.4 haploid genome equivalents. PCR amplification of four known L. vexillifer genes yielded two to four positive clones each. To demonstrate the utility of this library, we isolated and sequenced the L. vexillifer alpha lactalbumin gene, which is specific to mammals and has been widely used as a molecular tool in phylogenetic analysis. We also end-sequenced 20 randomly selected clones, resulting in the identification of at least five new L. vexillifer genes, five SSR loci, and one SINE locus. These results suggest that this library is a valuable resource for candidate gene cloning, physical mapping, and genome sequencing of this important and threatened species. PMID:17867838
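
    The relationship between clone count, insert size, and genome equivalents reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' calculation: the genome size is inferred from the three reported figures (it is not stated in the abstract), the function names are illustrative, and the probability estimate uses the standard Clarke-Carbon formula on the assumption of random, unbiased cloning.

```python
def library_coverage(n_clones, insert_kb, genome_kb):
    """Genome equivalents represented by a clone library."""
    return n_clones * insert_kb / genome_kb


def prob_locus_present(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon estimate of the chance that a given single-copy
    locus is carried by at least one clone (assumes random cloning)."""
    fraction_per_clone = insert_kb / genome_kb
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


# Figures reported in the abstract.
n_clones = 149_000
insert_kb = 83
coverage_reported = 3.4

# Haploid genome size implied by those figures (an inference, not a
# value given in the abstract).
genome_kb = n_clones * insert_kb / coverage_reported

coverage = library_coverage(n_clones, insert_kb, genome_kb)
p_present = prob_locus_present(n_clones, insert_kb, genome_kb)

print(f"implied genome size: {genome_kb / 1e6:.2f} Gb")
print(f"coverage: {coverage:.1f} genome equivalents")
print(f"P(single-copy locus in library): {p_present:.3f}")
```

    Under these assumptions a single-copy gene is expected in about 3.4 clones on average, which is in line with the two to four positive clones per gene reported by PCR screening.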

  19. Robotic Enrichment Processing of Roche 454 Titanium Emulsion PCR at the DOE Joint Genome Institute

    SciTech Connect

    Hamilton, Matthew; Wilson, Steven; Bauer, Diane; Miller, Don; Duffy-Wei, Kecia; Hammon, Nancy; Lucas, Susan; Pollard, Martin; Cheng, Jan-Fang

    2010-05-28

    Enrichment of emulsion PCR product is the most laborious and pipette-intensive step in the 454 Titanium process, posing the biggest obstacle for production-oriented scale up. The Joint Genome Institute has developed a pair of custom-made robots based on the Microlab Star liquid handling deck manufactured by Hamilton to mediate the complexity and ergonomic demands of the 454 enrichment process. The robot includes a custom built centrifuge, magnetic deck positions, as well as heating and cooling elements. At present processing eight emulsion cup samples in a single 2.5 hour run, these robots are capable of processing up to 24 emulsion cup samples. Sample emulsions are broken using the standard 454 breaking process and transferred from a pair of 50ml conical tubes to a single 2ml tube and loaded on the robot. The robot performs the enrichment protocol and produces beads in 2ml tubes ready for counting. The robot follows the Roche 454 enrichment protocol with slight exceptions to the manner in which it resuspends beads via pipette mixing rather than vortexing and a set number of null bead removal washes. The robotic process is broken down in similar discrete steps: First Melt and Neutralization, Enrichment Primer Annealing, Enrichment Bead Incubation, Null Bead Removal, Second Melt and Neutralization and Sequencing Primer Annealing. Data indicating our improvements in enrichment efficiency and total number of bases per run will also be shown.

  20. Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing.

    PubMed

    Teer, Jamie K; Bonnycastle, Lori L; Chines, Peter S; Hansen, Nancy F; Aoyama, Natsuyo; Swift, Amy J; Abaan, Hatice Ozel; Albert, Thomas J; Margulies, Elliott H; Green, Eric D; Collins, Francis S; Mullikin, James C; Biesecker, Leslie G

    2010-10-01

    Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data. PMID:20810667

  1. High Capsid–Genome Correlation Facilitates Creation of AAV Libraries for Directed Evolution

    PubMed Central

    Nonnenmacher, Mathieu; van Bakel, Harm; Hajjar, Roger J; Weber, Thomas

    2015-01-01

    Directed evolution of adeno-associated virus (AAV) through successive rounds of phenotypic selection is a powerful method to isolate variants with improved properties from large libraries of capsid mutants. Importantly, AAV libraries used for directed evolution are based on the “natural” AAV genome organization where the capsid proteins are encoded in cis from replicating genomes. This is necessary to allow the recovery of the capsid DNA after each step of phenotypic selection. For directed evolution to be used successfully, it is essential to minimize the random mixing of capsomers and the encapsidation of nonmatching viral genomes during the production of the viral libraries. Here, we demonstrate that multiple AAV capsid variants expressed from Rep/Cap containing viral genomes result in near-homogeneous capsids that display an unexpectedly high capsid–DNA correlation. Next-generation sequencing of AAV progeny generated by bulk transfection of a semi-random peptide library showed a strong counter-selection of capsid variants encoding premature stop codons, which further supports a strong capsid–genome identity correlation. Overall, our observations demonstrate that production of “natural” AAVs results in low capsid mosaicism and high capsid–genome correlation. These unique properties allow the production of highly diverse AAV libraries in a one-step procedure with a minimal loss in phenotype–genotype correlation. PMID:25586687

  2. Genome Enablement of the Notothenioidei: Genome Size Estimates from 11 Species and BAC Libraries from 2 Representative Taxa

    PubMed Central

    DETRICH, H. WILLIAM; STUART, ANDREW; SCHOENBORN, MICHAEL; PARKER, SANDRA K.; METHÉ, BARBARA A.; AMEMIYA, CHRIS T.

    2013-01-01

    The perciform suborder Notothenioidei provides a compelling opportunity to study the adaptive radiation of a marine species flock in the cold Southern Ocean surrounding Antarctica. To enable genome-level studies of these psychrophilic fishes, we estimated the sizes of the genomes of 11 Antarctic species and generated high-quality BAC libraries for 2, the notothen Notothenia coriiceps and the icefish Chaenocephalus aceratus. Our results indicate that evolution of phylogenetically derived notothenioid families [e.g., the icefishes (Channichthyidae)] was accompanied by genome expansion. Species (n = 6) of the basal family Nototheniidae had C values that ranged between 0.98 and 1.20 pg, whereas those of the icefishes, the notothenioid crown group, were 1.66–1.83 pg (n = 4 species). The BAC libraries VMRC-19 (N. coriiceps) and VMRC-21 (C. aceratus) comprised 12X and 10X coverage of the respective genomes and had average insert sizes of 138 and 168 kb. Greater than 60% of paired BAC ends sampled from each library (~0.1% of each genome) contained repetitive sequences, and the repetitive element landscapes of the 2 genomes (13.4% of the N. coriiceps genome and 14.5% for C. aceratus) were similar. The representation and depth of coverage of the libraries were verified by identification of multiple Hox gene contigs: six discrete Hox clusters were found in N. coriiceps and at least five Hox clusters were found in C. aceratus. Given the unusual anatomical and physiological adaptations of the notothenioids, the availability of these BAC libraries sets the stage for expanded analysis of the psychrophilic mode of life. PMID:20235119
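
    To make the C values above easier to compare with the library statistics, they can be converted to base pairs. The sketch below assumes the widely used conversion of roughly 978 Mb per picogram of DNA; the abstract does not give the individual C value for N. coriiceps, so the clone counts are bracketed using the two ends of the reported Nototheniidae range. Function and variable names are illustrative only.

```python
import math

MB_PER_PG = 978.0  # widely used conversion: 1 pg of DNA is about 978 Mb


def c_value_to_mb(c_value_pg):
    """Convert a haploid C value in picograms to megabases."""
    return c_value_pg * MB_PER_PG


def clones_needed(genome_mb, insert_kb, fold_coverage):
    """Number of clones required to reach a given fold coverage."""
    return math.ceil(fold_coverage * genome_mb * 1000.0 / insert_kb)


# C-value ranges reported in the abstract.
nototheniidae_mb = [c_value_to_mb(c) for c in (0.98, 1.20)]
icefish_mb = [c_value_to_mb(c) for c in (1.66, 1.83)]

# Bracket the clone count for a 12X library with 138 kb inserts,
# using the ends of the Nototheniidae range (the species-specific
# C value is not stated, so this is only a bound).
clones_low = clones_needed(nototheniidae_mb[0], insert_kb=138, fold_coverage=12)
clones_high = clones_needed(nototheniidae_mb[1], insert_kb=138, fold_coverage=12)

print(f"Nototheniidae: {nototheniidae_mb[0]:.0f}-{nototheniidae_mb[1]:.0f} Mb")
print(f"Icefishes:     {icefish_mb[0]:.0f}-{icefish_mb[1]:.0f} Mb")
print(f"Clones for 12X at 138 kb: {clones_low}-{clones_high}")
```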

  3. Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS

    PubMed Central

    Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.

    2016-01-01

    Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic (“z-score”) of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information into a “relative enrichment score” for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores even when both are genome-wide significant (p < 5 × 10^-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
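
    The "scale mixture of two Gaussians" mentioned above can be illustrated with a generic two-component model of z-scores: a narrow null component and a wider non-null component, both centered at zero. This is a minimal sketch under stated assumptions, not the CM3 implementation: the parametrization (pi0, sigma0, sigma1), the example parameter values, and the density-based least-squares objective are all hypothetical. The abstract fits the model against nonparametric estimates of posterior expected test statistics and replication probabilities, which are not reproduced here.

```python
import math


def normal_pdf(z, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(z, pi0, sigma0, sigma1):
    """Two-component scale mixture: null (sigma0) plus non-null (sigma1)."""
    return pi0 * normal_pdf(z, sigma0) + (1.0 - pi0) * normal_pdf(z, sigma1)


def posterior_non_null(z, pi0, sigma0, sigma1):
    """Posterior probability that a z-score came from the non-null component."""
    non_null = (1.0 - pi0) * normal_pdf(z, sigma1)
    return non_null / mixture_pdf(z, pi0, sigma0, sigma1)


def sum_squared_diff(params, zs, targets):
    """Least-squares objective between model densities and target values
    (illustrative stand-in for the fit described in the abstract)."""
    pi0, sigma0, sigma1 = params
    return sum((mixture_pdf(z, pi0, sigma0, sigma1) - t) ** 2
               for z, t in zip(zs, targets))


# Hypothetical parameters for a single enrichment stratum.
params = (0.95, 1.0, 3.0)

# z of about 5.45 corresponds to two-sided p = 5e-8.
for z in (0.0, 2.0, 5.45):
    print(f"z = {z:4.2f}  P(non-null | z) = {posterior_non_null(z, *params):.4f}")
```

    In a model of this shape, a stratum with higher enrichment would have a smaller pi0 or a larger sigma1, so the same discovery z-score maps to a higher posterior probability of being non-null.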

  4. Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS.

    PubMed

    Wang, Yunpeng; Thompson, Wesley K; Schork, Andrew J; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R; Djurovic, Srdjan; O'Donovan, Michael; Visscher, Peter M; Andreassen, Ole A; Dale, Anders M

    2016-01-01

    Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic ("z-score") of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information into a "relative enrichment score" for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores even when both are genome-wide significant (p < 5 × 10^-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560

  5. A new strategy for genome assembly using short sequence reads and reduced representation libraries.

    PubMed

    Young, Andrew L; Abaan, Hatice Ozel; Zerbino, Daniel; Mullikin, James C; Birney, Ewan; Margulies, Elliott H

    2010-02-01

    We have developed a novel approach for using massively parallel short-read sequencing to generate fast and inexpensive de novo genomic assemblies comparable to those generated by capillary-based methods. The ultrashort (<100 base) sequences generated by this technology pose specific biological and computational challenges for de novo assembly of large genomes. To account for this, we devised a method for experimentally partitioning the genome using reduced representation (RR) libraries prior to assembly. We use two restriction enzymes independently to create a series of overlapping fragment libraries, each containing a tractable subset of the genome. Together, these libraries allow us to reassemble the entire genome without the need of a reference sequence. As proof of concept, we applied this approach to sequence and assemble the majority of the 125-Mb Drosophila melanogaster genome. We subsequently demonstrate the accuracy of our assembly method with meaningful comparisons against the currently available D. melanogaster reference genome (dm3). The ease of assembly and accuracy for comparative genomics suggest that our approach will scale to future mammalian genome-sequencing efforts, saving both time and money without sacrificing quality. PMID:20123915
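
    The idea of partitioning a genome with two independent restriction digests can be sketched in silico. The example below is a toy illustration, not the authors' pipeline: the abstract does not name the enzymes, so EcoRI (G^AATTC) and BamHI (G^GATCC) are used purely as stand-ins, the sequence is invented, and the size-selection thresholds are arbitrary.

```python
def digest(sequence, site, cut_offset):
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the position within the recognition site at which
    the strand is cut (1 for G^AATTC). Returns fragments in order.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return [f for f in fragments if f]


def size_select(fragments, min_len, max_len):
    """Keep fragments within a length window (one 'library' per window)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Invented toy sequence containing two EcoRI and two BamHI sites.
genome = "AAGAATTCCCGGATCCTTTTGAATTCAAGGATCCAA"

ecori_fragments = digest(genome, "GAATTC", cut_offset=1)
bamhi_fragments = digest(genome, "GGATCC", cut_offset=1)

print([len(f) for f in ecori_fragments])
print([len(f) for f in bamhi_fragments])
print(size_select(ecori_fragments, 10, 20))
```

    Because the two enzymes cut at different positions, fragments from one digest span the cut sites of the other, which is what allows separately assembled fragments to be joined back into longer sequences.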

  6. Genomes of two new ammonia-oxidizing archaea enriched from deep marine sediments.

    PubMed

    Park, Soo-Je; Ghai, Rohit; Martín-Cuadrado, Ana-Belén; Rodríguez-Valera, Francisco; Chung, Won-Hyong; Kwon, KaeKyoung; Lee, Jung-Hyun; Madsen, Eugene L; Rhee, Sung-Keun

    2014-01-01

    Ammonia-oxidizing archaea (AOA) are ubiquitous and abundant and contribute significantly to the carbon and nitrogen cycles in the ocean. In this study, we assembled AOA draft genomes from two deep marine sediments from Donghae, South Korea, and Svalbard, Arctic region, by sequencing the enriched metagenomes. Three major microorganism clusters belonging to Thaumarchaeota, Epsilonproteobacteria, and Gammaproteobacteria were deduced from their 16S rRNA genes, GC contents, and oligonucleotide frequencies. Three archaeal genomes were identified, two of which were distinct and were designated Ca. "Nitrosopumilus koreensis" AR1 and "Nitrosopumilus sediminis" AR2. AR1 and AR2 exhibited average nucleotide identities of 85.2% and 79.5% to N. maritimus, respectively. The AR1 and AR2 genomes contained genes pertaining to energy metabolism and carbon fixation as conserved in other AOA, but, conversely, had fewer heme-containing proteins and more copper-containing proteins than other AOA. Most of the distinctive AR1 and AR2 genes were located in genomic islands (GIs) that were not present in other AOA genomes or in a reference water-column metagenome from the Sargasso Sea. A putative gene cluster involved in urea utilization was found in the AR2 genome, but not the AR1 genome, suggesting niche specialization in marine AOA. Co-cultured bacterial genome analysis suggested that bacterial sulfur and nitrogen metabolism could be involved in interactions with AOA. Our results provide fundamental information concerning the metabolic potential of deep marine sedimentary AOA. PMID:24798206

  7. Enrichment of sequencing targets from the human genome by solution hybridization

    PubMed Central

    2009-01-01

    To exploit fully the potential of current sequencing technologies for population-based studies, one must enrich for loci from the human genome. Here we evaluate the hybridization-based approach by using oligonucleotide capture probes in solution to enrich for approximately 3.9 Mb of sequence target. We demonstrate that the tiling probe frequency is important for generating sequence data with high uniform coverage of targets. We obtained 93% sensitivity to detect SNPs, with a calling accuracy greater than 99%. PMID:19835619

  8. Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera

    PubMed Central

    Faircloth, Brant C; Branstetter, Michael G; White, Noor D; Brady, Seán G

    2015-01-01

    Gaining a genomic perspective on phylogeny requires the collection of data from many putatively independent loci across the genome. Among insects, an increasingly common approach to collecting this class of data involves transcriptome sequencing, because few insects have high-quality genome sequences available; assembling new genomes remains a limiting factor; the transcribed portion of the genome is a reasonable, reduced subset of the genome to target; and the data collected from transcribed portions of the genome are similar in composition to the types of data with which biologists have traditionally worked (e.g. exons). However, molecular techniques requiring RNA as a template, including transcriptome sequencing, are limited to using very high-quality source materials, which are often unavailable from a large proportion of biologically important insect samples. Recent research suggests that DNA-based target enrichment of conserved genomic elements offers another path to collecting phylogenomic data across insect taxa, provided that conserved elements are present in and can be collected from insect genomes. Here, we identify a large set (n = 1510) of ultraconserved elements (UCEs) shared among the insect order Hymenoptera. We used in silico analyses to show that these loci accurately reconstruct relationships among genome-enabled Hymenoptera, and we designed a set of RNA baits (n = 2749) for enriching these loci that researchers can use with DNA templates extracted from a variety of sources. We used our UCE bait set to enrich an average of 721 UCE loci from 30 hymenopteran taxa, and we used these UCE loci to reconstruct phylogenetic relationships spanning very old (≥220 Ma) to very young (≤1 Ma) divergences among hymenopteran lineages. In contrast to a recent study addressing hymenopteran phylogeny using transcriptome data, we found ants to be sister to all remaining aculeate lineages with complete support, although this result could be explained by

  9. Comparative genomics of "Dehalococcoides ethenogenes" 195 and an enrichment culture containing unsequenced "Dehalococcoides" strains.

    PubMed

    West, Kimberlee A; Johnson, David R; Hu, Ping; DeSantis, Todd Z; Brodie, Eoin L; Lee, Patrick K H; Feil, Helene; Andersen, Gary L; Zinder, Stephen H; Alvarez-Cohen, Lisa

    2008-06-01

    Tetrachloroethene (PCE) and trichloroethene (TCE) are prevalent groundwater contaminants that can be completely reductively dehalogenated by some "Dehalococcoides" organisms. A Dehalococcoides-organism-containing microbial consortium (referred to as ANAS) with the ability to degrade TCE to ethene, an innocuous end product, was previously enriched from contaminated soil. A whole-genome photolithographic microarray was developed based on the genome of "Dehalococcoides ethenogenes" 195. This microarray contains probes designed to hybridize to >99% of the predicted protein-coding sequences in the strain 195 genome. DNA from ANAS was hybridized to the microarray to characterize the genomic content of the ANAS enrichment. The microarray results revealed that the genes associated with central metabolism, including an apparently incomplete carbon fixation pathway, cobalamin-salvaging system, nitrogen fixation pathway, and five hydrogenase complexes, are present in both strain 195 and ANAS. Although the gene encoding the TCE reductase, tceA, was detected, 13 of the 19 reductive dehalogenase genes present in strain 195 were not detected in ANAS. Additionally, 88% of the genes in predicted integrated genetic elements in strain 195 were not detected in ANAS, consistent with these elements being genetically mobile. Sections of the tryptophan operon and an operon encoding an ABC transporter in strain 195 were also not detected in ANAS. These insights into the diversity of Dehalococcoides genomes will improve our understanding of the physiology and evolution of these bacteria, which is essential in developing effective strategies for the bioremediation of PCE and TCE in the environment. PMID:18359838

  10. Rapid enrichment of leucocytes and genomic DNA from blood based on bifunctional core shell magnetic nanoparticles

    NASA Astrophysics Data System (ADS)

    Xie, Xin; Nie, Xiaorong; Yu, Bingbin; Zhang, Xu

    2007-04-01

    A series of protocols are proposed to extract genomic DNA from whole blood at different scales using carboxyl-functionalized magnetic nanoparticles as solid-phase absorbents. The enrichment of leucocytes and the adsorption of genomic DNA can be achieved with the same carboxyl-functionalized magnetic nanoparticles. The DNA bound to the bead surfaces can be used directly as PCR templates. By coupling cell separation and DNA purification, the whole operation can be accomplished in a few minutes. Our simplified protocols proved to be rapid, low cost, and biologically and chemically non-hazardous, and are therefore promising for microfabrication of a DNA-preparation chip and routine laboratory use.

  11. Biomedical applications and studies of molecular evolution: a proposal for a primate genomic library resource.

    PubMed

    Eichler, Evan E; DeJong, Pieter J

    2002-05-01

    The anticipated completion of two of the most biomedically relevant genomes, mouse and human, within the next three years provides an unparalleled opportunity for the large-scale exploration of genome evolution. Targeted sequencing of genomic regions in a panel of primate species and comparison to reference genomes will provide critical insight into the nature of single-base pair variation, mechanisms of chromosomal rearrangement, patterns of selection, and species adaptation. Although not recognized as model "genetic organisms" because of their longevity and low fecundity, 30 of the approximately 300 primate species are targets of biomedical research. The existence of a human reference sequence and genomic primate BAC libraries greatly facilitates the recovery of genes/genomic regions of high biological interest because of an estimated maximum neutral nucleotide sequence divergence of 25%. Primate species, therefore, may be regarded as the ideal model "genomic organisms". Based on existing BAC library resources, we propose the construction of a panel of primate BAC libraries from phylogenetic anchor species for the purpose of comparative medicine as well as studies of genome evolution. PMID:11997334

  12. Genomic features of uncultured methylotrophs in activated-sludge microbiomes grown under different enrichment procedures.

    PubMed

    Fujinawa, Kazuki; Asai, Yusuke; Miyahara, Morio; Kouzuma, Atsushi; Abe, Takashi; Watanabe, Kazuya

    2016-01-01

    Methylotrophs are organisms that are able to grow on C1 compounds as carbon and energy sources. They play important roles in the global carbon cycle and contribute largely to industrial wastewater treatment. To identify and characterize methylotrophs that are involved in methanol degradation in wastewater-treatment plants, methanol-fed activated-sludge (MAS) microbiomes were subjected to phylogenetic and metagenomic analyses, and genomic features of dominant methylotrophs in MAS were compared with those preferentially grown in laboratory enrichment cultures (LECs). These analyses consistently indicate that Hyphomicrobium plays important roles in MAS, while Methylophilus occurred predominantly in LECs. Comparative analyses of bin genomes reconstructed for the Hyphomicrobium and Methylophilus methylotrophs suggest that they have different C1-assimilation pathways. In addition, function-module analyses suggest that their cell-surface structures are different. Comparison of the MAS bin genome with genomes of closely related Hyphomicrobium isolates suggests that genes unnecessary in MAS (for instance, genes for anaerobic respiration) have been lost from the genome of the dominant methylotroph. We suggest that genomic features and coded functions in the MAS bin genome provide us with insights into how this methylotroph adapts to activated-sludge ecosystems. PMID:27221669

  13. Genomic features of uncultured methylotrophs in activated-sludge microbiomes grown under different enrichment procedures

    PubMed Central

    Fujinawa, Kazuki; Asai, Yusuke; Miyahara, Morio; Kouzuma, Atsushi; Abe, Takashi; Watanabe, Kazuya

    2016-01-01

    Methylotrophs are organisms that are able to grow on C1 compounds as carbon and energy sources. They play important roles in the global carbon cycle and contribute largely to industrial wastewater treatment. To identify and characterize methylotrophs that are involved in methanol degradation in wastewater-treatment plants, methanol-fed activated-sludge (MAS) microbiomes were subjected to phylogenetic and metagenomic analyses, and genomic features of dominant methylotrophs in MAS were compared with those preferentially grown in laboratory enrichment cultures (LECs). These analyses consistently indicate that Hyphomicrobium plays important roles in MAS, while Methylophilus occurred predominantly in LECs. Comparative analyses of bin genomes reconstructed for the Hyphomicrobium and Methylophilus methylotrophs suggest that they have different C1-assimilation pathways. In addition, function-module analyses suggest that their cell-surface structures are different. Comparison of the MAS bin genome with genomes of closely related Hyphomicrobium isolates suggests that genes unnecessary in MAS (for instance, genes for anaerobic respiration) have been lost from the genome of the dominant methylotroph. We suggest that genomic features and coded functions in the MAS bin genome provide us with insights into how this methylotroph adapts to activated-sludge ecosystems. PMID:27221669

  14. Local Assemblies of Paired-End Reduced Representation Libraries Sequenced with the Illumina Genome Analyzer in Maize

    PubMed Central

    Deschamps, Stéphane; Nannapaneni, Kishore; Zhang, Yun; Hayes, Kevin

    2012-01-01

    The use of next-generation DNA sequencing technologies has greatly facilitated reference-guided variant detection in complex plant genomes. However, complications may arise when regions adjacent to a read of interest are used for marker assay development, or when reference sequences are incomplete, as short reads alone may not be long enough to ascertain their uniqueness. Here, the possibility of generating longer sequences in discrete regions of the large and complex genome of maize is demonstrated, using a modified version of a paired-end RAD library construction strategy. Reads are generated from DNA fragments first digested with a methylation-sensitive restriction endonuclease, sheared, enriched with biotin and a selective PCR amplification step, and then sequenced at both ends. Sequences are locally assembled into contigs by subgrouping pairs based on the identity of the read anchored by the restriction site. This strategy applied to two maize inbred lines (B14 and B73) generated 183,609 and 129,018 contigs, respectively, out of which at least 76% were >200 bps in length. A subset of putative single nucleotide polymorphisms from contigs aligning to the B73 reference genome with at least one mismatch was resequenced, and 90% of those in B14 were confirmed, indicating that this method is a potent approach for variant detection and marker development in species with complex genomes or lacking extensive reference sequences. PMID:23093955
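
    The subgrouping step described in this abstract (pairs grouped by the identity of the read anchored at the restriction site) can be illustrated with a minimal sketch. The function name, the tuple layout of a read pair, and the example sequences are hypothetical and are not taken from the paper; the downstream assembly of each group into a contig is not shown.

```python
from collections import defaultdict

def group_pairs_by_anchor(read_pairs):
    """Group mate reads by the sequence of their restriction-site-anchored read.

    read_pairs: iterable of (anchor_read, mate_read) strings (hypothetical layout).
    Returns a dict mapping each distinct anchor read to the list of its mates,
    which would then be the input to a separate local assembly step.
    """
    groups = defaultdict(list)
    for anchor_read, mate_read in read_pairs:
        groups[anchor_read].append(mate_read)
    return dict(groups)

# Toy example with made-up sequences: two loci, three read pairs.
pairs = [
    ("CTGCAGAAT", "GGATTACA"),
    ("CTGCAGAAT", "TTACAGGC"),
    ("CTGCAGCCG", "ACGTACGT"),
]
groups = group_pairs_by_anchor(pairs)
for anchor, mates in groups.items():
    print(anchor, len(mates))
```

    Exact string identity is the simplest possible grouping key; a real pipeline would likely need to tolerate sequencing errors in the anchored read, which this sketch does not attempt.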

  15. Chromosome region-specific libraries for human genome analysis

    SciTech Connect

    Kao, Fa-Ten.

    1992-08-01

    During the grant period, progress was made in four areas: the successful demonstration of regional mapping of microclones derived from microdissection libraries; the successful demonstration of the feasibility of converting microclones with short inserts into yeast artificial chromosome clones with very large inserts for high-resolution physical mapping of the dissected region; the successful demonstration of the usefulness of region-specific microclones to isolate region-specific cDNA clones as candidate genes to facilitate the search for the crucial genes underlying genetic diseases assigned to the dissected region; and the successful construction of four region-specific microdissection libraries for human chromosome 2, including 2q35-q37, 2q33-q35, 2p23-p25 and 2p21-p23. The 2q35-q37 library has been characterized in detail. The characterization of the other three libraries is in progress. These region-specific microdissection libraries and the unique sequence microclones derived from the libraries will be valuable resources for investigators engaged in high-resolution physical mapping and isolation of disease-related genes residing in these chromosomal regions.

  16. Reconstructing rare soil microbial genomes using in situ enrichments and metagenomics

    PubMed Central

    Delmont, Tom O.; Eren, A. Murat; Maccario, Lorrie; Prestat, Emmanuel; Esen, Özcan C.; Pelletier, Eric; Le Paslier, Denis; Simonet, Pascal; Vogel, Timothy M.

    2015-01-01

    Despite extensive direct sequencing efforts and advanced analytical tools, reconstructing microbial genomes from soil using metagenomics has been challenging due to the tremendous diversity and relatively uniform distribution of genomes found in this system. Here we used enrichment techniques in an attempt to decrease the complexity of a soil microbiome prior to sequencing by submitting it to a range of physical and chemical stresses in 23 separate microcosms for 4 months. The metagenomic analysis of these microcosms at the end of the treatment yielded 540 Mb of assembly using standard de novo assembly techniques (a total of 559,555 genes and 29,176 functions), from which we could recover novel bacterial genomes, plasmids and phages. The recovered genomes belonged to Leifsonia (n = 2), Rhodanobacter (n = 5), Acidobacteria (n = 2), Sporolactobacillus (n = 2, novel nitrogen-fixing taxon), Ktedonobacter (n = 1, second representative of the family Ktedonobacteraceae), Streptomyces (n = 3, novel polyketide synthase modules), and Burkholderia (n = 2, includes mega-plasmids conferring mercury resistance). Assembled genomes averaged 5.9 Mb, with relative abundances ranging from rare (<0.0001%) to relatively abundant (>0.01%) in the original soil microbiome. Furthermore, we detected them in samples collected from geographically distant locations, particularly in temperate soils compared with samples originating from high-latitude soils and deserts. To the best of our knowledge, this study is the first successful attempt to assemble multiple bacterial genomes directly from a soil sample. Our findings demonstrate that developing pertinent enrichment conditions can stimulate environmental genomic discoveries that would have been impossible to achieve with canonical approaches that focus solely upon post-sequencing data treatment. PMID:25983722

  17. A deep coverage Dictyostelium discoideum genomic DNA library replicates stably in Escherichia coli.

    PubMed

    Rosengarten, Rafael D; Beltran, Pamela R; Shaulsky, Gad

    2015-10-01

    The natural history of the amoeba Dictyostelium discoideum has inspired scientific inquiry for seventy-five years. A genetically tractable haploid eukaryote, D. discoideum appeals as a laboratory model as well. However, certain rote molecular genetic tasks, such as PCR and cloning, are difficult due to the AT-richness and low complexity of its genome. Here we report on the construction of a ~20 fold coverage D. discoideum genomic library in Escherichia coli, cloning 4-10 kilobase partial restriction fragments into a linear vector. End-sequencing indicates that most clones map to the six chromosomes in an unbiased distribution. Over 70% of these clones contain at least one complete open reading frame. We demonstrate that individual clones and library composition are stable over multiple replication cycles. Our library will enable numerous molecular biological applications and the completion of additional species' genome sequences, and suggests a path towards the long-elusive goal of genetic complementation. PMID:26028264

  18. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    PubMed Central

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928
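
    The coverage figure in this abstract follows from simple arithmetic: genome equivalents are the total cloned DNA (number of clones times mean insert size) divided by the haploid genome size. The sketch below uses only the three numbers stated in the abstract. The haploid genome size is not given there, so the value computed here is merely the size implied by those numbers, not a figure from the paper, and the function names are illustrative.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_kb):
    """Library coverage: total cloned DNA divided by haploid genome size."""
    return n_clones * mean_insert_kb / genome_size_kb

def implied_genome_size_kb(n_clones, mean_insert_kb, coverage):
    """Genome size that would make the stated coverage hold (derived, not reported)."""
    return n_clones * mean_insert_kb / coverage

# Values stated in the abstract.
N_CLONES = 153_600
MEAN_INSERT_KB = 116.5
REPORTED_COVERAGE = 6.46

total_cloned_kb = N_CLONES * MEAN_INSERT_KB
genome_kb = implied_genome_size_kb(N_CLONES, MEAN_INSERT_KB, REPORTED_COVERAGE)

print(f"total cloned DNA: {total_cloned_kb / 1e6:.2f} Gb")
print(f"implied haploid genome size: {genome_kb / 1e6:.2f} Gb")
```

    Under random sampling, a single-copy marker would be expected to hit roughly as many clones as the library has genome equivalents, which is the comparison the abstract draws between 6.74 positive clones per marker and 6.46-fold coverage.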

  19. Construction and analysis of Siberian tiger bacterial artificial chromosome library with approximately 6.5-fold genome equivalent coverage.

    PubMed

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928

  20. Cell Context Dependent p53 Genome-Wide Binding Patterns and Enrichment at Repeats

    PubMed Central

    Botcheva, Krassimira; McCorkle, Sean R.

    2014-01-01

    The p53 ability to elicit stress specific and cell type specific responses is well recognized, but how that specificity is established remains to be defined. Whether upon activation p53 binds to its genomic targets in a cell type and stress type dependent manner is still an open question. Here we show that the p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We report distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context dependent p53 genomic binding. The observed differences affect the p53 binding sites distribution with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the high-confidence p53 ChIP-seq peaks positions with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line specific trends. In HCT116, the p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue specific differences (HCT116 has epithelial, while IMR90 has mesenchymal origin). Our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways. PMID:25415302

  1. Cell Context Dependent p53 Genome-Wide Binding Patterns and Enrichment at Repeats

    DOE PAGESBeta

    Botcheva, Krassimira; McCorkle, Sean R.

    2014-11-21

    The p53 ability to elicit stress specific and cell type specific responses is well recognized, but how that specificity is established remains to be defined. Whether upon activation p53 binds to its genomic targets in a cell type and stress type dependent manner is still an open question. Here we show that the p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We report distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context dependent p53 genomic binding. The observed differences affect the p53 binding sites distribution with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the high-confidence p53 ChIP-seq peaks positions with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line specific trends. In HCT116, the p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue specific differences (HCT116 has epithelial, while IMR90 has mesenchymal origin). In conclusion, our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways.

  2. Cell Context Dependent p53 Genome-Wide Binding Patterns and Enrichment at Repeats

    SciTech Connect

    Botcheva, Krassimira; McCorkle, Sean R.

    2014-11-21

    The p53 ability to elicit stress specific and cell type specific responses is well recognized, but how that specificity is established remains to be defined. Whether upon activation p53 binds to its genomic targets in a cell type and stress type dependent manner is still an open question. Here we show that the p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We report distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context dependent p53 genomic binding. The observed differences affect the p53 binding sites distribution with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the high-confidence p53 ChIP-seq peaks positions with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line specific trends. In HCT116, the p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue specific differences (HCT116 has epithelial, while IMR90 has mesenchymal origin). In conclusion, our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways.

  3. Generation and analysis of a large-scale expressed sequence tags from a full-length enriched cDNA library of Siberian tiger (Panthera tigris altaica).

    PubMed

    Guo, Yu; Liu, Changqing; Lu, Taofeng; Liu, Dan; Bai, Chunyu; Li, Xiangchen; Ma, Yuehui; Guan, Weijun

    2014-05-15

    In this study, a full-length enriched cDNA library was successfully constructed from Siberian tiger, the world's most endangered species. The titers of primary and amplified libraries were 1.28×10^6 pfu/mL and 1.59×10^10 pfu/mL, respectively. The proportion of recombinants from the unamplified library was 91.3% and the average length of exogenous inserts was 1.06 kb. A total of 279 individual ESTs with sizes ranging from 316 to 1258 bp were then analyzed. Furthermore, 204 unigenes were successfully annotated and involved in 49 functions of the GO classification; cell (175, 85.5%), cellular process (165, 80.9%), and binding (152, 74.5%) were the dominant terms. A total of 198 unigenes were assigned to 156 KEGG pathways, and the pathways with the most representation were metabolic pathways (18, 9.1%). The proportion pattern of each COG subcategory was similar among Panthera tigris altaica, P. tigris tigris and Homo sapiens; the general function prediction only cluster (44, 15.8%) represented the largest group, followed by translation, ribosomal structure and biogenesis (33, 11.8%) and replication, recombination and repair (24, 8.6%), and only 7.2% of ESTs were classified as novel genes. Moreover, the recombinant plasmid pET32a-TAT-COL6A2 was constructed, coding for the Trx-TAT-COL6A2 fusion protein with two 6× His-tags at the N and C termini. After BCA assay, the concentration of soluble Trx-TAT-COL6A2 recombinant protein was 2.64±0.18 mg/mL. This library will provide a useful platform for functional genome and transcriptome research of P. tigris and other felid animals in the future. PMID:24630959

  4. Targeted genome enrichment for efficient purification of endosymbiont DNA from host DNA.

    PubMed

    Geniez, Sandrine; Foster, Jeremy M; Kumar, Sanjay; Moumen, Bouziane; Leproust, Emily; Hardy, Owen; Guadalupe, Moraima; Thomas, Stephen J; Boone, Braden; Hendrickson, Cynthia; Bouchon, Didier; Grève, Pierre; Slatko, Barton E

    2012-12-01

    Wolbachia endosymbionts are widespread in arthropods and are generally considered reproductive parasites, inducing various phenotypes including cytoplasmic incompatibility, parthenogenesis, feminization and male killing, which serve to promote their spread through populations. In contrast, Wolbachia infecting filarial nematodes that cause human diseases, including elephantiasis and river blindness, are obligate mutualists. DNA purification methods for efficient genomic sequencing of these unculturable bacteria have proven difficult using a variety of techniques. To efficiently capture endosymbiont DNA for studies that examine the biology of symbiosis, we devised a parallel strategy to an earlier array-based method by creating a set of SureSelect™ (Agilent) 120-mer target enrichment RNA oligonucleotides ("baits") for solution hybrid selection. These were designed from Wolbachia complete and partial genome sequences in GenBank and were tiled across each genomic sequence with 60 bp overlap. Baits were filtered for homology against host genomes containing Wolbachia using BLAT and sequences with significant host homology were removed from the bait pool. Filarial parasite Brugia malayi DNA was used as a test case, as the complete sequence of both Wolbachia and its host are known. DNA eluted from capture was size selected and sequencing samples were prepared using the NEBNext® Sample Preparation Kit. One-third of a 50 nt paired-end sequencing lane on the HiSeq™ 2000 (Illumina) yielded 53 million reads and the entirety of the Wolbachia genome was captured. We then used the baits to isolate more than 97.1 % of the genome of a distantly related Wolbachia strain from the crustacean Armadillidium vulgare, demonstrating that the method can be used to enrich target DNA from unculturable microbes over large evolutionary distances. PMID:23482460
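
    The bait design described in this abstract (120-mers tiled across each genomic sequence with 60 bp overlap) amounts to a sliding window with a step of 60 bp. The sketch below illustrates that tiling only. The function name and the end-anchored final bait are assumptions made for illustration, not details from the paper, and the BLAT-based filtering against host genomes is not shown.

```python
def tile_baits(sequence, bait_len=120, overlap=60):
    """Return (start, bait) tuples tiled across `sequence`.

    Consecutive baits share `overlap` bases, so the window advances by
    bait_len - overlap. If the regular tiling leaves the 3' end uncovered,
    one extra bait anchored at the end of the sequence is added
    (an assumption of this sketch, not stated in the abstract).
    """
    step = bait_len - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than bait_len")
    baits = []
    for start in range(0, len(sequence) - bait_len + 1, step):
        baits.append((start, sequence[start:start + bait_len]))
    if baits and baits[-1][0] + bait_len < len(sequence):
        last_start = len(sequence) - bait_len
        baits.append((last_start, sequence[last_start:]))
    return baits

# Toy 300 bp target (made-up sequence).
target = "ACGT" * 75
baits = tile_baits(target)
print([start for start, _ in baits])
```

    With a 60 bp step, every interior base of the target is covered by two baits, which is the usual motivation for 2x tiling density in hybrid-selection designs.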

  5. Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

    PubMed Central

    2010-01-01

    Background: Complete chloroplast genome sequences provide a valuable source of molecular markers for studies in molecular ecology and evolution of plants. To obtain complete genome sequences, recent studies have made use of the polymerase chain reaction to amplify overlapping fragments from conserved gene loci. However, this approach is time consuming and can be more difficult to implement where gene organisation differs among plants. An alternative approach is to first isolate chloroplasts and then use the capacity of high-throughput sequencing to obtain complete genome sequences. We report our findings from studies of the latter approach, which used a simple chloroplast isolation procedure, multiply-primed rolling circle amplification of chloroplast DNA, Illumina Genome Analyzer II sequencing, and de novo assembly of paired-end sequence reads. Results: A modified rapid chloroplast isolation protocol was used to obtain plant DNA that was enriched for chloroplast DNA, but nevertheless contained nuclear and mitochondrial DNA. Multiply-primed rolling circle amplification of this mixed template produced sufficient quantities of chloroplast DNA, even when the amount of starting material was small, and improved the template quality for Illumina Genome Analyzer II (hereafter Illumina GAII) sequencing. We demonstrate, using independent samples of karaka (Corynocarpus laevigatus), that there is high fidelity in the sequence obtained from this template. Although less than 20% of our sequenced reads could be mapped to the chloroplast genome, it was relatively easy to assemble complete chloroplast genome sequences from the mixture of nuclear, mitochondrial and chloroplast reads. Conclusions: We report successful whole genome sequencing of chloroplast DNA from karaka, obtained efficiently and with high fidelity. PMID:20920211

  6. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    PubMed Central

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Michael S.; Yang, Zamin K.; Klingeman, Dawn M.; Land, Miriam L.; Allman, Steve L.; Lu, Tse-Yuan S.; Brown, Steven D.; Schadt, Christopher W.; Podar, Mircea; Doktycz, Mitchel J.

    2016-01-01

    ABSTRACT: Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. Here, we describe a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria. IMPORTANCE: Plant roots harbor a diverse collection of microbes that live within host tissues. To gain a comprehensive understanding of microbial adaptations to this endophytic lifestyle from strains that cannot be cultivated, it is necessary to separate bacterial cells from the predominance of plant tissue. This study provides a valuable approach for the separation and isolation of endophytic bacteria from plant root tissue. Isolated live bacteria provide material for microbiome sequencing, single-cell genomics, and analyses

  7. Discovery of User-Oriented Class Associations for Enriching Library Classification Schemes.

    ERIC Educational Resources Information Center

    Pu, Hsiao-Tieh

    2002-01-01

    Presents a user-based approach to exploring the possibility of adding user-oriented class associations to hierarchical library classification schemes. Classes not grouped in the same subject hierarchies yet relevant to users' knowledge are obtained by analyzing a log book of a university library's circulation records, using collaborative filtering…

  8. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    DOE PAGESBeta

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.; Yang, Zamin Koo; Klingeman, Dawn Marie; Land, Miriam L.; Allman, Steve L.; Lu, Tse-Yuan S.; Brown, Steven D.; Schadt, Christopher Warren; et al

    2016-07-15

    Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.

  9. Construction of the BAC Library of Small Abalone (Haliotis diversicolor) for Gene Screening and Genome Characterization.

    PubMed

    Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng

    2016-02-01

    The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding of it, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening using PCR method. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with the next generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated into the 8 BACs, which proved the feasibility of PCR screening approach with three-dimensional pools in small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for marker development in the future. The BAC library and clone pools provided valuable resources and tools for genetic breeding and conservation of H. diversicolor. PMID:26438131
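
    The three-dimensional pooling idea mentioned above can be illustrated with a minimal sketch. The abstract does not describe the pool layout, so the 48 plates x 16 rows x 24 columns arrangement (48 x 384 wells = 18,432 clones) and the function names below are assumptions made purely for illustration, not the authors' design.

```python
from itertools import product

# Hypothetical layout: 48 plates of 384 wells (16 rows x 24 columns).
# 48 * 16 * 24 = 18,432, the number of pooled clones given in the abstract.
PLATES, ROWS, COLS = 48, 16, 24


def decode(positive_plates, positive_rows, positive_cols):
    """Return every (plate, row, col) well consistent with the positive pools.

    Each clone belongs to one plate pool, one row pool and one column pool,
    so a single positive clone is located by three positive PCRs.
    """
    return sorted(product(positive_plates, positive_rows, positive_cols))


n_clones = PLATES * ROWS * COLS   # clones covered by the pooling scheme
n_pools = PLATES + ROWS + COLS    # PCR reactions needed to screen all of them

single_hit = decode([7], [3], [20])       # one positive clone: unambiguous
ambiguous = decode([1, 2], [3, 4], [5, 6])  # two positives: candidates to re-test
print(n_clones, n_pools, single_hit, len(ambiguous))
```

    Under this assumed layout, 88 pool reactions stand in for 18,432 individual reactions. When more than one clone is positive, the pools only narrow the search to a set of candidate wells, which would then need individual confirmation.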

  10. AIMS Library - A community resource for sorghum genomic studies and breeding

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The sorghum genome sequence is completed. A systematically mutagenized population linking gene function to sequence is becoming increasingly urgent. A project was initiated to develop an Annotated Individually-pedigreed Mutagenized Sorghum (AIMS) library using (EMS) ethyl methane sulfonate for sel...

  11. Construction and characterization of an eightfold redundant dog genomic bacterial artificial chromosome library.

    PubMed

    Li, R; Mignot, E; Faraco, J; Kadotani, H; Cantanese, J; Zhao, B; Lin, X; Hinton, L; Ostrander, E A; Patterson, D F; de Jong, P J

    1999-05-15

    A large-insert canine genomic bacterial artificial chromosome (BAC) library was built from a Doberman pinscher. Approximately 166,000 clones were gridded on nine high-density hybridization filters. Insert analysis of randomly selected clones indicated a mean insert size of 155 kb and predicted 8.1-fold coverage of the canine genome. Two percent of the clones were nonrecombinant. Chromosomal fluorescence in situ hybridization studies of 60 BAC clones indicated no chimerism. The library was hybridized with dog PCR products representing eight genes (ADA, TNFA, GCA, MYB, HOXA, GUSB, THY1, and TOP1). The resulting positive clones were characterized and shown to be compatible with an eightfold redundant library. PMID:10331940

  12. Preparation of a phage DNA fragment library for whole genome shotgun sequencing.

    PubMed

    Summer, Elizabeth J

    2009-01-01

    The most efficient method to determine the genomic sequence of a dsDNA phage is to use a whole genome shotgun approach (WGSA). Preparation of a library where each genomic fragment has an equal chance of being represented is critical to the success of the WGSA. For many phages, there are regions of the genome likely to be under-represented in the shotgun library, which results in more gaps in the shotgun assembly than predicted by the Poisson distribution. However, as phage genomes are relatively small, this increased number of gaps does not present an insurmountable impediment to using the WGSA. This chapter will focus on construction of a high-quality random library and sequence analysis of this library in a 96-well format. Techniques are described for the mechanical fragmentation of genomic DNA into 2 kb average size fragments, preparation of the fragmented DNA for shotgun cloning, and advice on the choice of cloning vector for library preparation. Protocols for deepwell block culture, plasmid isolation, and sequencing in 96-well format are given. The rationale for determining the total number of random clones from a library to sequence for a 50 and 150 kb genome is explained. The steps involved in going from hundreds of shotgun sequencing traces to generating contigs will be outlined as well as how to close gaps in the sequence by primer walking on phage DNA and PCR-generated templates. Finally, examples will be given of how biological information about the phage genomic termini can be derived by analysis of the organization of individual clones in the shotgun sequence assembly. Specific examples are given for the circularly permuted termini of pac type phages, the direct terminal repeats found in most T7-like phages, variable host DNA at either end as in the Mu-like phages, and the 5' and 3' overhanging ends of cos type phages. The end result of these steps is the entire DNA sequence of a novel phage, ready for gene prediction. PMID:19082550
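
    The Poisson reasoning referred to above can be sketched as follows. This is an idealized model in the style of Lander and Waterman, not the chapter's own calculation: the 700 bp usable read length and the 8-fold target coverage are assumptions chosen for illustration, and only the 50 kb and 150 kb genome sizes come from the abstract.

```python
import math


def reads_for_coverage(genome_bp: int, read_bp: int, coverage: float) -> int:
    """Number of random reads needed to reach a given mean fold coverage."""
    return math.ceil(coverage * genome_bp / read_bp)


def expected_gaps(n_reads: int, genome_bp: int, read_bp: int) -> float:
    """Expected number of gaps under a Poisson model: N * exp(-c), c = N*L/G.

    Assumes uniformly random fragments and ignores the minimum overlap
    needed to join two reads, so real assemblies will show more gaps.
    """
    c = n_reads * read_bp / genome_bp
    return n_reads * math.exp(-c)


READ_BP = 700      # assumed usable read length (hypothetical)
TARGET_COV = 8.0   # assumed target coverage (hypothetical)

for genome_bp in (50_000, 150_000):  # genome sizes named in the abstract
    n = reads_for_coverage(genome_bp, READ_BP, TARGET_COV)
    print(genome_bp, n, round(expected_gaps(n, genome_bp, READ_BP), 2))
```

    Regions that are under-represented in the library violate the uniform-sampling assumption, which is why the abstract notes more gaps than the Poisson distribution predicts.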

  13. Enhancing genome investigations in the mosquito Culex quinquefasciatus via BAC library construction and characterization

    PubMed Central

    2011-01-01

    Background Culex quinquefasciatus (Say) is a major species in the Culex pipiens complex and an important vector for several human pathogens including West Nile virus and parasitic filarial nematodes causing lymphatic filariasis. It is common throughout tropical and subtropical regions and is among the most geographically widespread mosquito species. Although the complete genome sequence is now available, additional genomic tools are needed to improve the sequence assembly. Findings We constructed a bacterial artificial chromosome (BAC) library using the pIndigoBAC536 vector and HindIII partially digested DNA isolated from Cx. quinquefasciatus pupae, Johannesburg strain (NDJ). Insert size was estimated by NotI digestion and pulsed-field gel electrophoresis of 82 randomly selected clones. To estimate genome coverage, each 384-well plate was pooled for screening with 29 simple sequence repeat (SSR) and five gene markers. The NDJ library consists of 55,296 clones arrayed in 144 384-well microplates. Fragment insert size ranged from 50 to 190 kb in length (mean = 106 kb). Based on a mean insert size of 106 kb and a genome size of 579 Mbp, the BAC library provides ~10.1-fold coverage of the Cx. quinquefasciatus genome. PCR screening of BAC DNA plate pools for SSR loci from the genetic linkage map and for four genes associated with reproductive diapause in Culex pipiens resulted in a mean of 9.0 positive plate pools per locus. Conclusion The NDJ library represents an excellent resource for genome assembly enhancement and characterization in Culex pipiens complex mosquitoes. PMID:21914202
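
    The coverage figure above follows from simple arithmetic, reproduced here as a short sketch. All numbers are taken from the abstract; the function name is hypothetical, and the calculation assumes every arrayed clone carries an insert of the mean size.

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Fold coverage = (number of clones x mean insert size) / genome size."""
    total_insert_mb = n_clones * mean_insert_kb / 1000.0  # convert kb to Mb
    return total_insert_mb / genome_mb


n_clones = 144 * 384  # 144 microplates of 384 wells each
coverage = library_coverage(n_clones, mean_insert_kb=106, genome_mb=579)
print(n_clones, round(coverage, 1))
```

    The plate count reproduces the stated 55,296 clones, and the result rounds to the reported ~10.1-fold coverage.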

  14. Genome-wide screen for miRNA targets using the MISSION target ID library.

    PubMed

    Coussens, Matthew J; Forbes, Kevin; Kreader, Carol; Sago, Jack; Cupp, Carrie; Swarthout, John

    2012-01-01

    The Target ID Library is designed to assist in discovery and identification of microRNA (miRNA) targets. The Target ID Library is a plasmid-based, genome-wide cDNA library cloned into the 3'UTR downstream from the dual-selection fusion protein, thymidine kinase-zeocin (TKzeo). The first round of selection is for stable transformants, followed by introduction of a miRNA of interest and, finally, selection for cDNAs containing the miRNA's target. Selected cDNAs are identified by sequencing (see Figure 1-3 for Target ID Library Workflow and details). To ensure broad coverage of the human transcriptome, Target ID Library cDNAs were generated via oligo-dT priming using a pool of total RNA prepared from multiple human tissues and cell lines. The resulting cDNAs range from 0.5 to 4 kb, with an average size of 1.2 kb, and were cloned into the p3'TKzeo dual-selection plasmid (see Figure 4 for plasmid map). The gene targets represented in the library can be found on the Sigma-Aldrich webpage. Results from Illumina sequencing (Table 3) show that the library includes 16,922 of the 21,518 unique genes in UCSC RefGene (79%), or 14,000 genes with 10 or more reads (66%). PMID:22508434

  15. Construction and characterization of a 10-genome equivalent yeast artificial chromosome library for the laboratory rat, Rattus norvegicus

    SciTech Connect

    Cai, L.; Zee, R.Y.L.; Schalkwyk, L.C.

    1997-02-01

    Increasing attention has been focused in recent years on the rat as a model organism for genetic studies, in particular for the investigation of complex traits, but progress has been limited by the lack of availability of large-insert genomic libraries. Here, we report the construction and characterization of an arrayed yeast artificial chromosome (YAC) library for the rat genome containing approximately 40,000 clones in the AB1380 host using the pCGS966 vector. An average size of 736 kb was estimated from 166 randomly chosen clones; thus the library provides 10-fold coverage of the genome, with a 99.99% probability of containing a unique sequence. Eight of 39 YACs analyzed by fluorescence in situ hybridization were found to be chimeric, indicating a proportion of about 20-30% of chimeric clones. The library was spotted on high-density filters to allow the identification of YAC clones by hybridization and was pooled using a 3-dimensional scheme for screening by PCR. Among 48 probes used to screen the library, an average of 9.3 positive clones were found, consistent with the calculated 10-fold genomic coverage of the library. This YAC library represents the first large-insert genomic library for the rat. It will be made available to the research community at large as an important new resource for complex genome analysis in this species. 35 refs., 4 figs.
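
    The abstract does not say how the 99.99% figure was obtained. A common way to estimate it is the Clarke-Carbon relation P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one clone and N is the number of clones. The sketch below applies it under the assumption that f can be taken as coverage divided by clone number; the function name is hypothetical.

```python
import math


def prob_sequence_present(n_clones: int, fold_coverage: float) -> float:
    """Probability that a given unique sequence is in at least one clone.

    Uses P = 1 - (1 - f)^N with f = fold_coverage / n_clones, i.e. the
    fraction of the genome represented by a single average-sized insert.
    """
    f = fold_coverage / n_clones
    return 1.0 - (1.0 - f) ** n_clones


p = prob_sequence_present(n_clones=40_000, fold_coverage=10.0)
approx = 1.0 - math.exp(-10.0)  # large-N limit: P ~ 1 - e^(-coverage)
print(f"{p:.4%}", f"{approx:.4%}")
```

    For a 10-fold library the result exceeds 99.99%, in line with the abstract. The model assumes random, non-chimeric clones, so the 20-30% chimerism reported above would make the practical figure somewhat less certain.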

  16. Characterization of expressed sequence tags from a full-length enriched cDNA library of Cryptomeria japonica male strobili

    PubMed Central

    Futamura, Norihiro; Totoki, Yasushi; Toyoda, Atsushi; Igasaki, Tomohiro; Nanjo, Tokihiko; Seki, Motoaki; Sakaki, Yoshiyuki; Mari, Adriano; Shinozaki, Kazuo; Shinohara, Kenji

    2008-01-01

    Background Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages. Results We obtained 36,011 expressed sequence tags (ESTs) from either one or both ends of 19,437 clones derived from the cDNA library of C. japonica male strobili at various developmental stages. The 19,437 cDNA clones corresponded to 10,463 transcripts. Approximately 80% of the transcripts resembled ESTs from Pinus and Picea, while approximately 75% had homologs in Arabidopsis. An analysis of homologies between ESTs from C. japonica male strobili and known pollen allergens in the Allergome Database revealed that products of 180 transcripts exhibited significant homology. Approximately 2% of the transcripts appeared to encode transcription factors. We identified twelve genes for MADS-box proteins among these transcription factors. The twelve MADS-box genes were classified as DEF/GLO/GGM13-, AG-, AGL6-, TM3- and TM8-like MIKCC genes and type I MADS-box genes. Conclusion Our full-length enriched cDNA library derived from C. japonica male strobili provides information on expression of genes during the development of male reproductive organs. We provided potential allergens in C. japonica. We also provided new information about transcription factors including MADS-box genes expressed in male strobili of C. japonica. Large-scale gene discovery using full-length cDNAs is a valuable tool for studies of gymnosperm species. PMID:18691438

  17. A novel ammonia-oxidizing archaeon from wastewater treatment plant: Its enrichment, physiological and genomic characteristics

    NASA Astrophysics Data System (ADS)

    Li, Yuyang; Ding, Kun; Wen, Xianghua; Zhang, Bing; Shen, Bo; Yang, Yunfeng

    2016-03-01

    Ammonia-oxidizing archaea (AOA) were recently found to participate in the ammonia removal processes in wastewater treatment plants (WWTPs), similar to their bacterial counterparts. However, due to the lack of cultivated AOA strains from WWTPs, their functions and contributions in these systems remain unclear. Here we report a novel AOA strain, SAT1, enriched from activated sludge, and investigate its physiological and genomic characteristics. The maximal 16S rRNA gene similarity between SAT1 and other reported AOA strains is 96% (with “Ca. Nitrosotenuis chungbukensis”), and it is affiliated with Wastewater Cluster B (WWC-B) based on amoA gene phylogeny, a cluster within group I.1a and specific for activated sludge. Our strain is autotrophic, mesophilic (25 °C–33 °C) and neutrophilic (pH 5.0–7.0). Its genome size is 1.62 Mb, with a large fragment inversion accounting for 68% of the genome. The strain could not utilize urea due to truncation of the urea transporter gene. Lacking the pathways to synthesize the usual compatible solutes, it is intolerant of high salinity (>0.03%) but can adapt to low-salinity (0.005%) environments. This adaptation, together with possibly enhanced cell-biofilm attachment ability, makes it suitable for the WWTP environment. We propose the name “Candidatus Nitrosotenuis cloacae” for the strain SAT1.

  18. A novel ammonia-oxidizing archaeon from wastewater treatment plant: Its enrichment, physiological and genomic characteristics

    PubMed Central

    Li, Yuyang; Ding, Kun; Wen, Xianghua; Zhang, Bing; Shen, Bo; Yang, Yunfeng

    2016-01-01

    Ammonia-oxidizing archaea (AOA) were recently found to participate in the ammonia removal processes in wastewater treatment plants (WWTPs), similar to their bacterial counterparts. However, due to the lack of cultivated AOA strains from WWTPs, their functions and contributions in these systems remain unclear. Here we report a novel AOA strain, SAT1, enriched from activated sludge, and investigate its physiological and genomic characteristics. The maximal 16S rRNA gene similarity between SAT1 and other reported AOA strains is 96% (with “Ca. Nitrosotenuis chungbukensis”), and it is affiliated with Wastewater Cluster B (WWC-B) based on amoA gene phylogeny, a cluster within group I.1a and specific for activated sludge. Our strain is autotrophic, mesophilic (25 °C–33 °C) and neutrophilic (pH 5.0–7.0). Its genome size is 1.62 Mb, with a large fragment inversion accounting for 68% of the genome. The strain could not utilize urea due to truncation of the urea transporter gene. Lacking the pathways to synthesize the usual compatible solutes, it is intolerant of high salinity (>0.03%) but can adapt to low-salinity (0.005%) environments. This adaptation, together with possibly enhanced cell-biofilm attachment ability, makes it suitable for the WWTP environment. We propose the name “Candidatus Nitrosotenuis cloacae” for the strain SAT1. PMID:27030530

  19. Resequencing diverse Chinese indigenous breeds to enrich the map of genomic variations in swine.

    PubMed

    Kang, Huimin; Wang, Haifei; Fan, Ziyao; Zhao, Pengju; Khan, Amjad; Yin, Zongjun; Wang, Jiafu; Bao, Wenbin; Wang, Aiguo; Zhang, Qin; Liu, Jian-Feng

    2015-11-01

    To enrich the map of genomic variations in swine, we randomly sequenced 13 domestic and wild individuals from China and Europe. We detected approximately 28.1 million single nucleotide variants (SNVs) and 3.6 million short insertions and deletions (INDELs), of which 2,530,248 SNVs and 3,456,626 INDELs were newly identified compared with dbSNP 143. Moreover, 208,687 SNVs and 24,161 INDELs were uniquely observed in Chinese pigs, potentially accounting for phenotypic differences between Chinese and European pigs. Furthermore, a significantly high correlation between SNVs and INDELs was observed, which indicated that these two distinct variant types may share similar etiologies. We also predicted loss-of-function genes and found that they were under weaker evolutionary constraints. This study gives interesting insights into the genomic features of the Chinese pig breeds. These data would be useful in the establishment of a high-density SNP map and would lay a foundation for facilitating pig functional genomics studies. PMID:26296457

  20. A novel ammonia-oxidizing archaeon from wastewater treatment plant: Its enrichment, physiological and genomic characteristics.

    PubMed

    Li, Yuyang; Ding, Kun; Wen, Xianghua; Zhang, Bing; Shen, Bo; Yang, Yunfeng

    2016-01-01

    Ammonia-oxidizing archaea (AOA) were recently found to participate in the ammonia removal processes in wastewater treatment plants (WWTPs), similar to their bacterial counterparts. However, due to the lack of cultivated AOA strains from WWTPs, their functions and contributions in these systems remain unclear. Here we report a novel AOA strain, SAT1, enriched from activated sludge, and investigate its physiological and genomic characteristics. The maximal 16S rRNA gene similarity between SAT1 and other reported AOA strains is 96% (with "Ca. Nitrosotenuis chungbukensis"), and it is affiliated with Wastewater Cluster B (WWC-B) based on amoA gene phylogeny, a cluster within group I.1a and specific for activated sludge. Our strain is autotrophic, mesophilic (25 °C-33 °C) and neutrophilic (pH 5.0-7.0). Its genome size is 1.62 Mb, with a large fragment inversion accounting for 68% of the genome. The strain could not utilize urea due to truncation of the urea transporter gene. Lacking the pathways to synthesize the usual compatible solutes, it is intolerant of high salinity (>0.03%) but can adapt to low-salinity (0.005%) environments. This adaptation, together with possibly enhanced cell-biofilm attachment ability, makes it suitable for the WWTP environment. We propose the name "Candidatus Nitrosotenuis cloacae" for the strain SAT1. PMID:27030530

  1. A new age in functional genomics using CRISPR/Cas9 in arrayed library screening

    PubMed Central

    Agrotis, Alexander; Ketteler, Robin

    2015-01-01

    CRISPR technology has rapidly changed the face of biological research, such that precise genome editing has now become routine for many labs within several years of its initial development. What makes CRISPR/Cas9 so revolutionary is the ability to target a protein (Cas9) to an exact genomic locus, through designing a specific short complementary nucleotide sequence, that together with a common scaffold sequence, constitute the guide RNA bridging the protein and the DNA. Wild-type Cas9 cleaves both DNA strands at its target sequence, but this protein can also be modified to exert many other functions. For instance, by attaching an activation domain to catalytically inactive Cas9 and targeting a promoter region, it is possible to stimulate the expression of a specific endogenous gene. In principle, any genomic region can be targeted, and recent efforts have successfully generated pooled guide RNA libraries for coding and regulatory regions of human, mouse and Drosophila genomes with high coverage, thus facilitating functional phenotypic screening. In this review, we will highlight recent developments in the area of CRISPR-based functional genomics and discuss potential future directions, with a special focus on mammalian cell systems and arrayed library screening. PMID:26442115

  2. Library+

    ERIC Educational Resources Information Center

    Merrill, Alex

    2011-01-01

    This article discusses possible future directions for academic libraries in the post Web/Library 2.0 world. These possible directions include areas such as data literacy, linked data sets, and opportunities for libraries in support of digital humanities. The author provides a brief sketch of the background information regarding the topics and…

  3. The first insight into the Taxus genome via fosmid library construction and end sequencing.

    PubMed

    Hao, DaCheng; Yang, Ling; Xiao, PeiGen

    2011-03-01

    Taxus mairei is a critically endangered and commercially important cultured medicinal gymnosperm in China and forms an important medicinal resource, but its genome has not yet been studied. In this study, we constructed a T. mairei fosmid library and analyzed the fosmid end sequences to provide a preliminary assessment of the genome. The library consists of one million clones with an average insert size of about 39 kb, amounting to 3.9 genome equivalents. Fosmid stability assays indicate that T. mairei DNA was stable during propagation in the fosmid system. End sequencing of both 5' and 3' ends of 968 individual clones generated 1,923 sequences after trimming, with an average sequence length of 839 bp. BLASTN searches of the nr and EST databases of GenBank and BLASTX searches of the nr database resulted in 560 (29.1%) significant hits (E < e(-5)). Repetitive sequence analysis revealed that 20.8% of end sequences are repetitive elements, which were composed of retroelements, DNA transposons, satellites, simple repeats, and low complexity sequences. The distribution pattern of various repeat types was found to be more similar to that of the gymnosperms Pinus and Picea than to that of monocots and dicots. The satellites of T. mairei were significantly longer than those of P. taeda and P. glauca. The tetra-nucleotide repeats of T. mairei were much longer than those of P. glauca and P. taeda. The fosmid library and the fosmid end sequences, the first for this species, will serve as a useful resource for large-scale genome sequencing, physical mapping, SSR marker development and positional cloning, and provide a better understanding of the Taxus genome. PMID:21207064

  4. Genome-wide enrichment analysis between endometriosis and obesity-related traits reveals novel susceptibility loci

    PubMed Central

    Rahmioglu, Nilufer; Macgregor, Stuart; Drong, Alexander W.; Hedman, Åsa K.; Harris, Holly R.; Randall, Joshua C.; Prokopenko, Inga; Nyholt, Dale R.; Morris, Andrew P.; Montgomery, Grant W.; Missmer, Stacey A.; Lindgren, Cecilia M.; Zondervan, Krina T.

    2015-01-01

    Endometriosis is a chronic inflammatory condition in women that results in pelvic pain and subfertility, and has been associated with decreased body mass index (BMI). Genetic variants contributing to the heritable component have started to emerge from genome-wide association studies (GWAS), although the majority remain unknown. Unexpectedly, we observed an intergenic locus on 7p15.2 that was genome-wide significantly associated with both endometriosis and fat distribution (waist-to-hip ratio adjusted for BMI; WHRadjBMI) in an independent meta-GWAS of European ancestry individuals. This led us to investigate the potential overlap in genetic variants underlying the aetiology of endometriosis, WHRadjBMI and BMI using GWAS data. Our analyses demonstrated significant enrichment of common variants between fat distribution and endometriosis (P = 3.7 × 10⁻³), which was stronger when we restricted the investigation to more severe (Stage B) cases (P = 4.5 × 10⁻⁴). However, no genetic enrichment was observed between endometriosis and BMI (P = 0.79). In addition to 7p15.2, we identify four more variants with statistically significant evidence of involvement in both endometriosis and WHRadjBMI (in/near KIFAP3, CAB39L, WNT4, GRB14); two of these, KIFAP3 and CAB39L, are novel associations for both traits. KIFAP3, WNT4 and 7p15.2 are associated with the WNT signalling pathway; formal pathway analysis confirmed a statistically significant (P = 6.41 × 10⁻⁴) overrepresentation of shared associations in developmental processes/WNT signalling between the two traits. Our results demonstrate an example of potential biological pleiotropy that was hitherto unknown, and represent an opportunity for functional follow-up of loci and further cross-phenotype comparisons to assess how fat distribution and endometriosis pathogenesis research fields can inform each other. PMID:25296917

  5. Characterization of a BAC Library from Channel Catfish Ictalurus punctatus: Indications of High Rates of Evolution Among Teleost Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The CHORI-212 bacterial artificial chromosome (BAC) library was constructed by cloning EcoRI/EcoRI partially digested DNA into the pTARBAC2.1 vector. The library has an average insert size of 161 kb, and provides 10.6-fold coverage of the channel catfish haploid genome. Screening of 32 genes using o...

  6. Core and region-enriched networks of behaviorally regulated genes and the singing genome

    PubMed Central

    Whitney, Osceola; Pfenning, Andreas R.; Howard, Jason T.; Blatti, Charles A; Liu, Fang; Ward, James M.; Wang, Rui; Audet, Jean-Nicolas; Kellis, Manolis; Mukherjee, Sayan; Sinha, Saurabh; Hartemink, Alexander J.; West, Anne E.; Jarvis, Erich D.

    2015-01-01

    Songbirds represent an important model organism for elucidating molecular mechanisms that link genes with complex behaviors, in part because they have discrete vocal learning circuits that have parallels with those that mediate human speech. We found that ~10% of the genes in the avian genome were regulated by singing, and we found a striking regional diversity of both basal and singing-induced programs in the four key song nuclei of the zebra finch, a vocal learning songbird. The region-enriched patterns were a result of distinct combinations of region-enriched transcription factors (TFs), their binding motifs, and presinging acetylation of histone 3 at lysine 27 (H3K27ac) enhancer activity in the regulatory regions of the associated genes. RNA interference manipulations validated the role of the calcium-response transcription factor (CaRF) in regulating genes preferentially expressed in specific song nuclei in response to singing. Thus, differential combinatorial binding of a small group of activity-regulated TFs and predefined epigenetic enhancer activity influences the anatomical diversity of behaviorally regulated gene networks. PMID:25504732

  7. A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks.

    PubMed

    Merlet, Benjamin; Paulhe, Nils; Vinson, Florence; Frainay, Clément; Chazalviel, Maxime; Poupin, Nathalie; Gloaguen, Yoann; Giacomoni, Franck; Jourdan, Fabien

    2016-01-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities. PMID:26909353
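
    The identifier-based matching described above can be sketched in a few lines. This is not the MetExplore API or the authors' pipeline: the function name, the dictionary-based data model, and the placeholder strings standing in for InChIKeys are all assumptions made for illustration.

```python
def map_library_to_network(library: dict, network: dict):
    """Match library compounds to network metabolites by shared InChIKey.

    library: compound name -> InChIKey
    network: metabolite id -> InChIKey (None or "" when not annotated)
    Returns (mapping, coverage), where mapping is name -> list of matching
    metabolite ids and coverage is the fraction of library compounds mapped.
    """
    ids_by_key = {}
    for met_id, key in network.items():
        if key:  # metabolites without an identifier cannot be matched
            ids_by_key.setdefault(key, []).append(met_id)

    mapping = {name: ids_by_key.get(key, []) for name, key in library.items()}
    matched = sum(1 for ids in mapping.values() if ids)
    coverage = matched / len(library) if library else 0.0
    return mapping, coverage


# Placeholder keys, not real InChIKeys.
library = {"compound_a": "KEY-A", "compound_b": "KEY-B", "compound_c": "KEY-C"}
network = {"M_1": "KEY-A", "M_2": "KEY-B", "M_3": "KEY-B", "M_4": None}

mapping, coverage = map_library_to_network(library, network)
print(mapping, round(coverage, 2))
```

    The unannotated metabolite M_4 can never be matched, which illustrates why the abstract requires that identifiers be specified in both the libraries and the networks.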

  8. A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks

    PubMed Central

    Merlet, Benjamin; Paulhe, Nils; Vinson, Florence; Frainay, Clément; Chazalviel, Maxime; Poupin, Nathalie; Gloaguen, Yoann; Giacomoni, Franck; Jourdan, Fabien

    2016-01-01

    This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities. PMID:26909353

  9. MethylC-seq library preparation for base-resolution whole-genome bisulfite sequencing

    PubMed Central

    Urich, Mark A; Nery, Joseph R; Lister, Ryan; Schmitz, Robert J; Ecker, Joseph R

    2015-01-01

    Current high-throughput DNA sequencing technologies enable acquisition of billions of data points through which myriad biological processes can be interrogated, including genetic variation, chromatin structure, gene expression patterns, small RNAs and protein–DNA interactions. Here we describe the MethylC-sequencing (MethylC-seq) library preparation method, a 2-d protocol that enables the genome-wide identification of cytosine DNA methylation states at single-base resolution. The technique involves fragmentation of genomic DNA followed by adapter ligation, bisulfite conversion and limited amplification using adapter-specific PCR primers in preparation for sequencing. To date, this protocol has been successfully applied to genomic DNA isolated from primary cell culture, sorted cells and fresh tissue from over a thousand plant and animal samples. PMID:25692984

  10. Screening of an E. coli O157:H7 Bacterial Artificial Chromosome Library by Comparative Genomic Hybridization to Identify Genomic Regions Contributing to Growth in Bovine Gastrointestinal Mucus and Epithelial Cell Colonization

    PubMed Central

    Bai, Jianing; McAteer, Sean P.; Paxton, Edith; Mahajan, Arvind; Gally, David L.; Tree, Jai J.

    2011-01-01

    Enterohemorrhagic E. coli (EHEC) O157:H7 can cause serious gastrointestinal and systemic disease in humans following direct or indirect exposure to ruminant feces containing the bacterium. The main colonization site of EHEC O157:H7 in cattle is the terminal rectum where the bacteria intimately attach to the epithelium and multiply in the intestinal mucus. This study aimed to identify genomic regions of EHEC O157:H7 that contribute to colonization and multiplication at this site. A bacterial artificial chromosome (BAC) library was generated from a derivative of the sequenced E. coli O157:H7 Sakai strain. The library contains 1152 clones averaging 150 kbp. To verify the library, clones containing a complete locus of enterocyte effacement (LEE) were identified by DNA hybridization. In line with a previous report, these did not confer a type III secretion (T3S) capacity to the K-12 host strain. However, conjugation of one of the BAC clones into a strain containing a partial LEE deletion restored T3S. Three hundred eighty-four clones from the library were subjected to two different selective screens; one involved three rounds of adherence assays to bovine primary rectal epithelial cells while the other competed the clones over three rounds of growth in bovine rectal mucus. The input strain DNA was then compared with the selected strains using comparative genomic hybridization (CGH) on an E. coli microarray. The adherence assay enriched for pO157 DNA indicating the importance of this plasmid for colonization of rectal epithelial cells. The mucus assay enriched for multiple regions involved in carbohydrate utilization, including hexuronate uptake, indicating that these regions provide a competitive growth advantage in bovine mucus. This BAC-CGH approach provides a positive selection screen that complements negative selection transposon-based screens. 
    As demonstrated, this may be of particular use for identifying genes with redundant functions such as adhesion and carbon

  11. Scribl: an HTML5 Canvas-based graphics library for visualizing genomic data over the web

    PubMed Central

    Miller, Chase A.; Anthony, Jon; Meyer, Michelle M.; Marth, Gabor

    2013-01-01

    Motivation: High-throughput biological research requires simultaneous visualization as well as analysis of genomic data, e.g. read alignments, variant calls and genomic annotations. Traditionally, such integrative analysis required desktop applications operating on locally stored data. Many current terabyte-size datasets generated by large public consortia projects, however, are already only feasibly stored at specialist genome analysis centers. As even small laboratories can afford very large datasets, local storage and analysis are becoming increasingly limiting, and it is likely that most such datasets will soon be stored remotely, e.g. in the cloud. These developments will require web-based tools that enable users to access, analyze and view vast remotely stored data with a level of sophistication and interactivity that approximates desktop applications. As rapidly dropping cost enables researchers to collect data intended to answer questions in very specialized contexts, developers must also provide software libraries that empower users to implement customized data analyses and data views for their particular application. Such specialized, yet lightweight, applications would empower scientists to better answer specific biological questions than possible with general-purpose genome browsers currently available. Results: Using recent advances in core web technologies (HTML5), we developed Scribl, a flexible genomic visualization library specifically targeting coordinate-based data such as genomic features, DNA sequence and genetic variants. Scribl simplifies the development of sophisticated web-based graphical tools that approach the dynamism and interactivity of desktop applications. Availability and implementation: Software is freely available online at http://chmille4.github.com/Scribl/ and is implemented in JavaScript with all modern browsers supported. 
    Contact: gabor.marth@bc.edu Supplementary information: Supplementary data are available at Bioinformatics

  12. Fidelity by design: Yoctoreactor and binder trap enrichment for small-molecule DNA-encoded libraries and drug discovery.

    PubMed

    Blakskjaer, Peter; Heitner, Tara; Hansen, Nils Jakob Vest

    2015-06-01

    DNA-encoded small-molecule library (DEL) technology allows vast drug-like small molecule libraries to be efficiently synthesized in a combinatorial fashion and screened in a single tube method for binding, with an assay readout empowered by advances in next generation sequencing technology. This approach has increasingly been applied as a viable technology for the identification of small-molecule modulators to protein targets and as precursors to drugs in the past decade. Several strategies for producing and for screening DELs have been devised by both academic and industrial institutions. This review highlights some of the most significant and recent strategies along with important results. A special focus on the production of high fidelity DEL technologies with the ability to eliminate screening noise and false positives is included: using a DNA junction called the Yoctoreactor, building blocks (BBs) are spatially confined at the center of the junction facilitating both the chemical reaction between BBs and encoding of the synthetic route. A screening method, known as binder trap enrichment, permits DELs to be screened robustly in a homogeneous manner delivering clean data sets and potent hits for even the most challenging targets. PMID:25732963

  13. Analysis of expression sequence tags from a full-length-enriched cDNA library of developing sesame seeds (Sesamum indicum)

    PubMed Central

    2011-01-01

    Background Sesame (Sesamum indicum) is one of the most important oilseed crops, with high oil content and rich nutrient value. However, genetic improvement efforts in sesame have not benefited from molecular biology technology because of poor DNA and RNA sequence resources. In this study, we carried out large-scale sequencing of expressed sequence tags (ESTs) from developing sesame seeds and further analyzed genes related to seed storage products. Results A normalized and full-length-enriched cDNA library from immature seeds 5 to 30 days old was constructed and randomly sequenced, generating 41,248 ESTs, which then formed 4,713 contigs and 27,708 singletons, with 44.9% of uniESTs being putative full-length open reading frames. Approximately 26,091 of these uniESTs have significant matches to their counterparts in the Nr database of GenBank, and 21,628 of them were assigned to one or more Gene Ontology (GO) terms. Homologous genes involved in oil biosynthesis were identified, including some conserved transcription factors regulating oil biosynthesis such as LEAFY COTYLEDON1 (LEC1), PICKLE (PKL) and WRINKLED1 (WRI1), and the majority of them were found for the first time in sesame seeds. One hundred seventeen ESTs were identified as possibly involved in biosynthesis of the sesame lignans sesamin and sesamolin. In total, 9,347 putative functional genes from developing seeds were identified, which accounts for one third of total genes in the sesame genome. Further analysis of the uniESTs identified 1,949 non-redundant simple sequence repeats (SSRs). Conclusions This study has provided an overview of genes expressed during sesame seed development. This collection of sesame full-length cDNAs covered a wide variety of genes in seeds, in particular candidate genes involved in biosynthesis of sesame oils and lignans. These EST sequences enriched with full length will contribute to comparative genomic studies on sesame and other oilseed plants

  14. Validation of SCALE 4.0 -- CSAS25 module and the 27-group ENDF/B-IV cross-section library for low-enriched uranium systems

    SciTech Connect

    Jordan, W.C.

    1993-02-01

    A version of KENO V.a and the 27-group library in SCALE-4.0 were validated for use in evaluating the nuclear criticality safety of low-enriched uranium systems. A total of 59 critical systems were analyzed. A statistical analysis of the results was performed, and subcritical acceptance criteria were established.

  15. Validation of SCALE 4.0 -- CSAS25 module and the 27-group ENDF/B-IV cross-section library for low-enriched uranium systems

    SciTech Connect

    Jordan, W.C.

    1993-02-01

    A version of KENO V.a and the 27-group library in SCALE-4.0 were validated for use in evaluating the nuclear criticality safety of low-enriched uranium systems. A total of 59 critical systems were analyzed. A statistical analysis of the results was performed, and subcritical acceptance criteria were established.

  16. From the ORFeome concept to highly comprehensive, full-genome screening libraries.

    PubMed

    Rid, Raphaela; Abdel-Hadi, Omar; Maier, Richard; Wagner, Martin; Hundsberger, Harald; Hintner, Helmut; Bauer, Johann; Onder, Kamil

    2013-02-01

    Recombination-based cloning techniques have in recent times facilitated the establishment of genome-scale single-gene ORFeome repositories. Their further handling and downstream application in systematic fashion is, however, practically impeded because of logistical plus economic challenges. At this juncture, simultaneously transferring entire gene collections in compiled pool format could represent an advanced compromise between systematic ORFeome (an organism's entire set of protein-encoding open reading frames) projects and traditional random library approaches, but has not yet been considered in great detail. In our endeavor to merge the comprehensiveness of ORFeomes with a basically simple, streamlined, and easily executable single-tube design, we have here produced five different pooled screening-ready libraries for both Staphylococcus aureus and Homo sapiens. By evaluating the parallel transfer efficiencies of differentially sized genes from initial polymerase chain reaction (PCR) product amplification to entry and final destination library construction via quantitative real-time PCR, we found that the complexity of the gene population is fairly stably maintained once an entry resource has been successfully established, and that no apparent size-selection bias loss of large inserts takes place. Recombinational transfer processes are hence robust enough for straightforwardly achieving such pooled screening libraries. PMID:22621725

  17. Next-generation libraries for robust RNA interference-based genome-wide screens

    PubMed Central

    Kampmann, Martin; Horlbeck, Max A.; Chen, Yuwen; Tsai, Jordan C.; Bassik, Michael C.; Gilbert, Luke A.; Villalta, Jacqueline E.; Kwon, S. Chul; Chang, Hyeshik; Kim, V. Narry; Weissman, Jonathan S.

    2015-01-01

    Genetic screening based on loss-of-function phenotypes is a powerful discovery tool in biology. Although the recent development of clustered regularly interspaced short palindromic repeats (CRISPR)-based screening approaches in mammalian cell culture has enormous potential, RNA interference (RNAi)-based screening remains the method of choice in several biological contexts. We previously demonstrated that ultracomplex pooled short-hairpin RNA (shRNA) libraries can largely overcome the problem of RNAi off-target effects in genome-wide screens. Here, we systematically optimize several aspects of our shRNA library, including the promoter and microRNA context for shRNA expression, selection of guide strands, and features relevant for postscreen sample preparation for deep sequencing. We present next-generation high-complexity libraries targeting human and mouse protein-coding genes, which we grouped into 12 sublibraries based on biological function. A pilot screen suggests that our next-generation RNAi library performs comparably to current CRISPR interference (CRISPRi)-based approaches and can yield complementary results with high sensitivity and high specificity. PMID:26080438

  18. Construction and analysis of Pst I DNA library for RFLP mapping of the rye genome

    SciTech Connect

    Korzun, V.N.; Kartel, N.A.; Boerner, A.

    1995-06-01

    Pst I, a methylation-sensitive restriction enzyme, was used to produce a library of rye genomic DNA rich in low-copy sequences, intended as probes for genetic mapping. Dot-hybridization and Southern blot analysis showed that 43.6% of the library is represented by low-copy DNA sequences. To locate these sequences on chromosomes and determine the degree of their repetitiveness, 11 clones were hybridized with DNA of nulli-tetrasomic lines of Chinese Spring wheat, wheat-rye addition lines, and barley cleaved by Hind III, EcoR I, EcoR V, Dra I, and BamH I restriction enzymes. Each of the rye DNA clones studied hybridized with wheat and barley DNA, suggesting that low-copy Pst I clones of rye correspond to the evolutionarily conserved DNA fraction in cereals. 21 refs., 3 figs., 2 tabs.

  19. Development of a Genomic Microsatellite Library in Perennial Ryegrass (Lolium perenne) and its Use in Trait Mapping

    PubMed Central

    King, J.; Thorogood, D.; Edwards, K. J.; Armstead, I. P.; Roberts, L.; Skøt, K.; Hanley, Z.; King, I. P.

    2008-01-01

    Background and Aims Perennial ryegrass (Lolium perenne) is one of the key forage and amenity grasses throughout the world. In the UK it accounts for 70% of all agricultural land use with an estimated farm gate value of £6 billion per annum. However, in terms of the genetic resources available, L. perenne has lagged behind other major crops in Poaceae. The aim of this project was therefore the construction of a microsatellite-enriched genomic library for L. perenne to increase the number of genetic markers available for both marker-assisted selection in breeding programmes and gene isolation. Methods Primers for 229 non-redundant microsatellite markers were designed and used to screen two L. perenne genotypes, one amenity and one forage. Of the 229 microsatellites, 95 were found to show polymorphism between amenity and forage genotypes. A subset of microsatellite primers was chosen from these 95 and used to screen two mapping populations derived from intercrossing and backcrossing the two forage and amenity grass genotypes. Key Results and Conclusions The utility of the resulting genetic maps for analysis of the genetic control of target traits was demonstrated by the mapping of genes associated with heading date to linkage groups 4 and 7. PMID:18281692

  20. Chronic periodontitis genome-wide association studies: gene-centric and gene set enrichment analyses.

    PubMed

    Rhodin, K; Divaris, K; North, K E; Barros, S P; Moss, K; Beck, J D; Offenbacher, S

    2014-09-01

    Recent genome-wide association studies (GWAS) of chronic periodontitis (CP) offer rich data sources for the investigation of candidate genes, functional elements, and pathways. We used GWAS data of CP (n = 4,504) and periodontal pathogen colonization (n = 1,020) from a cohort of adult Americans of European descent participating in the Atherosclerosis Risk in Communities study and employed a MAGENTA approach (i.e., meta-analysis gene set enrichment of variant associations) to obtain gene-centric and gene set association results corrected for gene size, number of single-nucleotide polymorphisms, and local linkage disequilibrium characteristics based on the human genome build 18 (National Center for Biotechnology Information build 36). We used the Gene Ontology, Ingenuity, KEGG, Panther, Reactome, and Biocarta databases for gene set enrichment analyses. Six genes showed evidence of statistically significant association: 4 with severe CP (NIN, p = 1.6 × 10(-7); ABHD12B, p = 3.6 × 10(-7); WHAMM, p = 1.7 × 10(-6); AP3B2, p = 2.2 × 10(-6)) and 2 with high periodontal pathogen colonization (red complex-KCNK1, p = 3.4 × 10(-7); Porphyromonas gingivalis-DAB2IP, p = 1.0 × 10(-6)). Top-ranked genes for moderate CP were HGD (p = 1.4 × 10(-5)), ZNF675 (p = 1.5 × 10(-5)), TNFRSF10C (p = 2.0 × 10(-5)), and EMR1 (p = 2.0 × 10(-5)). Loci containing NIN, EMR1, KCNK1, and DAB2IP had shown suggestive evidence of association in the earlier single-nucleotide polymorphism-based analyses, whereas WHAMM and AP3B2 emerged as novel candidates. The top gene sets included severe CP ("endoplasmic reticulum membrane," "cytochrome P450," "microsome," and "oxidation reduction") and moderate CP ("regulation of gene expression," "zinc ion binding," "BMP signaling pathway," and "ruffle"). Gene-centric analyses offer a promising avenue for efficient interrogation of large-scale GWAS data. These results highlight genes in previously identified loci and new candidate genes and pathways

  1. Genome sequence of Candidatus Nitrososphaera evergladensis from group I.1b enriched from Everglades soil reveals novel genomic features of the ammonia-oxidizing archaea.

    PubMed

    Zhalnina, Kateryna V; Dias, Raquel; Leonard, Michael T; Dorr de Quadros, Patricia; Camargo, Flavio A O; Drew, Jennifer C; Farmerie, William G; Daroub, Samira H; Triplett, Eric W

    2014-01-01

    The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases, seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group. PMID:24999826

  2. Genome Sequence of Candidatus Nitrososphaera evergladensis from Group I.1b Enriched from Everglades Soil Reveals Novel Genomic Features of the Ammonia-Oxidizing Archaea

    PubMed Central

    Zhalnina, Kateryna V.; Dias, Raquel; Leonard, Michael T.; Dorr de Quadros, Patricia; Camargo, Flavio A. O.; Drew, Jennifer C.; Farmerie, William G.; Daroub, Samira H.; Triplett, Eric W.

    2014-01-01

    The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases, seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group. PMID:24999826

  3. Comparative genomics of Lupinus angustifolius gene-rich regions: BAC library exploration, genetic mapping and cytogenetics

    PubMed Central

    2013-01-01

    Background The narrow-leafed lupin, Lupinus angustifolius L., is a grain legume species with a relatively compact genome. The species has 2n = 40 chromosomes and its genome size is 960 Mbp/1C. During the last decade, L. angustifolius genomic studies have achieved several milestones, such as molecular-marker development, linkage maps, and bacterial artificial chromosome (BAC) libraries. Here, these resources were integratively used to identify and sequence two gene-rich regions (GRRs) of the genome. Results The genome was screened with a probe representing the sequence of a microsatellite fragment length polymorphism (MFLP) marker linked to Phomopsis stem blight resistance. BAC clones selected by hybridization were subjected to restriction fingerprinting and contig assembly, and 232 BAC-ends were sequenced and annotated. BAC fluorescence in situ hybridization (BAC-FISH) identified eight single-locus clones. Based on physical mapping, cytogenetic localization, and BAC-end annotation, five clones were chosen for sequencing. Within the sequences of clones that hybridized in FISH to a single-locus, two large GRRs were identified. The GRRs showed strong and conserved synteny to Glycine max duplicated genome regions, illustrated by both identical gene order and parallel orientation. In contrast, in the clones with dispersed FISH signals, more than one-third of sequences were transposable elements. Sequenced, single-locus clones were used to develop 12 genetic markers, increasing the number of L. angustifolius chromosomes linked to appropriate linkage groups by five pairs. Conclusions In general, probes originating from MFLP sequences can assist genome screening and gene discovery. However, such probes are not useful for positional cloning, because they tend to hybridize to numerous loci. GRRs identified in L. angustifolius contained a low number of interspersed repeats and had a high level of synteny to the genome of the model legume G. max. Our results showed that

  4. SNP-based pathway enrichment analysis for genome-wide association studies

    PubMed Central

    2011-01-01

    Background Recently we have witnessed a surge of interest in using genome-wide association studies (GWAS) to discover the genetic basis of complex diseases. Many genetic variations, mostly in the form of single nucleotide polymorphisms (SNPs), have been identified in a wide spectrum of diseases, including diabetes, cancer, and psychiatric diseases. A common theme arising from these studies is that the genetic variations discovered by GWAS can only explain a small fraction of the genetic risks associated with the complex diseases. New strategies and statistical approaches are needed to address this lack of explanation. One such approach is the pathway analysis, which considers the genetic variations underlying a biological pathway, rather than separately as in the traditional GWAS studies. A critical challenge in the pathway analysis is how to combine evidences of association over multiple SNPs within a gene and multiple genes within a pathway. Most current methods choose the most significant SNP from each gene as a representative, ignoring the joint action of multiple SNPs within a gene. This approach leads to preferential identification of genes with a greater number of SNPs. Results We describe a SNP-based pathway enrichment method for GWAS studies. The method consists of the following two main steps: 1) for a given pathway, using an adaptive truncated product statistic to identify all representative (potentially more than one) SNPs of each gene, calculating the average number of representative SNPs for the genes, then re-selecting the representative SNPs of genes in the pathway based on this number; and 2) ranking all selected SNPs by the significance of their statistical association with a trait of interest, and testing if the set of SNPs from a particular pathway is significantly enriched with high ranks using a weighted Kolmogorov-Smirnov test. We applied our method to two large genetically distinct GWAS data sets of schizophrenia, one from European

  5. Microsatellite markers isolated from Cabomba aquatica s.l. (Cabombaceae) from an enriched genomic library

    PubMed Central

    Barbosa, Tiago D. M.; Trad, Rafaela J.; Bajay, Miklos M.; Amaral, Maria C. E.

    2015-01-01

    Premise of the study: Microsatellite primers were designed for the submersed aquatic plant Cabomba aquatica s.l. (Cabombaceae) and characterized to estimate genetic diversity parameters. Methods and Results: Using a selective hybridization method, we designed and tested 30 simple sequence repeat loci using two natural populations of C. aquatica s.l., resulting in 13 amplifiable loci. Twelve loci were polymorphic, and alleles per locus ranged from two to four across the 49 C. aquatica s.l. individuals. Observed heterozygosity, expected heterozygosity, and fixation index varied from 0.0 to 1.0, 0.0 to 0.5, and −1.0 to −0.0667, respectively, for the Manaus population and from 0.0 to 1.0, 0.0 to 0.6, and −1.0 to 0.4643 for the Viruá population. Conclusions: The developed markers will be used in further taxonomic and population studies within Cabomba. This set of microsatellite primers represents the first report on rapid molecular markers in the genus. PMID:26649271

  6. Identifying microbial fitness determinants by Insertion Sequencing (INSeq) using genome-wide transposon mutant libraries

    PubMed Central

    Goodman, Andrew L.; Wu, Meng; Gordon, Jeffrey I.

    2012-01-01

    Insertion Sequencing (INSeq) is a method for determining the insertion site and relative abundance of large numbers of transposon mutants in a mixed population of isogenic mutants of a sequenced microbial species. INSeq is based on a modified mariner transposon containing MmeI sites at its ends, allowing cleavage at chromosomal sites 16–17bp from the inserted transposon. Genomic regions adjacent to the transposons are amplified by linear PCR with a biotinylated primer. Products are bound to magnetic beads, digested with MmeI, and barcoded with sample-specific linkers appended to each restriction fragment. After limited PCR amplification, fragments are sequenced using a high-throughput instrument. The sequence of each read can be used to map the location of a transposon in the genome. Read count measures the relative abundance of that mutant in the population. Solid-phase library preparation makes this protocol rapid (18h), easy to scale-up, amenable to automation, and useful for a variety of samples. A protocol for characterizing libraries of transposon mutant strains clonally arrayed in multi-well format is provided. PMID:22094732
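
    The abstract's statement that read count measures the relative abundance of each mutant can be illustrated with a short sketch. It assumes the reads have already been mapped, so that each read is represented only by the genomic coordinate of its transposon insertion; the function name relative_abundance and the coordinates are hypothetical and are not part of the published INSeq software.

```python
from collections import Counter
from typing import Dict, Iterable


def relative_abundance(read_sites: Iterable[int]) -> Dict[int, float]:
    """Turn per-read insertion coordinates into per-mutant fractions."""
    # Count how many reads map to each insertion site.
    counts = Counter(read_sites)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so the fractions sum to 1 across the mutant population.
    return {site: n / total for site, n in counts.items()}


# Invented coordinates: three mutants sampled by six mapped reads.
reads = [1200, 1200, 1200, 58311, 58311, 904220]
abundance = relative_abundance(reads)
```

    Comparing such fractions between an input population and a population recovered after selection is one simple way to see which mutants were depleted or enriched; any normalization beyond a plain fraction is outside what this abstract describes.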

  7. Complete genome sequence of Methylophilus sp. TWE2 isolated from methane oxidation enrichment culture of tap-water.

    PubMed

    Xia, Fei; Zou, Bin; Shen, Cong; Zhu, Ting; Gao, Xin-Hua; Quan, Zhe-Xue

    2015-10-10

    The non-methane-utilizing methylotroph, Methylophilus sp. TWE2, was isolated from tap-water during the enrichment of methanotrophs with methane. The complete genome sequence of strain TWE2 showed that this bacterium may convert methanol to formaldehyde via catalysis of methanol dehydrogenase (MDH), after which formaldehyde would be assimilated to biomass through the ribulose monophosphate (RuMP) pathway or dissimilated via the tetrahydromethanopterin (H4MPT) pathway. The deficiency of glycolysis and the TCA cycle indicate that strain TWE2 may be an obligate methylotroph. This is the first complete genome sequence of the genus Methylophilus. PMID:26253961

  8. Genomic Libraries and a Host Strain Designed for Highly Efficient Two-Hybrid Selection in Yeast

    PubMed Central

    James, P.; Halladay, J.; Craig, E. A.

    1996-01-01

    The two-hybrid system is a powerful technique for detecting protein-protein interactions that utilizes the well-developed molecular genetics of the yeast Saccharomyces cerevisiae. However, the full potential of this technique has not been realized due to limitations imposed by the components available for use in the system. These limitations include unwieldy plasmid vectors, incomplete or poorly designed two-hybrid libraries, and host strains that result in the selection of large numbers of false positives. We have used a novel multienzyme approach to generate a set of highly representative genomic libraries from S. cerevisiae. In addition, a unique host strain was created that contains three easily assayed reporter genes, each under the control of a different inducible promoter. This host strain is extremely sensitive to weak interactions and eliminates nearly all false positives using simple plate assays. Improved vectors were also constructed that simplify the construction of the gene fusions necessary for the two-hybrid system. Our analysis indicates that the libraries and host strain provide significant improvements in both the number of interacting clones identified and the efficiency of two-hybrid selections. PMID:8978031

  9. Discovery of new Mycoplasma pneumoniae antigens by use of a whole-genome lambda display library.

    PubMed

    Beghetto, Elisa; De Paolis, Francesca; Montagnani, Francesca; Cellesi, Carla; Gargano, Nicola

    2009-01-01

    Mycoplasma pneumoniae is the leading cause of atypical pneumonia in children and young adults. Bacterial colonization can occur in both the upper and the lower respiratory tracts and take place both endemically and epidemically worldwide. Characteristically, the infection is chronic in onset and recovery and both humoral and cell-mediated mechanisms are involved in the response to bacterial colonization. To identify bacterial proteins recognized by host antibody responses, a whole-genome M. pneumoniae library was created and displayed on lambda bacteriophage. The challenge of such a library with sera from individuals hospitalized for mycoplasmal pneumonia allowed the identification of a panel of recombinant bacteriophages carrying B-cell epitopes. Among the already known M. pneumoniae B-cell antigens, our results confirmed the immunogenicity of P1 and P30 adhesins. Also, the data presented in this study localized, within their sequences, the immunodominant epitopes recognized by human immunoglobulins. Furthermore, library screening allowed the identification of four novel immunogenic polypeptides, respectively, encoded by fragments of the MPN152, MPN426, MPN456 and MPN-500 open reading frames, highlighting and further confirming the potential of lambda display technology in antigen and epitope discovery. PMID:18992837

  10. A kingdom-specific protein domain HMM library for improved annotation of fungal genomes

    PubMed Central

    Alam, Intikhab; Hubbard, Simon J; Oliver, Stephen G; Rattray, Magnus

    2007-01-01

    Background Pfam is a general-purpose database of protein domain alignments and profile Hidden Markov Models (HMMs), which is very popular for the annotation of sequence data produced by genome sequencing projects. Pfam provides models that are often very general in terms of the taxa that they cover and it has previously been suggested that such general models may lack some of the specificity or selectivity that would be provided by kingdom-specific models. Results Here we present a general approach to create domain libraries of HMMs for sub-taxa of a kingdom. Taking fungal species as an example, we construct a domain library of HMMs (called Fungal Pfam or FPfam) using sequences from 30 genomes, consisting of 24 species from the ascomycetes group and two basidiomycetes, Ustilago maydis, a fungal pathogen of maize, and the white rot fungus Phanerochaete chrysosporium. In addition, we include the Microsporidion Encephalitozoon cuniculi, an obligate intracellular parasite, and two non-fungal species, the oomycetes Phytophthora sojae and Phytophthora ramorum, both plant pathogens. We evaluate the performance in terms of coverage against the original 30 genomes used in training FPfam and against five more recently sequenced fungal genomes that can be considered as an independent test set. We show that kingdom-specific models such as FPfam can find instances of both novel and well characterized domains, increase overall coverage and detect more domains per sequence with typically higher bitscores than Pfam for the same domain families. An evaluation of the effect of changing E-values on the coverage shows that the performance of FPfam is consistent over the range of E-values applied. Conclusion Kingdom-specific models are shown to provide improved coverage. However, as the models become more specific, some sequences found by Pfam may be missed by the models in FPfam and some of the families represented in the test set are not present in FPfam. Therefore, we recommend

  11. ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions

    PubMed Central

    2011-01-01

    ZINBA (Zero-Inflated Negative Binomial Algorithm) identifies genomic regions enriched in a variety of ChIP-seq and related next-generation sequencing experiments (DNA-seq), calling both broad and narrow modes of enrichment across a range of signal-to-noise ratios. ZINBA models and accounts for factors that co-vary with background or experimental signal, such as G/C content, and identifies enrichment in genomes with complex local copy number variations. ZINBA provides a single unified framework for analyzing DNA-seq experiments in challenging genomic contexts. Software website: http://code.google.com/p/zinba/ PMID:21787385
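
    The zero-inflated negative binomial distribution named in the abstract can be illustrated with a short sketch. This is only the textbook probability mass function, not ZINBA's actual model or code (the abstract describes additional covariates such as G/C content that are not modelled here); the function names and the parameters `mu`, `size` and `pi_zero` are assumptions chosen for illustration.

```python
import math


def nb_pmf(k: int, mu: float, size: float) -> float:
    """Negative binomial pmf with mean `mu` > 0 and dispersion `size` > 0."""
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k: int, mu: float, size: float, pi_zero: float) -> float:
    """Zero-inflated NB: with probability `pi_zero` the count is a structural zero,
    otherwise it is drawn from the negative binomial above."""
    nb = nb_pmf(k, mu, size)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * nb
    return (1.0 - pi_zero) * nb


if __name__ == "__main__":
    # Hypothetical window read counts: mean 3, dispersion 2, 30% structural zeros.
    for k in range(4):
        print(k, zinb_pmf(k, mu=3.0, size=2.0, pi_zero=0.3))
```

    The extra mass at zero is what lets such a model absorb windows with no reads without distorting the count distribution fitted to the remaining windows.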

  12. Toward an Integrated BAC Library Resource for Genome Sequencing and Analysis

    SciTech Connect

    Simon, M. I.; Kim, U.-J.

    2002-02-26

    We developed a great deal of expertise in building large BAC libraries from a variety of DNA sources including humans, mice, corn, microorganisms, worms, and Arabidopsis. We greatly improved the technology for screening these libraries rapidly and for selecting appropriate BACs and mapping BACs to develop large overlapping contigs. We became involved in supplying BACs and BAC contigs to a variety of sequencing and mapping projects and we began to collaborate with Drs. Adams and Venter at TIGR and with Dr. Leroy Hood and his group at University of Washington to provide BACs for end sequencing and for mapping and sequencing of large fragments of chromosome 16. Together with Dr. Ian Dunham and his co-workers at the Sanger Center we completed the mapping and they completed the sequencing of the first human chromosome, chromosome 22. This was published in Nature in 1999 and our BAC contigs made a major contribution to this sequencing effort. Drs. Shizuya and Ding invented an automated highly accurate BAC mapping technique. We also developed long-term collaborations with Dr. Uli Weier at UCSF in the design of BAC probes for characterization of human tumors and specific chromosome deletions and breakpoints. Finally the contribution of our work to the human genome project has been recognized in the publication both by the international consortium and the NIH of a draft sequence of the human genome in Nature last year. Dr. Shizuya was acknowledged in the authorship of that landmark paper. Dr. Simon was also an author on the Venter/Adams Celera project sequencing the human genome that was published in Science last year.

  13. Functional Screening of Metagenome and Genome Libraries for Detection of Novel Flavonoid-Modifying Enzymes

    PubMed Central

    Rabausch, U.; Juergensen, J.; Ilmberger, N.; Böhnke, S.; Fischer, S.; Schubach, B.; Schulte, M.

    2013-01-01

    The functional detection of novel enzymes other than hydrolases from metagenomes is limited since only a very few reliable screening procedures are available that allow the rapid screening of large clone libraries. For the discovery of flavonoid-modifying enzymes in genome and metagenome clone libraries, we have developed a new screening system based on high-performance thin-layer chromatography (HPTLC). This metagenome extract thin-layer chromatography analysis (META) allows the rapid detection of glycosyltransferase (GT) and also other flavonoid-modifying activities. The developed screening method is highly sensitive, and an amount of 4 ng of modified flavonoid molecules can be detected. This novel technology was validated against a control library of 1,920 fosmid clones generated from a single Bacillus cereus isolate and then used to analyze more than 38,000 clones derived from two different metagenomic preparations. Thereby we identified two novel UDP glycosyltransferase (UGT) genes. The metagenome-derived gtfC gene encoded a 52-kDa protein, and the deduced amino acid sequence was weakly similar to sequences of putative UGTs from Fibrisoma and Dyadobacter. GtfC mediated the transfer of different hexose moieties and exhibited high activities on flavones, flavonols, flavanones, and stilbenes and also accepted isoflavones and chalcones. From the control library we identified a novel macroside glycosyltransferase (MGT) with a calculated molecular mass of 46 kDa. The deduced amino acid sequence was highly similar to sequences of MGTs from Bacillus thuringiensis. Recombinant MgtB transferred the sugar residue from UDP-glucose effectively to flavones, flavonols, isoflavones, and flavanones. Moreover, MgtB exhibited high activity on larger flavonoid molecules such as tiliroside. PMID:23686272

  14. Ligation Bias in Illumina Next-Generation DNA Libraries: Implications for Sequencing Ancient Genomes

    PubMed Central

    Seguin-Orlando, Andaine; Schubert, Mikkel; Clary, Joel; Stagegaard, Julia; Alberdi, Maria T.; Prado, José Luis; Prieto, Alfredo; Willerslev, Eske; Orlando, Ludovic

    2013-01-01

    Ancient DNA extracts consist of a mixture of endogenous molecules and contaminant DNA templates, often originating from environmental microbes. These two populations of templates exhibit different chemical characteristics, with the former showing depurination and cytosine deamination by-products, resulting from post-mortem DNA damage. Such chemical modifications can interfere with the molecular tools used for building second-generation DNA libraries, and limit our ability to fully characterize the true complexity of ancient DNA extracts. In this study, we first use fresh DNA extracts to demonstrate that library preparation based on adapter ligation at AT-overhangs is biased against DNA templates starting with thymine residues, in contrast to blunt-end adapter ligation. We observe the same bias on fresh DNA extracts sheared on Bioruptor, Covaris and nebulizers. This contradicts previous reports suggesting that this bias could originate from the methods used for shearing DNA. This also suggests that AT-overhang adapter ligation efficiency is affected in a sequence-dependent manner and results in an uneven representation of different genomic contexts. We then show how this bias could affect the base composition of ancient DNA libraries prepared following AT-overhang ligation, mainly by limiting the ability to ligate DNA templates starting with thymines and therefore deaminated cytosines. This results in particular nucleotide misincorporation damage patterns, deviating from the signature generally expected for authenticating ancient sequence data. Consequently, we show that models adequate for estimating post-mortem DNA damage levels must be robust to the molecular tools used for building ancient DNA libraries. PMID:24205269

  15. Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics

    PubMed Central

    Wang, Wenming; Tanurdzic, Milos; Luo, Meizhong; Sisneros, Nicholas; Kim, Hye Ran; Weng, Jing-Ke; Kudrna, Dave; Mueller, Christopher; Arumuganathan, K; Carlson, John; Chapple, Clint; de Pamphilis, Claude; Mandoli, Dina; Tomkins, Jeff; Wing, Rod A; Banks, Jo Ann

    2005-01-01

    Background The lycophytes are an ancient lineage of vascular plants that diverged from the seed plant lineage about 400 Myr ago. Although the lycophytes occupy an important phylogenetic position for understanding the evolution of plants and their genomes, no genomic resources exist for this group of plants. Results Here we describe the construction of a large-insert bacterial artificial chromosome (BAC) library from the lycophyte Selaginella moellendorffii. Based on cell flow cytometry, this species has the smallest genome size among the different lycophytes tested, including Huperzia lucidula, Diphasiastrum digitatum, Isoetes engelmannii and S. kraussiana. The arrayed BAC library consists of 9126 clones; the average insert size is estimated to be 122 kb. Inserts of chloroplast origin account for 2.3% of the clones. The BAC library contains an estimated ten genome-equivalents based on DNA hybridizations using five single-copy and two duplicated S. moellendorffii genes as probes. Conclusion The S. moellendorffii BAC library, the first to be constructed from a lycophyte, will be useful to the scientific community as a resource for comparative plant genomics and evolution. PMID:15955246

  16. Chromosome region-specific libraries for human genome analysis. Final progress report, 1 March 1991--28 February 1994

    SciTech Connect

    Kao, F.T.

    1994-04-01

    The objectives of this grant proposal include (1) development of a chromosome microdissection and PCR-mediated microcloning technology, and (2) application of this microtechnology to the construction of region-specific libraries for human genome analysis. During this grant period, the authors have successfully developed this microtechnology and have applied it to the construction of microdissection libraries for the following chromosome regions: a whole chromosome 21 (21E), 2 region-specific libraries for the long arm of chromosome 2, 2q35-q37 (2Q1) and 2q33-q35 (2Q2), and 4 region-specific libraries for the entire short arm of chromosome 2, 2p23-p25 (2P1), 2p21-p23 (2P2), 2p14-p16 (2P3) and 2p11-p13 (2P4). In addition, 20--40 unique sequence microclones have been isolated and characterized for genomic studies. These region-specific libraries and the single-copy microclones from the library have been used as valuable resources for (1) isolating microsatellite probes in linkage analysis to further refine the disease locus; (2) isolating corresponding clones with large inserts, e.g. YAC, BAC, P1, cosmid and phage, to facilitate construction of contigs for high resolution physical mapping; and (3) isolating region-specific cDNA clones for use as candidate genes. These libraries are being deposited in the American Type Culture Collection (ATCC) for general distribution.

  17. Draft Genome Sequence of Paenibacillus sp. Strain DMB5, Acclimatized and Enriched for Catabolizing Anthropogenic Compounds

    PubMed Central

    Johnson, Jenny; Shah, Binal; Jain, Kunal; Parmar, Nidhi; Hinsu, Ankit; Patel, Namrata

    2016-01-01

    Here, we present the draft genome sequence of Paenibacillus sp. strain DMB5, isolated from polluted sediments of the Kharicut Canal, Vatva, India, having a genome size of 7.5 Mbp and 7,077 coding sequences. The genome of this dye-degrading bacterium provides valuable information on the microbe-mediated biodegradation of anthropogenic compounds. PMID:27034501

  18. Construction of random sheared fosmid library from Chinese cabbage and its use for Brassica rapa genome sequencing project.

    PubMed

    Park, Tae-Ho; Park, Beom-Seok; Kim, Jin-A; Hong, Joon Ki; Jin, Mina; Seol, Young-Joo; Mun, Jeong-Hwan

    2011-01-01

    As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage groups R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expected to cover approximately 90% of the euchromatin of both chromosomes. As the project progresses, BAC selection for sequence extension becomes more limited because BAC libraries are restriction enzyme-specific. To support the project, a random sheared fosmid library was constructed. The library consists of 97536 clones with an average insert size of approximately 40 kb, corresponding to seven genome equivalents, assuming a Chinese cabbage genome size of 550 Mb. The library was screened with primers designed at the ends of sequences at nine scaffold gaps where BAC clones cannot be selected to extend the physical contigs. The selected positive clones were end-sequenced to check the overlap between the fosmid clones and the adjacent BAC clones. Nine fosmid clones were selected and fully sequenced. The sequences revealed two completed gap fillings and seven sequence extensions, which can be used for further selection of BAC clones, confirming that the fosmid library will facilitate the sequence completion of B. rapa. PMID:21338952
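
    The "seven genome equivalents" figure follows directly from the numbers quoted in the abstract (97536 clones, roughly 40 kb inserts, an assumed 550 Mb genome). The sketch below reproduces that arithmetic and, as an addition not made in the abstract, applies the standard Clarke-Carbon estimate P = 1 - (1 - f)^N for the chance that a given locus is represented at least once, under the idealised assumption of uniformly random, equally clonable inserts. The function names are hypothetical.

```python
import math


def genome_equivalents(n_clones: int, insert_kb: float, genome_mb: float) -> float:
    """Total cloned DNA divided by genome size (both converted to kb)."""
    return n_clones * insert_kb / (genome_mb * 1000.0)


def prob_locus_present(n_clones: int, insert_kb: float, genome_mb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N, with f = insert size / genome size.

    Assumes inserts are sampled uniformly at random from the genome.
    """
    f = insert_kb / (genome_mb * 1000.0)
    # -expm1(N * log1p(-f)) equals 1 - (1 - f)**N, computed without cancellation.
    return -math.expm1(n_clones * math.log1p(-f))


if __name__ == "__main__":
    cov = genome_equivalents(97536, 40.0, 550.0)
    p = prob_locus_present(97536, 40.0, 550.0)
    print(f"genome equivalents: {cov:.2f}")
    print(f"P(locus present at least once): {p:.4f}")
```

    Under these assumptions the library holds about 7.1 genome equivalents; real libraries deviate from the idealised probability wherever cloning efficiency varies along the genome.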

  19. From raw materials to validated system: the construction of a genomic library and microarray to interpret systemic perturbations in Northern bobwhite

    PubMed Central

    Rawat, Arun; Deng, Youping; Garcia-Reyero, Natàlia; Quinn, Michael J.; Johnson, Mark S.; Indest, Karl J.; Elasri, Mohamed O.; Perkins, Edward J.

    2010-01-01

    The limited availability of genomic tools and data for nonmodel species impedes computational and systems biology approaches in nonmodel organisms. Here we describe the development, functional annotation, and utilization of genomic tools for the avian wildlife species Northern bobwhite (Colinus virginianus) to determine the molecular impacts of exposure to 2,6-dinitrotoluene (2,6-DNT), a field contaminant of military concern. Massively parallel pyrosequencing of a normalized multitissue library of Northern bobwhite cDNAs yielded 71,384 unique transcripts that were annotated with gene ontology (GO), pathway information, and protein domain analysis. Comparative genome analyses with model organisms revealed functional homologies in 8,825 unique Northern bobwhite genes that are orthologous to 48% of Gallus gallus protein-coding genes. Pathway analysis and GO enrichment of genes differentially expressed in livers of birds exposed for 60 days (d) to 10 and 60 mg/kg/d 2,6-DNT revealed several impacts validated by RT-qPCR including: prostaglandin pathway-mediated inflammation, increased expression of a heme synthesis pathway in response to anemia, and a shift in energy metabolism toward protein catabolism via inhibition of control points for glucose and lipid metabolic pathways, PCK1 and PPARGC1, respectively. This research effort provides the first comprehensive annotated gene library for Northern bobwhite. Transcript expression analysis provided insights into the metabolic perturbations underlying several observed toxicological phenotypes in a 2,6-DNT exposure case study. Furthermore, the systemic impact of dinitrotoluenes on liver function appears conserved across species as PPAR signaling is similarly affected in fathead minnow liver tissue after exposure to 2,4-DNT. PMID:20406850

  20. Library preparation methodology can influence genomic and functional predictions in human microbiome research

    PubMed Central

    Jones, Marcus B.; Highlander, Sarah K.; Anderson, Ericka L.; Li, Weizhong; Dayrit, Mark; Klitgord, Niels; Fabani, Martin M.; Seguritan, Victor; Green, Jessica; Pride, David T.; Yooseph, Shibu; Biggs, William; Nelson, Karen E.; Venter, J. Craig

    2015-01-01

    Observations from human microbiome studies are often conflicting or inconclusive. Many factors likely contribute to these issues including small cohort sizes, sample collection, and handling and processing differences. The field of microbiome research is moving from 16S rDNA gene sequencing to a more comprehensive genomic and functional representation through whole-genome sequencing (WGS) of complete communities. Here we performed quantitative and qualitative analyses comparing WGS metagenomic data from human stool specimens using the Illumina Nextera XT and Illumina TruSeq DNA PCR-free kits, and the KAPA Biosystems Hyper Prep PCR and PCR-free systems. Significant differences in taxonomy are observed among the four different next-generation sequencing library preparations using a DNA mock community and a cell control of known concentration. We also revealed biases in error profiles, duplication rates, and loss of reads representing organisms that have a high %G+C content that can significantly impact results. As with all methods, the use of benchmarking controls has revealed critical differences among methods that impact sequencing results and later would impact study interpretation. We recommend that the community adopt PCR-free–based approaches to reduce PCR bias that affects calculations of abundance and to improve assemblies for accurate taxonomic assignment. Furthermore, the inclusion of a known-input cell spike-in control provides accurate quantitation of organisms in clinical samples. PMID:26512100

  1. Draft Genome Sequence of “Candidatus Methanomethylophilus” sp. 1R26, Enriched from Bovine Rumen, a Methanogenic Archaeon Belonging to the Methanomassiliicoccales Order

    PubMed Central

    Højberg, Ole; Urich, Tim

    2016-01-01

    Here, we present the draft genome of “Candidatus Methanomethylophilus” sp. 1R26, a member of the newly described Methanomassiliicoccales order of Euryarchaeota. The enrichment culture was established from bovine rumen contents and produced methane from trimethylamine and methanol. The draft genome contains genes for methanogenesis from methylated compounds. PMID:26893425

  2. Draft Genome Sequence of "Candidatus Methanomethylophilus" sp. 1R26, Enriched from Bovine Rumen, a Methanogenic Archaeon Belonging to the Methanomassiliicoccales Order.

    PubMed

    Noel, Samantha Joan; Højberg, Ole; Urich, Tim; Poulsen, Morten

    2016-01-01

    Here, we present the draft genome of "Candidatus Methanomethylophilus" sp. 1R26, a member of the newly described Methanomassiliicoccales order of Euryarchaeota. The enrichment culture was established from bovine rumen contents and produced methane from trimethylamine and methanol. The draft genome contains genes for methanogenesis from methylated compounds. PMID:26893425

  3. Complete Genome Sequence of Pseudoxanthomonas suwonensis Strain J1, a Cellulose-Degrading Bacterium Isolated from Leaf- and Wood-Enriched Soil.

    PubMed

    Hou, Liyuan; Jiang, Jingwei; Xu, Zhihui; Zhou, Yun; Leung, Frederick Chi-Ching

    2015-01-01

    We report here the complete genome sequence of the cellulose-degrading bacterium Pseudoxanthomonas suwonensis strain J1, isolated from soil enriched with rotten leaves and wood from the Zhong Mountain Scenic Area in Nanjing, China. This complete genome may contribute to further investigation of plant biomass degradation. PMID:26067962

  4. Genome engineering uncovers 54 evolutionarily conserved and testis-enriched genes that are not required for male fertility in mice

    PubMed Central

    Miyata, Haruhiko; Castaneda, Julio M.; Fujihara, Yoshitaka; Yu, Zhifeng; Archambeault, Denise R.; Isotani, Ayako; Kiyozumi, Daiji; Kriseman, Maya L.; Mashiko, Daisuke; Matsumura, Takafumi; Matzuk, Ryan M.; Mori, Masashi; Noda, Taichi; Oji, Asami; Okabe, Masaru; Prunskaite-Hyyrylainen, Renata; Ramirez-Solis, Ramiro; Satouh, Yuhkoh; Zhang, Qian; Ikawa, Masahito; Matzuk, Martin M.

    2016-01-01

    Gene-expression analysis studies from Schultz et al. estimate that more than 2,300 genes in the mouse genome are expressed predominantly in the male germ line. As of their 2003 publication [Schultz N, Hamra FK, Garbers DL (2003) Proc Natl Acad Sci USA 100(21):12201–12206], the functions of the majority of these testis-enriched genes during spermatogenesis and fertilization were largely unknown. Since the study by Schultz et al., functional analyses of hundreds of reproductive-tract–enriched genes have been performed, but there remain many testis-enriched genes whose relevance to reproduction remains unexplored or unreported. Historically, a gene knockout is the “gold standard” to determine whether a gene’s function is essential in vivo. Although knockout mice without apparent phenotypes are rarely published, these knockout mouse lines and their phenotypic information need to be shared to prevent redundant experiments. Herein, we used bioinformatic and experimental approaches to uncover mouse testis-enriched genes that are evolutionarily conserved in humans. We then used gene-disruption approaches, including Knockout Mouse Project resources (targeting vectors and mice) and CRISPR/Cas9, to mutate and quickly analyze the fertility of these mutant mice. We discovered that 54 mutant mouse lines were fertile. Thus, despite evolutionary conservation of these genes in vertebrates and in some cases in all eukaryotes, our results indicate that these genes are not individually essential for male mouse fertility. Our phenotypic data are highly relevant in this fiscally tight funding period and postgenomic age when large numbers of genomes are being analyzed for disease association, and will prevent unnecessary expenditures and duplications of effort by others. PMID:27357688

  5. Robust physical methods that enrich genomic regions identical by descent for linkage studies: confirmation of a locus for osteogenesis imperfecta

    PubMed Central

    Brooks, Peter; Marcaillou, Charles; Vanpeene, Maud; Saraiva, Jean-Paul; Stockholm, Daniel; Francke, Stephan; Favis, Reyna; Cohen, Nadine; Rousseau, Francis; Tores, Frédéric; Lindenbaum, Pierre; Hager, Jörg; Philippi, Anne

    2009-01-01

    Background The monogenic disease osteogenesis imperfecta (OI) is due to single mutations in either of the collagen genes Col1A1 or Col1A2, but within the same family a given mutation is accompanied by a wide range of disease severity. Although this phenotypic variability implies the existence of modifier gene variants, genome wide scanning of DNA from OI patients has not been reported. Promising genome wide marker-independent physical methods for identifying disease-related loci have lacked robustness for widespread applicability. Therefore we sought to improve these methods and demonstrate their performance to identify known and novel loci relevant to OI. Results We have improved methods for enriching regions of identity-by-descent (IBD) shared between related, afflicted individuals. The extent of enrichment exceeds 10- to 50-fold for some loci. The efficiency of the new process is shown by confirmation of the identification of the Col1A2 locus in osteogenesis imperfecta patients from Amish families. Moreover the analysis revealed additional candidate linkage loci that may harbour modifier genes for OI; a locus on chromosome 1q includes COX-2, a gene implicated in osteogenesis. Conclusion Technology for physical enrichment of IBD loci is now robust and applicable for finding genes for monogenic diseases and genes for complex diseases. The data support the further investigation of genetic loci other than collagen gene loci to identify genes affecting the clinical expression of osteogenesis imperfecta. The discrimination of IBD mapping will be enhanced when the IBD enrichment procedure is coupled with deep resequencing. PMID:19331686

  6. Genome engineering uncovers 54 evolutionarily conserved and testis-enriched genes that are not required for male fertility in mice.

    PubMed

    Miyata, Haruhiko; Castaneda, Julio M; Fujihara, Yoshitaka; Yu, Zhifeng; Archambeault, Denise R; Isotani, Ayako; Kiyozumi, Daiji; Kriseman, Maya L; Mashiko, Daisuke; Matsumura, Takafumi; Matzuk, Ryan M; Mori, Masashi; Noda, Taichi; Oji, Asami; Okabe, Masaru; Prunskaite-Hyyrylainen, Renata; Ramirez-Solis, Ramiro; Satouh, Yuhkoh; Zhang, Qian; Ikawa, Masahito; Matzuk, Martin M

    2016-07-12

    Gene-expression analysis studies from Schultz et al. estimate that more than 2,300 genes in the mouse genome are expressed predominantly in the male germ line. As of their 2003 publication [Schultz N, Hamra FK, Garbers DL (2003) Proc Natl Acad Sci USA 100(21):12201-12206], the functions of the majority of these testis-enriched genes during spermatogenesis and fertilization were largely unknown. Since the study by Schultz et al., functional analyses of hundreds of reproductive-tract-enriched genes have been performed, but there remain many testis-enriched genes whose relevance to reproduction remains unexplored or unreported. Historically, a gene knockout is the "gold standard" to determine whether a gene's function is essential in vivo. Although knockout mice without apparent phenotypes are rarely published, these knockout mouse lines and their phenotypic information need to be shared to prevent redundant experiments. Herein, we used bioinformatic and experimental approaches to uncover mouse testis-enriched genes that are evolutionarily conserved in humans. We then used gene-disruption approaches, including Knockout Mouse Project resources (targeting vectors and mice) and CRISPR/Cas9, to mutate and quickly analyze the fertility of these mutant mice. We discovered that 54 mutant mouse lines were fertile. Thus, despite evolutionary conservation of these genes in vertebrates and in some cases in all eukaryotes, our results indicate that these genes are not individually essential for male mouse fertility. Our phenotypic data are highly relevant in this fiscally tight funding period and postgenomic age when large numbers of genomes are being analyzed for disease association, and will prevent unnecessary expenditures and duplications of effort by others. PMID:27357688

  7. USE OF COMPETITIVE GENOMIC HYBRIDIZATION TO ENRICH FOR GENOME-SPECIFIC DIFFERENCES BETWEEN TWO CLOSELY RELATED HUMAN FECAL INDICATOR BACTERIA

    EPA Science Inventory

    Enterococci are frequently used as indicators of fecal pollution in surface waters. To accelerate the identification of Enterococcus faecalis-specific DNA sequences, we employed a comparative genomic strategy utilizing a positive selection process to compare E. faec...

  8. Rapid Virulence Annotation (RVA): Identification of virulence factors using a bacterial genome library and multiple invertebrate hosts

    PubMed Central

    Waterfield, Nicholas R.; Sanchez-Contreras, Maria; Eleftherianos, Ioannis; Dowling, Andrea; Yang, Guowei; Wilkinson, Paul; Parkhill, Julian; Thomson, Nicholas; Reynolds, Stuart E.; Bode, Helge B.; Dorus, Steven; ffrench-Constant, Richard H.

    2008-01-01

    Current sequence databases now contain numerous whole genome sequences of pathogenic bacteria. However, many of the predicted genes lack any functional annotation. We describe an assumption-free approach, Rapid Virulence Annotation (RVA), for the high-throughput parallel screening of genomic libraries against four different taxa: insects, nematodes, amoeba, and mammalian macrophages. These hosts represent different aspects of both the vertebrate and invertebrate immune system. Here, we apply RVA to the emerging human pathogen Photorhabdus asymbiotica using “gain of toxicity” assays of recombinant Escherichia coli clones. We describe a wealth of potential virulence loci and attribute biological function to several putative genomic islands, which may then be further characterized using conventional molecular techniques. The application of RVA to other pathogen genomes promises to ascribe biological function to otherwise uncharacterized virulence genes. PMID:18838673

  9. IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS

    EPA Science Inventory

    Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...

  10. How Accurate are the Extremely Small P-values Used in Genomic Research: An Evaluation of Numerical Libraries

    PubMed Central

    Bangalore, Sai Santosh; Wang, Jelai; Allison, David B.

    2009-01-01

    In the fields of genomics and high dimensional biology (HDB), massive multiple testing prompts the use of extremely small significance levels. Because tail areas of statistical distributions are needed for hypothesis testing, the accuracy of these areas is important to confidently make scientific judgments. Previous work on accuracy was primarily focused on evaluating professionally written statistical software, like SAS, on the Statistical Reference Datasets (StRD) provided by the National Institute of Standards and Technology (NIST) and on the accuracy of tail areas in statistical distributions. The goal of this paper is to provide guidance to investigators who are developing their own custom scientific software built upon numerical libraries written by others. Specifically, we evaluate the accuracy of small tail areas from cumulative distribution functions (CDF) of the Chi-square and t-distribution by comparing several open-source, free, or commercially licensed numerical libraries in Java, C, and R to widely accepted standards of comparison like ELV and DCDFLIB. In our evaluation, the C libraries and R functions are consistently accurate up to six significant digits. Amongst the evaluated Java libraries, Colt is most accurate. These languages and libraries are popular choices among programmers developing scientific software, so the results herein can be useful to programmers in choosing libraries for CDF accuracy. PMID:20161126
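
    One reason small tail areas are numerically delicate can be shown with a brief sketch. It is not taken from the paper, which evaluated Java, C and R libraries; it uses only Python's standard library and two chi-square cases with closed-form tails (1 and 2 degrees of freedom) to contrast computing the upper tail directly against subtracting a CDF from 1. The function names are assumptions made for illustration.

```python
import math


def chi2_tail_df1(x: float) -> float:
    """Upper tail of chi-square with 1 d.f., computed directly: erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_tail_df1_naive(x: float) -> float:
    """Same quantity as 1 - CDF; loses all precision once the CDF rounds to 1.0."""
    return 1.0 - math.erf(math.sqrt(x / 2.0))


def chi2_tail_df2(x: float) -> float:
    """Upper tail of chi-square with 2 d.f., computed directly: exp(-x/2)."""
    return math.exp(-x / 2.0)


def chi2_tail_df2_naive(x: float) -> float:
    """Same quantity as 1 - CDF, where CDF = 1 - exp(-x/2)."""
    cdf = 1.0 - math.exp(-x / 2.0)
    return 1.0 - cdf


if __name__ == "__main__":
    x = 100.0  # a test statistic far in the tail, as in genome-wide scans
    print("df=1 direct:", chi2_tail_df1(x), " naive:", chi2_tail_df1_naive(x))
    print("df=2 direct:", chi2_tail_df2(x), " naive:", chi2_tail_df2_naive(x))
```

    In double precision the subtraction form returns exactly zero for such statistics, whereas the direct form keeps a meaningful nonzero p-value, so it is worth checking whether a numerical library exposes the upper tail as its own function.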

  11. Towards a Library of Standard Operating Procedures (SOPs) for (meta)genomic annotation

    SciTech Connect

    Kyrpides, Nikos; Angiuoli, Samuel V.; Cochrane, Guy; Field, Dawn; Garrity, George; Gussman, Aaron; Kodira, Chinnappa D.; Klimke, William; Madupu, Ramana; Markowitz, Victor; Tatusova, Tatiana; Thomson, Nick; White, Owen

    2008-04-01

    Genome annotations describe the features of genomes and accompany sequences in genome databases. The methodologies used to generate genome annotation are diverse and typically vary amongst groups. Descriptions of the annotation procedure are helpful in interpreting genome annotation data. Standard Operating Procedures (SOPs) for genome annotation describe the processes that generate genome annotations. Some groups are currently documenting procedures but standards are lacking for structure and content of annotation SOPs. In addition, there is no central repository to store and disseminate procedures and protocols for genome annotation. We highlight the importance of SOPs for genome annotation and endorse a central online repository of SOPs.

  12. pileup.js: a JavaScript library for interactive and in-browser visualization of genomic data

    PubMed Central

    Aksoy, B. Arman; Hodes, Isaac; Perrone, Jaclyn; Hammerbacher, Jeff

    2016-01-01

    pileup.js is a new browser-based genome viewer. It is designed to facilitate the investigation of evidence for genomic variants within larger web applications. It takes advantage of recent developments in the JavaScript ecosystem to provide a modular, reliable and easily embedded library. Availability and implementation: The code and documentation for pileup.js are publicly available at https://github.com/hammerlab/pileup.js under the Apache 2.0 license. Contact: correspondence@hammerlab.org PMID:27153605

  13. Genome-Wide Analyses in Bacteria Show Small-RNA Enrichment for Long and Conserved Intergenic Regions

    PubMed Central

    Tsai, Chen-Hsun; Liao, Rick; Chou, Brendan; Palumbo, Michael

    2014-01-01

    Interest in finding small RNAs (sRNAs) in bacteria has significantly increased in recent years due to their regulatory functions. Development of high-throughput methods and more sophisticated computational algorithms has allowed rapid identification of sRNA candidates in different species. However, given their various sizes (50 to 500 nucleotides [nt]) and their potential genomic locations in the 5′ and 3′ untranslated regions as well as in intergenic regions, identification and validation of true sRNAs have been challenging. In addition, the evolution of bacterial sRNAs across different species continues to be puzzling, given that they can exert similar functions with various sequences and structures. In this study, we analyzed the enrichment patterns of sRNAs in 13 well-annotated bacterial species using existing transcriptome and experimental data. All intergenic regions were analyzed by WU-BLAST to examine conservation levels relative to species within or outside their genus. In total, more than 900 validated bacterial sRNAs and 23,000 intergenic regions were analyzed. The results indicate that sRNAs are enriched in intergenic regions, which are longer and more conserved than the average intergenic regions in the corresponding bacterial genome. We also found that sRNA-coding regions have different conservation levels relative to their flanking regions. This work provides a way to analyze how noncoding RNAs are distributed in bacterial genomes and also shows conserved features of intergenic regions that encode sRNAs. These results also provide insight into the functions of regions surrounding sRNAs and into optimization of RNA search algorithms. PMID:25313390

  14. Construction of a genomic DNA library with a TA vector and its application in cloning of the phytoene synthase gene from the cyanobacterium Spirulina platensis M-135

    NASA Astrophysics Data System (ADS)

    Yoshikazu, Kawata; Shin-Ichi, Yano; Hiroyuki, Kojima

    1998-03-01

    An efficient and simple method for constructing a genomic DNA library using a TA cloning vector is presented. It is based on the cleavage of genomic DNA by sonication and modification of fragment ends with Taq DNA polymerase, followed by ligation using a TA vector. This method was applied for cloning of the phytoene synthase gene crtB from Spirulina platensis. This method is useful when genomic DNA cannot be efficiently digested with restriction enzymes, a problem often encountered during the construction of a genomic DNA library of cyanobacteria.

  15. Physical Analysis of the Complex Rye (Secale cereale L.) Alt4 Aluminium (Aluminum) Tolerance Locus Using a Whole-Genome BAC Library of Rye cv. Blanco

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Rye is a diploid crop species with many outstanding qualities, and is also important as a source of new traits for wheat and triticale improvement. Here we describe a BAC library of rye cv. Blanco, representing a valuable resource for rye molecular genetic studies. The library provides a 6 × genome ...

  16. Genome-wide association study identifies a maternal copy-number deletion in PSG11 enriched among preeclampsia patients

    PubMed Central

    2012-01-01

    Background Specific genetic contributions for preeclampsia (PE) are currently unknown. This genome-wide association study (GWAS) aims to identify maternal single nucleotide polymorphisms (SNPs) and copy-number variants (CNVs) involved in the etiology of PE. Methods A genome-wide scan was performed on 177 PE cases (diagnosed according to National Heart, Lung and Blood Institute guidelines) and 116 normotensive controls. White female study subjects from Iowa were genotyped on Affymetrix SNP 6.0 microarrays. CNV calls made using a combination of four detection algorithms (Birdseye, Canary, PennCNV, and QuantiSNP) were merged using CNVision and screened with stringent prioritization criteria. Due to limited DNA quantities and the deleterious nature of copy-number deletions, it was decided a priori that only deletions would be selected for assay on the entire case-control dataset using quantitative real-time PCR. Results The top four SNP candidates had an allelic or genotypic p-value between 10(-5) and 10(-6); however, none surpassed the Bonferroni-corrected significance threshold. Three recurrent rare deletions meeting prioritization criteria detected in multiple cases were selected for targeted genotyping. A locus of particular interest was found showing an enrichment of case deletions in 19q13.31 (5/169 cases and 1/114 controls), which encompasses the PSG11 gene contiguous to a highly plastic genomic region. All algorithm calls for these regions were assay confirmed. Conclusions CNVs may confer risk for PE and represent interesting regions that warrant further investigation. Top SNP candidates identified from the GWAS, although not genome-wide significant, may be useful to inform future studies in PE genetics. PMID:22748001
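    The two statistical ideas in this abstract, a Bonferroni-corrected threshold and an enrichment of deletions among cases, can be sketched as follows. This is an illustration only: the abstract does not report how many SNPs were tested or which test was applied to the deletion counts, so the number of tests used here is a hypothetical placeholder and the one-sided Fisher exact test is an assumed choice. The function names are also hypothetical.

```python
from math import comb

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def fisher_one_sided(case_hits, case_total, control_hits, control_total):
    """One-sided Fisher exact p-value: P(at least case_hits of all hits fall in cases)."""
    total = case_total + control_total
    hits = case_hits + control_hits
    denom = comb(total, case_total)
    upper = min(hits, case_total)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    numer = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return numer / denom

# Hypothetical number of tested SNPs; the abstract does not report it.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Deletion counts quoted in the abstract: 5/169 cases and 1/114 controls.
p_deletion = fisher_one_sided(5, 169, 1, 114)
print(threshold, p_deletion)
```

    With a threshold of this order, p-values between 10(-5) and 10(-6) fall short of genome-wide significance, which is consistent with the abstract's statement that no SNP surpassed the corrected threshold.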

  17. Characterization of biases in phosphopeptide enrichment by Ti(4+)-immobilized metal affinity chromatography and TiO2 using a massive synthetic library and human cell digests.

    PubMed

    Matheron, Lucrece; van den Toorn, Henk; Heck, Albert J R; Mohammed, Shabaz

    2014-08-19

    Outcomes of comparative evaluations of enrichment methods for phosphopeptides depend highly on the experimental protocols used, the operator, the source of the affinity matrix, and the samples analyzed. Here, we attempt such a comparative study exploring a very large synthetic library containing thousands of serine, threonine, and tyrosine phosphorylated peptides, being present in roughly equal abundance, along with their nonphosphorylated counterparts, and use an optimized protocol for enrichment by TiO2 and Ti(4+)-immobilized metal affinity chromatography (IMAC) by a single operator. Surprisingly, our data reveal that there are minimal differences between enrichment of phosphopeptides by TiO2 and Ti(4+)-IMAC when considering biochemical and biophysical parameters such as peptide length, sequence surrounding the site, hydrophobicity, and nature of the amino acid phosphorylated. Similar results were obtained when evaluating a tryptic digest of a cellular lysate, representing a more natural source of phosphopeptides. All the data presented are available via ProteomeXchange with the identifier PXD000759. PMID:25068997

  18. Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome.

    PubMed

    Gu, Zhuoya; Jin, Ke; Crabbe, M James C; Zhang, Yang; Liu, Xiaolin; Huang, Yanyan; Hua, Mengyi; Nan, Peng; Zhang, Zhaolei; Zhong, Yang

    2016-04-01

    Transposable elements (TEs) have for some time no longer been regarded simply as "junk DNA", owing to continual discoveries of their multifunctional roles in eukaryote genomes. As one of the most important and abundant TEs still active in the human genome, Alu, a SINE family, has demonstrated indispensable regulatory functions at the sequence level, but its spatial roles are still unclear. Technologies based on 3C (chromosome conformation capture) have revealed the three-dimensional structure of chromatin and make it possible to study distal chromatin interactions in the genome. To identify the role TEs play in distal regulation in the human genome, we compiled newly released Hi-C data, TE annotations, histone marker annotations, and genome-wide methylation data for correlation analysis, and found that the density of Alu elements showed a strong positive correlation with the level of chromatin interactions (hESC: r = 0.9, P < 2.2 × 10(-16); IMR90 fibroblasts: r = 0.94, P < 2.2 × 10(-16)) and also a significant positive correlation with some remote functional DNA elements like enhancers and promoters (Enhancer: hESC: r = 0.997, P = 2.3 × 10(-4); IMR90: r = 0.934, P = 2 × 10(-2); Promoter: hESC: r = 0.995, P = 3.8 × 10(-4); IMR90: r = 0.996, P = 3.2 × 10(-4)). Further investigation involving GC content and methylation status showed that the GC content of Alu-covered sequences shared a similar pattern with that of the overall sequence, suggesting that Alu elements also function as providers of GC nucleotides and CpG sites. In all, our results suggest that the Alu elements may act as an alternative parameter to evaluate the Hi-C data, which is confirmed by the correlation analysis of Alu elements and histone markers. Moreover, the GC-rich Alu sequence can bring high GC content and methylation flexibility to the regions with more distal chromatin contact, regulating the transcription of tissue-specific genes. PMID:26861146

  19. Comprehensive profiling of retroviral integration sites using target enrichment methods from historical koala samples without an assembled reference genome.

    PubMed

    Cui, Pin; Löber, Ulrike; Alquezar-Planas, David E; Ishida, Yasuko; Courtiol, Alexandre; Timms, Peter; Johnson, Rebecca N; Lenz, Dorina; Helgen, Kristofer M; Roca, Alfred L; Hartman, Stefanie; Greenwood, Alex D

    2016-01-01

    Background. Retroviral integration into the host germline results in permanent viral colonization of vertebrate genomes. The koala retrovirus (KoRV) is currently invading the germline of the koala (Phascolarctos cinereus) and provides a unique opportunity for studying retroviral endogenization. Previous analysis of KoRV integration patterns in modern koalas demonstrates that they share integration sites primarily if they are related, indicating that the process is currently driven by vertical transmission rather than infection. However, due to methodological challenges, KoRV integrations have not been comprehensively characterized. Results. To overcome these challenges, we applied and compared three target enrichment techniques coupled with next generation sequencing (NGS) and a newly customized sequence-clustering based computational pipeline to determine the integration sites for 10 museum Queensland and New South Wales (NSW) koala samples collected between the 1870s and late 1980s. A secondary aim of this study sought to identify common integration sites across modern and historical specimens by comparing our dataset to previously published studies. Several million sequences were processed, and the KoRV integration sites in each koala were characterized. Conclusions. Although the three enrichment methods each exhibited bias in integration site retrieval, a combination of two methods, Primer Extension Capture and hybridization capture, is recommended for future studies on historical samples. Moreover, identification of integration sites shows that the proportion of integration sites shared between any two koalas is quite small. PMID:27069793

  20. Comprehensive profiling of retroviral integration sites using target enrichment methods from historical koala samples without an assembled reference genome

    PubMed Central

    Alquezar-Planas, David E.; Ishida, Yasuko; Courtiol, Alexandre; Timms, Peter; Johnson, Rebecca N.; Lenz, Dorina; Helgen, Kristofer M.; Roca, Alfred L.; Hartman, Stefanie

    2016-01-01

    Background. Retroviral integration into the host germline results in permanent viral colonization of vertebrate genomes. The koala retrovirus (KoRV) is currently invading the germline of the koala (Phascolarctos cinereus) and provides a unique opportunity for studying retroviral endogenization. Previous analysis of KoRV integration patterns in modern koalas demonstrates that they share integration sites primarily if they are related, indicating that the process is currently driven by vertical transmission rather than infection. However, due to methodological challenges, KoRV integrations have not been comprehensively characterized. Results. To overcome these challenges, we applied and compared three target enrichment techniques coupled with next generation sequencing (NGS) and a newly customized sequence-clustering based computational pipeline to determine the integration sites for 10 museum Queensland and New South Wales (NSW) koala samples collected between the 1870s and late 1980s. A secondary aim of this study sought to identify common integration sites across modern and historical specimens by comparing our dataset to previously published studies. Several million sequences were processed, and the KoRV integration sites in each koala were characterized. Conclusions. Although the three enrichment methods each exhibited bias in integration site retrieval, a combination of two methods, Primer Extension Capture and hybridization capture, is recommended for future studies on historical samples. Moreover, identification of integration sites shows that the proportion of integration sites shared between any two koalas is quite small. PMID:27069793

  1. Methylation quantitative trait loci in the developing brain and their enrichment in schizophrenia-associated genomic regions

    PubMed Central

    Hannon, Eilis; Spiers, Helen; Viana, Joana; Pidsley, Ruth; Burrage, Joe; Murphy, Therese M; Troakes, Claire; Turecki, Gustavo; O’Donovan, Michael C.; Schalkwyk, Leonard C.; Bray, Nicholas J.; Mill, Jonathan

    2015-01-01

    We characterized DNA methylation quantitative trait loci (mQTLs) in a large collection (n=166) of human fetal brain samples spanning 56–166 days post-conception, identifying >16,000 fetal brain mQTLs. Fetal brain mQTLs are primarily cis-acting, enriched in regulatory chromatin domains and transcription factor binding sites, and show significant overlap with genetic variants also associated with gene expression in the brain. Using tissue from three distinct regions of the adult brain (prefrontal cortex, striatum and cerebellum) we show that most fetal brain mQTLs are developmentally stable, although a subset is characterized by fetal-specific effects. We show that fetal brain mQTLs are enriched amongst risk loci identified in a recent large-scale genome-wide association study (GWAS) of schizophrenia, a severe psychiatric disorder with a hypothesized neurodevelopmental component. Finally, we demonstrate how mQTLs can be used to refine GWAS loci through the identification of discrete sites of variable fetal brain methylation associated with schizophrenia risk variants. PMID:26619357

  2. Association Signals Unveiled by a Comprehensive Gene Set Enrichment Analysis of Dental Caries Genome-Wide Association Studies

    PubMed Central

    Cuenco, Karen T.; Zeng, Zhen; Feingold, Eleanor; Marazita, Mary L.; Wang, Lily; Zhao, Zhongming

    2013-01-01

    Gene set-based analysis of genome-wide association study (GWAS) data has recently emerged as a useful approach to examine the joint effects of multiple risk loci in complex human diseases or phenotypes. Dental caries is a common, chronic, and complex disease leading to a decrease in quality of life worldwide. In this study, we applied the approaches of gene set enrichment analysis to a major dental caries GWAS dataset, which consists of 537 cases and 605 controls. Using four complementary gene set analysis methods, we analyzed 1331 Gene Ontology (GO) terms collected from the Molecular Signatures Database (MSigDB). Setting false discovery rate (FDR) threshold as 0.05, we identified 13 significantly associated GO terms. Additionally, 17 terms were further included as marginally associated because they were top ranked by each method, although their FDR is higher than 0.05. In total, we identified 30 promising GO terms, including ‘Sphingoid metabolic process,’ ‘Ubiquitin protein ligase activity,’ ‘Regulation of cytokine secretion,’ and ‘Ceramide metabolic process.’ These GO terms encompass broad functions that potentially interact and contribute to the oral immune response related to caries development, which have not been reported in the standard single marker based analysis. Collectively, our gene set enrichment analysis provided complementary insights into the molecular mechanisms and polygenic interactions in dental caries, revealing promising association signals that could not be detected through single marker analysis of GWAS data. PMID:23967329
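    The FDR cutoff of 0.05 mentioned above can be illustrated with a short sketch. The abstract does not say which FDR procedure the authors used; the Benjamini-Hochberg step-up procedure shown here is an assumed, commonly used choice, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= k * fdr / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for four gene sets, not data from the study.
example = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], fdr=0.05)
print(example)
```

    Gene sets that narrowly miss such a cutoff correspond to what the abstract calls marginally associated terms.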

  3. Differential representation of sunflower ESTs in enriched organ-specific cDNA libraries in a small scale sequencing project

    PubMed Central

    Fernández, Paula; Paniego, Norma; Lew, Sergio; Hopp, H Esteban; Heinz, Ruth A

    2003-01-01

    Background Subtractive hybridization methods are valuable tools for identifying differentially regulated genes in a given tissue, avoiding redundant sequencing of clones representing the same expressed genes, maximizing detection of low-abundance transcripts and thus affecting the efficiency and cost effectiveness of small-scale cDNA sequencing projects aimed at the specific identification of useful genes for breeding purposes. The objective of this work is to evaluate alternative strategies to high-throughput sequencing projects for the identification of novel genes differentially expressed in sunflower as a source of organ-specific genetic markers that can be functionally associated with important traits. Results Differential organ-specific ESTs were generated from leaf, stem, root and flower bud at two developmental stages (R1 and R4). The use of different sources of RNA as tester and driver cDNA for the construction of differential libraries was evaluated as a tool for detection of rare or low-abundance transcripts. Organ-specificity ranged from 75 to 100% of non-redundant sequences in the different cDNA libraries. Sequence redundancy varied according to the target and driver cDNA used in each case. The R4 flower cDNA library was the least redundant library with 62% of unique sequences. Out of a total of 919 sequences that were edited and annotated, 318 were non-redundant sequences. Comparison against sequences in public databases showed that 60% of non-redundant sequences showed significant similarity to known sequences. The number of predicted novel genes varied among the different cDNA libraries, ranging from 56% in the R4 flower to 16% in the R1 flower bud library. Comparison with sunflower ESTs on public databases showed that 197 of the non-redundant sequences (60%) did not exhibit significant similarity to previously reported sunflower ESTs. This approach helped to successfully isolate a significant number of newly reported sequences putatively related to responses

  4. DNA shuttling between plasmid vectors and a genome vector: systematic conversion and preservation of DNA libraries using the Bacillus subtilis genome (BGM) vector.

    PubMed

    Kaneko, Shinya; Akioka, Manami; Tsuge, Kenji; Itaya, Mitsuhiro

    2005-06-24

    The combined use of the contemporary vector systems, the bacterial artificial chromosome (BAC) vector and the Bacillus subtilis genome (BGM) vector, makes possible the handling of giant-length DNA (above 100 kb). Our newly constructed BGM vector efficiently integrated DNA prepared in the BAC vector. A BAC library comprised of 18 independent clones prepared from mitochondrial DNA (mtDNA) of Arabidopsis thaliana was converted to a parallel BGM library using the new BGM vector. The effectiveness of the combined use of the vector systems was confirmed by the stable recovery of all 18 DNAs as BAC clones from the respective BGM clones. We show that DNA in BGM was stably preserved at room temperature after spore formation of the host B. subtilis. Rapid and stable shuttling between Escherichia coli and the B. subtilis host, combined with spore-mediated DNA storage, may facilitate the long-term and low-cost preservation and the transportation of DNA resources. PMID:15913652

  5. Adventures in the enormous: a 1.8 million clone BAC library for the 21.7 Gb genome of loblolly pine.

    PubMed

    Magbanua, Zenaida V; Ozkan, Seval; Bartlett, Benjamin D; Chouvarine, Philippe; Saski, Christopher A; Liston, Aaron; Cronn, Richard C; Nelson, C Dana; Peterson, Daniel G

    2011-01-01

    Loblolly pine (LP; Pinus taeda L.) is the most economically important tree in the U.S. and a cornerstone species in southeastern forests. However, genomics research on LP and other conifers has lagged behind studies on flowering plants due, in part, to the large size of conifer genomes. As a means to accelerate conifer genome research, we constructed a BAC library for the LP genotype 7-56. The LP BAC library consists of 1,824,768 individually-archived clones making it the largest single BAC library constructed to date, has a mean insert size of 96 kb, and affords 7.6X coverage of the 21.7 Gb LP genome. To demonstrate the efficacy of the library in gene isolation, we screened macroarrays with overgos designed from a pine EST anchored on LP chromosome 10. A positive BAC was sequenced and found to contain the expected full-length target gene, several gene-like regions, and both known and novel repeats. Macroarray analysis using the retrotransposon IFG-7 (the most abundant repeat in the sequenced BAC) as a probe indicates that IFG-7 is found in roughly 210,557 copies and constitutes about 5.8% or 1.26 Gb of LP nuclear DNA; this DNA quantity is eight times the Arabidopsis genome. In addition to its use in genome characterization and gene isolation as demonstrated herein, the BAC library should hasten whole genome sequencing of LP via next-generation sequencing strategies/technologies and facilitate improvement of trees through molecular breeding and genetic engineering. The library and associated products are distributed by the Clemson University Genomics Institute (www.genome.clemson.edu). PMID:21283709
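    The figures in this abstract lend themselves to a quick arithmetic check. The sketch below is not from the paper; the constant and function names are hypothetical. It computes a raw coverage estimate from clone count, mean insert size, and genome size, and the amount of DNA implied by the stated IFG-7 fraction. The raw estimate comes out somewhat above the reported 7.6X, and the abstract does not state what adjustment accounts for the difference.

```python
CLONES = 1_824_768        # individually archived clones (from the abstract)
MEAN_INSERT_KB = 96       # mean insert size in kb (from the abstract)
GENOME_GB = 21.7          # loblolly pine genome size in Gb (from the abstract)
IFG7_FRACTION = 0.058     # share of nuclear DNA attributed to IFG-7 (from the abstract)

def raw_library_coverage(clones, mean_insert_kb, genome_gb):
    """Unadjusted genome coverage: total cloned DNA divided by genome size."""
    total_insert_gb = clones * mean_insert_kb / 1e6   # 1 Gb = 1e6 kb
    return total_insert_gb / genome_gb

raw_coverage = raw_library_coverage(CLONES, MEAN_INSERT_KB, GENOME_GB)
ifg7_gb = IFG7_FRACTION * GENOME_GB   # DNA attributed to IFG-7, in Gb

print(round(raw_coverage, 2), round(ifg7_gb, 2))
```

    The IFG-7 product rounds to the 1.26 Gb quoted in the abstract.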

  6. Genome-Wide Association Studies Suggest Limited Immune Gene Enrichment in Schizophrenia Compared to 5 Autoimmune Diseases.

    PubMed

    Pouget, Jennie G; Gonçalves, Vanessa F; Spain, Sarah L; Finucane, Hilary K; Raychaudhuri, Soumya; Kennedy, James L; Knight, Jo

    2016-09-01

    There has been intense debate over the immunological basis of schizophrenia, and the potential utility of adjunct immunotherapies. The major histocompatibility complex is consistently the most powerful region of association in genome-wide association studies (GWASs) of schizophrenia and has been interpreted as strong genetic evidence supporting the immune hypothesis. However, global pathway analyses provide inconsistent evidence of immune involvement in schizophrenia, and it remains unclear whether genetic data support an immune etiology per se. Here we empirically test the hypothesis that variation in immune genes contributes to schizophrenia. We show that there is no enrichment of immune loci outside of the MHC region in the largest genetic study of schizophrenia conducted to date, in contrast to 5 diseases of known immune origin. Among 108 regions of the genome previously associated with schizophrenia, we identify 6 immune candidates (DPP4, HSPD1, EGR1, CLU, ESAM, NFATC3) encoding proteins with alternative, nonimmune roles in the brain. While our findings do not refute evidence that has accumulated in support of the immune hypothesis, they suggest that genetically mediated alterations in immune function may not play a major role in schizophrenia susceptibility. Instead, there may be a role for pleiotropic effects of a small number of immune genes that also regulate brain development and plasticity. Whether immune alterations drive schizophrenia progression is an important question to be addressed by future research, especially in light of the growing interest in applying immunotherapies in schizophrenia. PMID:27242348

  7. Genome Sequence of Halomonas sp. Strain KO116, an Ionic Liquid-Tolerant Marine Bacterium Isolated from a Lignin-Enriched Seawater Microcosm.

    PubMed

    O'Dell, Kaela B; Woo, Hannah L; Utturkar, Sagar; Klingeman, Dawn; Brown, Steven D; Hazen, Terry C

    2015-01-01

    Halomonas sp. strain KO116 was isolated from Nile Delta Mediterranean Sea surface water enriched with insoluble organosolv lignin. It was further screened for growth on alkali lignin minimal salts medium agar. The strain tolerates the ionic liquid 1-ethyl-3-methylimidazolium acetate. Its complete genome sequence is presented in this report. PMID:25953187

  8. Genome Sequence of Halomonas sp. Strain KO116, an Ionic Liquid-Tolerant Marine Bacterium Isolated from a Lignin-Enriched Seawater Microcosm

    PubMed Central

    O’Dell, Kaela B.; Woo, Hannah L.; Utturkar, Sagar; Klingeman, Dawn; Brown, Steven D.

    2015-01-01

    Halomonas sp. strain KO116 was isolated from Nile Delta Mediterranean Sea surface water enriched with insoluble organosolv lignin. It was further screened for growth on alkali lignin minimal salts medium agar. The strain tolerates the ionic liquid 1-ethyl-3-methylimidazolium acetate. Its complete genome sequence is presented in this report. PMID:25953187

  9. Genome Sequence of Halomonas sp. Strain KO116, an Ionic Liquid-Tolerant Marine Bacterium Isolated from a Lignin-Enriched Seawater Microcosm

    DOE PAGESBeta

    O'Dell, Kaela; Woo, Hannah L.; Utturkar, Sagar M.; Klingeman, Dawn Marie; Brown, Steven D.; Hazen, Terry C.

    2015-05-07

    Halomonas sp. strain KO116 was isolated from Nile Delta Mediterranean Sea surface water enriched with insoluble organosolv lignin. It was further screened for growth on alkali lignin minimal salts medium agar. The strain tolerates the ionic liquid 1-ethyl-3-methylimidazolium acetate. Its complete genome sequence is presented in this report.

  10. Use of in vitro OmniPlex libraries for high-throughput comparative genomics and molecular haplotyping

    NASA Astrophysics Data System (ADS)

    Kamberov, Emmanuel; Sleptsova, Irina; Suchyta, Stephen; Bruening, Eric D.; Ziehler, William; Seward Nagel, Julie; Langmore, John P.; Makarov, Vladimir

    2002-06-01

    OmniPlex Technology is a new approach to genome amplification and targeted analysis. Initially the entire genome is reformatted into small, amplifiable molecules called Plexisomes, which represent the entire genome as an OmniPlex Library. The whole genome can be amplified en masse using universal primers; using locus-specific primers, regions as large as 50 kb can be amplified. Amplified Plexisomes can be analyzed using conventional methods such as capillary sequencing and microarray hybridization. The advantages to using OmniPlex as the 'front-end' for conventional analytical instruments are that a) the initial copy number of the analytes can be increased to achieve better signal-to-noise ratio, b) only a single priming site is used and c) up to 20 times fewer biochemical reactions and oligonucleotides are necessary to amplify a large region, compared to conventional PCR. These factors make OmniPlex more flexible, faster, and less expensive than conventional technologies. OmniPlex has been applied to targeted sequencing of human, animal, plant, and microorganism genomes. In addition, OmniPlex is inherently able to haplotype large regions of human DNA to accelerate target discovery and pharmacogenomics. OmniPlex will be a key tool for delivery of improved crops and livestock, new pharmaceutical products, and personalized medicine.

  11. Availability of birth defects and genetic disease information in public libraries -- implications for the Human Genome Project

    SciTech Connect

    Sell, S.; Gettig, E.; Mulvihill, J.J.

    1994-09-01

    In order to better educate the public about birth defects and genetic diseases/testing, access to information is critical. The public library system of the United States is extensive and serves as an invaluable resource to citizens. We surveyed reference librarians at each of 87 public libraries in Allegheny and Westmoreland Counties, Pennsylvania. The study design included a questionnaire to ascertain the genetic knowledge of reference librarians and cataloged current resources in print and via telecommunications available to the public. A high compliance rate was achieved due to the incentive of providing copies of the Alliance of Genetic Support Group Directory to those who responded to the survey along with complete sets of the forty-three March of Dimes Information Sheets currently available. Analysis of demographic data related to the age, gender, and educational background, in addition to the occurrence of personal experiences with genetic disease was ascertained. Reference librarians were chosen as the study group due to the common experience of families seeking further information from the public library after or prior to a genetic consultation. As the Human Genome Project identifies new genes for conditions, people will seek public information more frequently. The study shows that public libraries are an appropriate point of education to and for the public.

  12. Recombinant expression library of Pyrococcus furiosus constructed by high-throughput cloning: a useful tool for functional and structural genomics

    PubMed Central

    Yuan, Hui; Peng, Li; Han, Zhong; Xie, Juan-Juan; Liu, Xi-Peng

    2015-01-01

    Hyperthermophile Pyrococcus furiosus grows optimally near 100°C and is an important resource of many industrial and molecular biological enzymes. To study the structure and function of P. furiosus proteins at whole genome level, we constructed expression plasmids of each P. furiosus gene using a ligase-independent cloning method, which was based on amplifying target gene and vector by PCR using phosphorothioate-modified primers and digesting PCR products by λ exonuclease. Our cloning method had a positive clone percentage of ≥ 80% in 96-well plate cloning format. Small-scale expression experiment showed that 55 out of 80 genes were efficiently expressed in Escherichia coli Strain Rosetta 2(DE3)pLysS. In summary, this recombinant expression library of P. furiosus provides a platform for functional and structural studies, as well as developing novel industrial enzymes. Our cloning scheme is adaptable to constructing recombinant expression library of other sequenced organisms. PMID:26441878

  13. B Chromosomes of Aegilops speltoides Are Enriched in Organelle Genome-Derived Sequences

    PubMed Central

    Ruban, Alevtina; Fuchs, Jörg; Marques, André; Schubert, Veit; Soloviev, Alexander; Raskina, Olga; Badaeva, Ekaterina; Houben, Andreas

    2014-01-01

    B chromosomes (Bs) are dispensable components of the genome exhibiting non-Mendelian inheritance. Chromosome counts and flow cytometric analysis of the grass species Aegilops speltoides revealed a tissue-type specific distribution of the roughly 570 Mbp large B chromosomes. To address the question whether organelle-to-nucleus DNA transfer is a mechanism that drives the evolution of Bs, in situ hybridization was performed with labelled organellar DNA. The observed B-specific accumulation of chloroplast- and mitochondria-derived sequences suggests a reduced selection against the insertion of organellar DNA in supernumerary chromosomes. The distribution of B-localised organellar-derived sequences and other sequences differs between genotypes of different geographical origins. PMID:24587288

  14. Molecular Cloning and Characterization of a Newly Isolated Pyrethroid-Degrading Esterase Gene from a Genomic Library of Ochrobactrum anthropi YZ-1

    PubMed Central

    Song, Jinlong; Shi, Yanhua; Li, Kang; Zhao, Bin; Yan, Yanchun

    2013-01-01

    A novel pyrethroid-degrading esterase gene pytY was isolated from the genomic library of Ochrobactrum anthropi YZ-1. It possesses an open reading frame (ORF) of 897 bp. BLAST search showed that its deduced amino acid sequence shares moderate identities (30% to 46%) with most homologous esterases. Phylogenetic analysis revealed that PytY is a member of the esterase VI family. pytY showed very low sequence similarity compared with reported pyrethroid-degrading genes. PytY was expressed, purified, and characterized. Enzyme assay revealed that PytY is a broad-spectrum degrading enzyme that can degrade various pyrethroids. It is a new pyrethroid-degrading gene and enriches the available genetic resources. The kinetic constants Km and Vmax were 2.34 mmol·L⁻¹ and 56.33 nmol·min⁻¹, respectively, with lambda-cyhalothrin as substrate. PytY displayed good degrading ability and stability over a broad range of temperature and pH. The optimal temperature and pH were 35°C and 7.5. No cofactors were required for enzyme activity. The results highlighted the potential use of PytY in the elimination of pyrethroid residuals from contaminated environments. PMID:24155944
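
    The reported Km and Vmax can be turned into predicted reaction rates. The sketch below assumes the constants come from a standard Michaelis-Menten fit, which the abstract does not state explicitly; the function name and the example substrate concentrations are illustrative only.

```python
def michaelis_menten_rate(substrate_mM: float,
                          km_mM: float = 2.34,
                          vmax: float = 56.33) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S]).

    Defaults are the Km (mmol/L) and Vmax (nmol/min) quoted in the abstract;
    the result is in the same units as vmax.
    """
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_mM / (km_mM + substrate_mM)


if __name__ == "__main__":
    # Hypothetical substrate concentrations (mmol/L), chosen for illustration.
    for s in (0.5, 2.34, 10.0):
        print(f"[S] = {s:5.2f} mM -> v = {michaelis_menten_rate(s):6.2f} nmol/min")
```

    At a substrate concentration equal to Km, the predicted rate is half of Vmax, which is a quick sanity check on any fitted pair of constants.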

  15. Construction and characterization of a BAC library for the molecular dissection of a single wild beet centromere and sugar beet (Beta vulgaris) genome analysis.

    PubMed

    Gindullis, F; Dechyeva, D; Schmidt, T

    2001-10-01

    We have constructed a sugar beet bacterial artificial chromosome (BAC) library of the chromosome mutant PRO1. This Beta vulgaris mutant carries a single chromosome fragment of 6-9 Mbp that is derived from the wild beet Beta procumbens and is transmitted efficiently in meiosis and mitosis. The library consists of 50,304 clones, with an average insert size of 125 kb. Filter hybridizations revealed that approximately 3.1% of the clones contain mitochondrial or chloroplast DNA. Based on a haploid genome size of 758 Mbp, the library represents eight genome equivalents. Thus, there is a greater than 99.96% probability that any sequence of the PRO1 genome can be found in the library. Approximately 0.2% of the clones hybridized with centromeric sequences of the PRO1 minichromosome. Using the identified BAC clones in fluorescence in situ hybridization experiments with PRO1 and B. procumbens chromosome spreads, their wild-beet origin and centromeric localization were demonstrated. Comparative Southern hybridization of pulsed-field separated PRO1 DNA and BAC inserts indicate that the centromeric region of the minichromosome is represented by overlapping clones in the library. Therefore, the PRO1 BAC library provides a useful tool for the characterization of a single plant centromere and is a valuable resource for sugar beet genome analysis. PMID:11681609
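
    The coverage figures in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the commonly used Clarke-Carbon relation P = 1 - (1 - f)^N, where f is the insert size divided by the genome size and N is the number of clones; the abstract does not say which formula the authors used, and the function names are illustrative. It also ignores the roughly 3.1% of clones that carry organellar DNA.

```python
def genome_equivalents(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Total cloned DNA expressed as multiples of the haploid genome size."""
    return n_clones * insert_bp / genome_bp


def prob_sequence_present(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon estimate: chance that a given sequence is in at least one clone."""
    fraction_per_clone = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


if __name__ == "__main__":
    # Values taken from the abstract: 50,304 clones, 125 kb inserts, 758 Mbp genome.
    n, insert, genome = 50_304, 125_000, 758_000_000
    print(f"coverage:    {genome_equivalents(n, insert, genome):.2f} genome equivalents")
    print(f"probability: {prob_sequence_present(n, insert, genome):.5f}")
```

    With the abstract's numbers this gives a little over eight genome equivalents and a probability above 0.9996, in line with the values quoted.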

  16. Development of genomic resources for the narrow-leafed lupin (Lupinus angustifolius): construction of a bacterial artificial chromosome (BAC) library and BAC-end sequencing

    PubMed Central

    2011-01-01

    Background Lupinus angustifolius L, also known as narrow-leafed lupin (NLL), is becoming an important grain legume crop that is valuable for sustainable farming and is becoming recognised as a potential human health food. Recent interest is being directed at NLL to improve grain production, disease and pest management and health benefits of the grain. However, studies have been hindered by a lack of extensive genomic resources for the species. Results A NLL BAC library was constructed consisting of 111,360 clones with an average insert size of 99.7 Kbp from cv Tanjil. The library has approximately 12 × genome coverage. Both ends of 9600 randomly selected BAC clones were sequenced to generate 13985 BAC end-sequences (BESs), covering approximately 1% of the NLL genome. These BESs permitted a preliminary characterisation of the NLL genome such as organisation and composition, with the BESs having approximately 39% G:C content, 16.6% repetitive DNA and 5.4% putative gene-encoding regions. From the BESs 9966 simple sequence repeat (SSR) motifs were identified and some of these are shown to be potential markers. Conclusions The NLL BAC library and BAC-end sequences are powerful resources for genetic and genomic research on lupin. These resources will provide a robust platform for future high-resolution mapping, map-based cloning, comparative genomics and assembly of whole-genome sequencing data for the species. PMID:22014081

  17. Enriching libraries of high-aspect-ratio micro- or nanostructures by rapid, low-cost, benchtop nanofabrication.

    PubMed

    Kim, Philseok; Adorno-Martinez, Wilmer E; Khan, Mughees; Aizenberg, Joanna

    2012-02-01

    We provide a protocol for transforming the structure of an array of high-aspect-ratio (HAR) micro/nanostructures into various new geometries. Polymeric HAR arrays are replicated from a Bosch-etched silicon master pattern by soft lithography. By using various conditions, the original pattern is coated with metal, which acts as an electrode for the electrodeposition of conductive polymers, transforming the original structure into a wide range of user-defined new designs. These include scaled replicas with sub-100-nm-level control of feature sizes and complex 3D shapes such as tapered or bent columnar structures bearing hierarchical features. Gradients of patterns and shapes on a single substrate can also be produced. This benchtop fabrication protocol allows the production of customized libraries of arrays of closed-cell or isolated HAR micro/nanostructures at a very low cost within 1 week, when starting from a silicon master that otherwise would be very expensive and slow to produce using conventional fabrication techniques. PMID:22281867

  18. A bacterial artificial chromosome library for the Australian saltwater crocodile (Crocodylus porosus) and its utilization in gene isolation and genome characterization

    PubMed Central

    2009-01-01

    Background Crocodilians (Order Crocodylia) are an ancient vertebrate group of tremendous ecological, social, and evolutionary importance. They are the only extant reptilian members of Archosauria, a monophyletic group that also includes birds, dinosaurs, and pterosaurs. Consequently, crocodilian genomes represent a gateway through which the molecular evolution of avian lineages can be explored. To facilitate comparative genomics within Crocodylia and between crocodilians and other archosaurs, we have constructed a bacterial artificial chromosome (BAC) library for the Australian saltwater crocodile, Crocodylus porosus. This is the first BAC library for a crocodile and only the second BAC resource for a crocodilian. Results The C. porosus BAC library consists of 101,760 individually archived clones stored in 384-well microtiter plates. NotI digestion of random clones indicates an average insert size of 102 kb. Based on a genome size estimate of 2778 Mb, the library affords 3.7 fold (3.7×) coverage of the C. porosus genome. To investigate the utility of the library in studying sequence distribution, probes derived from CR1a and CR1b, two crocodilian CR1-like retrotransposon subfamilies, were hybridized to C. porosus macroarrays. The results indicate that there are a minimum of 20,000 CR1a/b elements in C. porosus and that their distribution throughout the genome is decidedly non-random. To demonstrate the utility of the library in gene isolation, we probed the C. porosus macroarrays with an overgo designed from a C-mos (oocyte maturation factor) partial cDNA. A BAC containing C-mos was identified and the C-mos locus was sequenced. Nucleotide and amino acid sequence alignment of the C. porosus C-mos coding sequence with avian and reptilian C-mos orthologs reveals greater sequence similarity between C. porosus and birds (specifically chicken and zebra finch) than between C. porosus and squamates (green anole). Conclusion We have demonstrated the utility of the

  19. All SNPs are not created equal: genome-wide association studies reveal a consistent pattern of enrichment among functionally annotated SNPs.

    PubMed

    Schork, Andrew J; Thompson, Wesley K; Pham, Phillip; Torkamani, Ali; Roddey, J Cooper; Sullivan, Patrick F; Kelsoe, John R; O'Donovan, Michael C; Furberg, Helena; Schork, Nicholas J; Andreassen, Ole A; Dale, Anders M

    2013-04-01

    Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery Rate (sFDR) methods to leverage genic enrichment in GWAS summary statistics data to uncover new loci likely to replicate in independent samples. Specifically, we use linkage disequilibrium-weighted annotations for each SNP in combination with nominal p-values to estimate the True Discovery Rate (TDR = 1-FDR) for strata determined by different genic categories. We show a consistent pattern of enrichment of polygenic effects in specific annotation categories across diverse phenotypes, with the greatest enrichment for SNPs tagging regulatory and coding genic elements, little enrichment in introns, and negative enrichment for intergenic SNPs. Stratified enrichment directly leads to increased TDR for a given p-value, mirrored by increased replication rates in independent samples. We show this in independent Crohn's disease GWAS, where we find a hundredfold variation in replication rate across genic categories. Applying a well-established sFDR methodology we demonstrate the utility of stratification for improving power of GWAS in complex phenotypes, with increased rejection rates from 20% in height to 300% in schizophrenia with traditional FDR and sFDR both fixed at 0.05. Our analyses demonstrate an inherent stratification among GWAS SNPs with important conceptual implications that can be leveraged by statistical methods to improve the discovery of loci. PMID:23637621
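
    The idea behind stratification can be illustrated with a toy example. The sketch below applies the ordinary Benjamini-Hochberg procedure separately within each annotation category, which is a simplification: it does not reproduce the linkage disequilibrium-weighted annotations or the TDR estimation described in the abstract, and the p-values and category labels are invented for illustration.

```python
from collections import defaultdict


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


def stratified_fdr(pvalues, strata, alpha=0.05):
    """Run Benjamini-Hochberg independently inside each stratum."""
    groups = defaultdict(list)
    for i, label in enumerate(strata):
        groups[label].append(i)
    rejected = [False] * len(pvalues)
    for indices in groups.values():
        flags = benjamini_hochberg([pvalues[i] for i in indices], alpha)
        for i, flag in zip(indices, flags):
            rejected[i] = flag
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values: an enriched "coding" stratum and a null-like "intergenic" one.
    pvalues = [0.001, 0.012, 0.03, 0.2, 0.5, 0.7, 0.9, 0.95]
    strata = ["coding"] * 3 + ["intergenic"] * 5
    print("pooled rejections:    ", sum(benjamini_hochberg(pvalues)))
    print("stratified rejections:", sum(stratified_fdr(pvalues, strata)))
```

    In this toy data the enriched stratum yields more rejections when it is tested on its own than when it is pooled with the null-like stratum, which mirrors the gain in power that the abstract attributes to stratification.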

  20. Novel Anti-Campylobacter Compounds Identified Using High Throughput Screening of a Pre-selected Enriched Small Molecules Library

    PubMed Central

    Kumar, Anand; Drozd, Mary; Pina-Mimbela, Ruby; Xu, Xiulan; Helmy, Yosra A.; Antwi, Janet; Fuchs, James R.; Nislow, Corey; Templeton, Jillian; Blackall, Patrick J.; Rajashekara, Gireesh

    2016-01-01

    Campylobacter is a leading cause of foodborne bacterial gastroenteritis worldwide and infections can be fatal. The emergence of antibiotic-resistant Campylobacter spp. necessitates the development of new antimicrobials. We identified novel anti-Campylobacter small molecule inhibitors using a high throughput growth inhibition assay. To expedite screening, we made use of a “bioactive” library of 4182 compounds that we have previously shown to be active against diverse microbes. Screening for growth inhibition of Campylobacter jejuni, identified 781 compounds that were either bactericidal or bacteriostatic at a concentration of 200 μM. Seventy nine of the bactericidal compounds were prioritized for secondary screening based on their physico-chemical properties. Based on the minimum inhibitory concentration against a diverse range of C. jejuni and a lack of effect on gut microbes, we selected 12 compounds. No resistance was observed to any of these 12 lead compounds when C. jejuni was cultured with lethal or sub-lethal concentrations suggesting that C. jejuni is less likely to develop resistance to these compounds. Top 12 compounds also possessed low cytotoxicity to human intestinal epithelial cells (Caco-2 cells) and no hemolytic activity against sheep red blood cells. Next, these 12 compounds were evaluated for ability to clear C. jejuni in vitro. A total of 10 compounds had an anti-C. jejuni effect in Caco-2 cells with some effective even at 25 μM concentrations. These novel 12 compounds belong to five established antimicrobial chemical classes; piperazines, aryl amines, piperidines, sulfonamide, and pyridazinone. Exploitation of analogs of these chemical classes may provide Campylobacter specific drugs that can be applied in both human and animal medicine. PMID:27092106

  1. CDH13 and HCRTR2 May Be Associated with Hypersomnia Symptom of Bipolar Depression: A Genome-Wide Functional Enrichment Pathway Analysis.

    PubMed

    Cho, Chul-Hyun; Lee, Heon-Jeong; Woo, Hyun Goo; Choi, Ji-Hye; Greenwood, Tiffany A; Kelsoe, John R

    2015-07-01

    Although bipolar disorder is highly heritable, the identification of specific genetic variations is limited because of the complex traits underlying the disorder. We performed a genome-wide association study of bipolar disorder using a subphenotype that shows hypersomnia symptom during a major depressive episode. We investigated a total of 2,191 cases, 1,434 controls, and 703,012 single nucleotide polymorphisms (SNPs) in the merged samples obtained from the Translational Genomics Institute and the Genetic Association Information Network. The gene emerging as the most significant by statistical analysis was rs1553441 (odds ratio=0.4093; p=1.20×10(-5); Permuted p=6.0×10(-6)). However, the 5×10(-8) threshold for statistical significance required in a genome-wide association study was not achieved. The functional enrichment pathway analysis showed significant enrichments in the adhesion, development-related, synaptic transmission-related, and cell recognition-related pathways. For further evaluation, each gene of the enriched pathways was reviewed and matched with genes that were suggested to be associated with psychiatric disorders by previous genetic studies. We found that the cadherin 13 and hypocretin (orexin) receptor 2 genes may be involved in the hypersomnia symptom during a major depressive episode of bipolar disorder. PMID:26207136

  2. CDH13 and HCRTR2 May Be Associated with Hypersomnia Symptom of Bipolar Depression: A Genome-Wide Functional Enrichment Pathway Analysis

    PubMed Central

    Cho, Chul-Hyun; Woo, Hyun Goo; Choi, Ji-Hye; Greenwood, Tiffany A.; Kelsoe, John R.

    2015-01-01

    Although bipolar disorder is highly heritable, the identification of specific genetic variations is limited because of the complex traits underlying the disorder. We performed a genome-wide association study of bipolar disorder using a subphenotype that shows hypersomnia symptom during a major depressive episode. We investigated a total of 2,191 cases, 1,434 controls, and 703,012 single nucleotide polymorphisms (SNPs) in the merged samples obtained from the Translational Genomics Institute and the Genetic Association Information Network. The gene emerging as the most significant by statistical analysis was rs1553441 (odds ratio=0.4093; p=1.20×10⁻⁵; Permuted p=6.0×10⁻⁶). However, the 5×10⁻⁸ threshold for statistical significance required in a genome-wide association study was not achieved. The functional enrichment pathway analysis showed significant enrichments in the adhesion, development-related, synaptic transmission-related, and cell recognition-related pathways. For further evaluation, each gene of the enriched pathways was reviewed and matched with genes that were suggested to be associated with psychiatric disorders by previous genetic studies. We found that the cadherin 13 and hypocretin (orexin) receptor 2 genes may be involved in the hypersomnia symptom during a major depressive episode of bipolar disorder. PMID:26207136

  3. Mitogenome assembly from genomic multiplex libraries: comparison of strategies and novel mitogenomes for five species of frogs.

    PubMed

    Machado, D J; Lyra, M L; Grant, T

    2016-05-01

    Next-generation sequencing continues to revolutionize biodiversity studies by generating unprecedented amounts of DNA sequence data for comparative genomic analysis. However, these data are produced as millions or billions of short reads of variable quality that cannot be directly applied in comparative analyses, creating a demand for methods to facilitate assembly. We optimized an in silico strategy to efficiently reconstruct high-quality mitochondrial genomes directly from genomic reads. We tested this strategy using sequences from five species of frogs: Hylodes meridionalis (Hylodidae), Hyloxalus yasuni (Dendrobatidae), Pristimantis fenestratus (Craugastoridae), and Melanophryniscus simplex and Rhinella sp. (Bufonidae). These are the first mitogenomes published for these species, the genera Hylodes, Hyloxalus, Pristimantis, Melanophryniscus and Rhinella, and the families Craugastoridae and Hylodidae. Sequences were generated using only half of one lane of a standard Illumina HiSeq 2000 flow cell, resulting in fewer than eight million reads. We analysed the reads of Hylodes meridionalis using three different assembly strategies: (1) reference-based (using bowtie2); (2) de novo (using abyss, soapdenovo2 and velvet); and (3) baiting and iterative mapping (using mira and mitobim). Mitogenomes were assembled exclusively with strategy 3, which we employed to assemble the remaining mitogenomes. Annotations were performed with mitos and confirmed by comparison with published amphibian mitochondria. In most cases, we recovered all 13 coding genes, 22 tRNAs, and two ribosomal subunit genes, with minor gene rearrangements. Our results show that few raw reads can be sufficient to generate high-quality scaffolds, making any Illumina machine run using genomic multiplex libraries a potential source of data for organelle assemblies as by-catch. PMID:26607054

  4. Final report. Human artificial episomal chromosome (HAEC) for building large genomic libraries

    SciTech Connect

    Jean-Michael H. Vos

    1999-12-09

    Collections of human DNA fragments are maintained for research purposes as clones in bacterial host cells. However, for unknown reasons, some regions of the human genome appear to be unclonable or unstable in bacteria. The team has developed a system using episomes (extrachromosomal, autonomously replicating DNA) that maintains large DNA fragments in human cells. This human artificial episomal chromosome (HAEC) system may prove useful for coverage of these especially difficult regions. In the broader biomedical community, the HAEC system also shows promise for use in functional genomics and gene therapy. Recent improvements to the HAEC system and its application to mapping, sequencing, and functionally studying human and mouse DNA are summarized. Mapping and sequencing the human genome and model organisms are only the first steps in determining the function of various genetic units critical for gene regulation, DNA replication, chromatin packaging, chromosomal stability, and chromatid segregation. Such studies will require the ability to transfer and manipulate entire functional units into mammalian cells.

  5. Construction of genome-wide physical BAC contigs using mapped cDNA as probes: Toward an integrated BAC library resource for genome sequencing and analysis. Annual report, July 1995--January 1997

    SciTech Connect

    Mitchell, S.C.; Bocskai, D.; Cao, Y.

    1997-12-31

    The goal of the Human Genome Project is to characterize and sequence the entire genomes of human and several model organisms, thus providing complete sets of information on the entire structure of transcribed, regulatory and other functional regions for these organisms. In the past years, a number of useful genetic and physical markers on human and mouse genomes have been made available along with the advent of BAC library resources for these organisms. The advances in technology and resource development made it feasible to efficiently construct genome-wide physical BAC contigs for human and other genomes. Currently, over 30,000 mapped STSs and 27,000 mapped Unigenes are available for human genome mapping. ESTs and cDNAs are excellent resources for building contig maps for two reasons. First, they exist in two alternative forms--as both sequence information for PCR primer pairs, and cDNA [...] screen genomic libraries efficiently for a large number of DNA probes by combining over 100 cDNA probes in each hybridization. Second, the linkage and order of genes are rather conserved among human, mouse and other model organisms. Therefore, gene markers have advantages over random anonymous STSs in building maps for comparative genomic studies.

  6. Screening of Metagenomic and Genomic Libraries Reveals Three Classes of Bacterial Enzymes That Overcome the Toxicity of Acrylate

    PubMed Central

    Curson, Andrew R. J.; Burns, Oliver J.; Voget, Sonja; Daniel, Rolf; Todd, Jonathan D.; McInnis, Kathryn; Wexler, Margaret; Johnston, Andrew W. B.

    2014-01-01

    Acrylate is produced in significant quantities through the microbial cleavage of the highly abundant marine osmoprotectant dimethylsulfoniopropionate, an important process in the marine sulfur cycle. Acrylate can inhibit bacterial growth, likely through its conversion to the highly toxic molecule acrylyl-CoA. Previous work identified an acrylyl-CoA reductase, encoded by the gene acuI, as being important for conferring on bacteria the ability to grow in the presence of acrylate. However, some bacteria lack acuI, and, conversely, many bacteria that may not encounter acrylate in their regular environments do contain this gene. We therefore sought to identify new genes that might confer tolerance to acrylate. To do this, we used functional screening of metagenomic and genomic libraries to identify novel genes that corrected an E. coli mutant that was defective in acuI, and was therefore hyper-sensitive to acrylate. The metagenomic libraries yielded two types of genes that overcame this toxicity. The majority encoded enzymes resembling AcuI, but with significant sequence divergence among each other and previously ratified AcuI enzymes. One other metagenomic gene, arkA, had very close relatives in Bacillus and related bacteria, and is predicted to encode an enoyl-acyl carrier protein reductase, in the same family as FabK, which catalyses the final step in fatty-acid biosynthesis in some pathogenic Firmicute bacteria. A genomic library of Novosphingobium, a metabolically versatile alphaproteobacterium that lacks both acuI and arkA, yielded vutD and vutE, two genes that, together, conferred acrylate resistance. These encode sequential steps in the oxidative catabolism of valine in a pathway in which, significantly, methacrylyl-CoA is a toxic intermediate. These findings expand the range of bacteria for which the acuI gene encodes a functional acrylyl-CoA reductase, and also identify novel enzymes that can similarly function in conferring acrylate resistance, likely, again

  7. An Expressed Sequence Tag (EST)-enriched genetic map of turbot (Scophthalmus maximus): a useful framework for comparative genomics across model and farmed teleosts

    PubMed Central

    2012-01-01

    Background The turbot (Scophthalmus maximus) is a relevant species in European aquaculture. The small turbot genome provides a source for genomics strategies to use in order to understand the genetic basis of productive traits, particularly those related to sex, growth and pathogen resistance. Genetic maps represent essential genomic screening tools allowing to localize quantitative trait loci (QTL) and to identify candidate genes through comparative mapping. This information is the backbone to develop marker-assisted selection (MAS) programs in aquaculture. Expressed sequenced tag (EST) resources have largely increased in turbot, thus supplying numerous type I markers suitable for extending the previous linkage map, which was mostly based on anonymous loci. The aim of this study was to construct a higher-resolution turbot genetic map using EST-linked markers, which will turn out to be useful for comparative mapping studies. Results A consensus gene-enriched genetic map of the turbot was constructed using 463 SNP and microsatellite markers in nine reference families. This map contains 438 markers, 180 EST-linked, clustered at 24 linkage groups. Linkage and comparative genomics evidences suggested additional linkage group fusions toward the consolidation of turbot map according to karyotype information. The linkage map showed a total length of 1402.7 cM with low average intermarker distance (3.7 cM; ~2 Mb). A global 1.6:1 female-to-male recombination frequency (RF) ratio was observed, although largely variable among linkage groups and chromosome regions. Comparative sequence analysis revealed large macrosyntenic patterns against model teleost genomes, significant hits decreasing from stickleback (54%) to zebrafish (20%). Comparative mapping supported particular chromosome rearrangements within Acanthopterygii and aided to assign unallocated markers to specific turbot linkage groups. Conclusions The new gene-enriched high-resolution turbot map represents a

  8. KENeV: A web-application for the automated reconstruction and visualization of the enriched metabolic and signaling super-pathways deriving from genomic experiments

    PubMed Central

    Pilalis, Eleftherios; Koutsandreas, Theodoros; Valavanis, Ioannis; Athanasiadis, Emmanouil; Spyrou, George; Chatziioannou, Aristotelis

    2015-01-01

    Gene expression analysis, using high throughput genomic technologies, has become an indispensable step for the meaningful interpretation of the underlying molecular complexity, which shapes the phenotypic manifestation of the investigated biological mechanism. The modularity of the cellular response to different experimental conditions can be comprehended through the exploitation of molecular pathway databases, which offer a controlled, curated background for statistical enrichment analysis. Existing tools enable pathway analysis, visualization, or pathway merging but none integrates a fully automated workflow, combining all above-mentioned modules and intended for non-programmer users. We introduce an online web application, named KEGG Enriched Network Visualizer (KENeV), which enables a fully automated workflow starting from a list of differentially expressed genes and deriving the enriched KEGG metabolic and signaling pathways, merged into two respective, non-redundant super-networks. The final networks can be downloaded as SBML files, for further analysis, or instantly visualized through an interactive visualization module. In conclusion, KENeV (available online at http://www.grissom.gr/kenev) provides an integrative tool, suitable for users with no programming experience, for the functional interpretation, at both the metabolic and signaling level, of differentially expressed gene subsets deriving from genomic experiments. PMID:26925206
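
    Statistical enrichment of a pathway in a list of differentially expressed genes is often assessed with a one-sided hypergeometric test. The sketch below shows that test under the assumption that it is a reasonable stand-in; the abstract does not state which statistic KENeV itself uses, and the function name and example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """One-sided p-value P(X >= overlap) for X ~ Hypergeometric.

    universe: number of annotated genes in the background
    pathway:  number of background genes that belong to the pathway
    selected: number of differentially expressed genes
    overlap:  number of differentially expressed genes in the pathway
    """
    upper = min(pathway, selected)
    total = comb(universe, selected)
    # Sum the probability of every outcome at least as extreme as the observed overlap.
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20,000 background genes, a 150-gene pathway,
    # 400 differentially expressed genes, 12 of which fall in the pathway.
    p = hypergeom_enrichment_p(20_000, 150, 400, 12)
    print(f"enrichment p-value: {p:.3e}")
```

    In practice the resulting p-values would be corrected for the number of pathways tested before any pathway is reported as enriched.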

  9. KENeV: A web-application for the automated reconstruction and visualization of the enriched metabolic and signaling super-pathways deriving from genomic experiments.

    PubMed

    Pilalis, Eleftherios; Koutsandreas, Theodoros; Valavanis, Ioannis; Athanasiadis, Emmanouil; Spyrou, George; Chatziioannou, Aristotelis

    2015-01-01

    Gene expression analysis, using high throughput genomic technologies, has become an indispensable step for the meaningful interpretation of the underlying molecular complexity, which shapes the phenotypic manifestation of the investigated biological mechanism. The modularity of the cellular response to different experimental conditions can be comprehended through the exploitation of molecular pathway databases, which offer a controlled, curated background for statistical enrichment analysis. Existing tools enable pathway analysis, visualization, or pathway merging but none integrates a fully automated workflow, combining all above-mentioned modules and intended for non-programmer users. We introduce an online web application, named KEGG Enriched Network Visualizer (KENeV), which enables a fully automated workflow starting from a list of differentially expressed genes and deriving the enriched KEGG metabolic and signaling pathways, merged into two respective, non-redundant super-networks. The final networks can be downloaded as SBML files, for further analysis, or instantly visualized through an interactive visualization module. In conclusion, KENeV (available online at http://www.grissom.gr/kenev) provides an integrative tool, suitable for users with no programming experience, for the functional interpretation, at both the metabolic and signaling level, of differentially expressed gene subsets deriving from genomic experiments. PMID:26925206

  10. From Human Monocytes to Genome-Wide Binding Sites - A Protocol for Small Amounts of Blood: Monocyte Isolation/ChIP-Protocol/Library Amplification/Genome Wide Computational Data Analysis

    PubMed Central

    Weiterer, Sebastian; Uhle, Florian; Bhuju, Sabin; Jarek, Michael; Weigand, Markus A.; Bartkuhn, Marek

    2014-01-01

    Chromatin immunoprecipitation in combination with a genome-wide analysis via high-throughput sequencing is the state of the art method to gain genome-wide representation of histone modification or transcription factor binding profiles. However, chromatin immunoprecipitation analysis in the context of human experimental samples is limited, especially in the case of blood cells. The typically extremely low yields of precipitated DNA are usually not compatible with library amplification for next generation sequencing. We developed a highly reproducible protocol to present a guideline from the first step of isolating monocytes from a blood sample to analyse the distribution of histone modifications in a genome-wide manner. Conclusion: The protocol describes the whole work flow from isolating monocytes from human blood samples followed by a high-sensitivity and small-scale chromatin immunoprecipitation assay with guidance for generating libraries compatible with next generation sequencing from small amounts of immunoprecipitated DNA. PMID:24732314

  11. Genome Clone Libraries and Data from the Integrated Molecular Analysis of Genomes and their Expression (I.M.A.G.E.) Consortium

    DOE Data Explorer

    The I.M.A.G.E. Consortium was initiated in 1993 by four academic groups on a collaborative basis after informal discussions led to a common vision of how to achieve an important goal in the study of the human genome: the Integrated Molecular Analysis of Genomes and their Expression. The Consortium's primary goal is to create arrayed cDNA libraries and associated bioinformatics tools, and make them publicly available to the research community. The primary organisms of interest include intensively studied mammalian species, including human, mouse, rat and non-human primate species. The Consortium has also focused on several commonly studied model organisms; as part of this effort it has arrayed cDNAs from zebrafish and Fugu (pufferfish), as well as Xenopus laevis and X. tropicalis (frog). Utilizing high speed robotics, over nine million individual cDNA clones have been arrayed into 384-well microtiter plates, and sufficient replicas have been created to distribute copies both to sequencing centers and to a network of five distributors located worldwide. The I.M.A.G.E. Consortium represents the world's largest public cDNA collection, and works closely with the National Institutes of Health's Mammalian Gene Collection (MGC) to help it achieve its goal of creating a full-length cDNA clone for every human and mouse gene. I.M.A.G.E. is also a member of the ORFeome Collaboration, working to generate a complete set of expression-ready open reading frame clones representing each human gene. Custom informatics tools have been developed in support of these projects to better allow the research community to select clones of interest and track and collect all data deposited into public databases about those clones and their related sequences. I.M.A.G.E. clones are publicly available, free of any royalties, and may be used by anyone agreeing with the Consortium's guidelines.

  12. Coexisting/Coexpressing Genomic Libraries (CoGeL) identify interactions among distantly located genetic loci for developing complex microbial phenotypes

    PubMed Central

    Nicolaou, Sergios A.; Gaida, Stefan M.; Papoutsakis, Eleftherios T.

    2011-01-01

    In engineering novel microbial strains for biotechnological applications, beyond a priori identifiable pathways to be engineered, it is becoming increasingly important to develop complex, ill-defined cellular phenotypes. One approach is to screen genomic or metagenomic libraries to identify genes imparting desirable phenotypes, such as tolerance to stressors or novel catabolic programs. Such libraries are limited by their inability to identify interactions among distant genetic loci. To solve this problem, we constructed plasmid- and fosmid-based Escherichia coli Coexisting/Coexpressing Genomic Libraries (CoGeLs). As a proof of principle, four sets of two genes of the l-lysine biosynthesis pathway distantly located on the E. coli chromosome were knocked out. Upon transformation of these auxotrophs with CoGeLs, cells growing without supplementation were found to harbor library inserts containing the knocked-out genes, demonstrating the interaction between the two libraries. CoGeLs were also screened to identify genetic loci that work synergistically to create the considerably more complex acid-tolerance phenotype. CoGeL screening identified combinations of genes known to enhance acid tolerance (gadBC operon and adiC), but also identified the novel combination of arcZ and recA, which enhanced acid tolerance 9000-fold. arcZ is a small RNA that we show increases pH tolerance alone and together with recA. PMID:21976725

  13. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  14. The MICHR Genomic DNA BioLibrary: An Empirical Study of the Ethics of Biorepository Development

    PubMed Central

    Roessler, Blake J.; Steneck, Nicholas H.; Powell, Lisa

    2015-01-01

    In this article, we report on an effort to study the development and usefulness of a large, broad-use, opt-in biorepository for genomic research, focusing on three ethical issues: providing appropriate understanding, recruiting in ways that do not compromise autonomous decisions, and assessing costs vs. benefits. We conclude: 1) Understanding can be improved by separating the task of informing subjects from documenting informed consent (Common Rule) and permission to use personal health information and samples for research (HIPAA); however, regulations might have to be changed to accommodate this approach. 2) Changing recruiting methods increases efficiency but can interfere with subject autonomy. 3) Finally, we propose a framework for the objective evaluation of the utility of biorepositories and suggest that more attention needs to be paid to use and sustainability. PMID:25742665

  15. The MICHR Genomic DNA BioLibrary: An Empirical Study of the Ethics of Biorepository Development.

    PubMed

    Roessler, Blake J; Steneck, Nicholas H; Connally, Lisa

    2015-02-01

    In this article, we report on an effort to study the development and usefulness of a large, broad-use, opt-in biorepository for genomic research, focusing on three ethical issues: providing appropriate understanding, recruiting in ways that do not compromise autonomous decisions, and assessing costs versus benefits. We conclude the following: (a) Understanding can be improved by separating the task of informing subjects from documenting informed consent (Common Rule) and permission to use personal health information and samples for research (Health Insurance Portability and Accountability Act [HIPAA]); however, regulations might have to be changed to accommodate this approach. (b) Changing recruiting methods increases efficiency but can interfere with subject autonomy. (c) Finally, we propose a framework for the objective evaluation of the utility of biorepositories and suggest that more attention needs to be paid to use and sustainability. PMID:25742665

  16. Phylogenetic marker development for target enrichment from transcriptome and genome skim data: the pipeline and its application in southern African Oxalis (Oxalidaceae).

    PubMed

    Schmickl, Roswitha; Liston, Aaron; Zeisek, Vojtěch; Oberlander, Kenneth; Weitemier, Kevin; Straub, Shannon C K; Cronn, Richard C; Dreyer, Léanne L; Suda, Jan

    2016-09-01

    Phylogenetics benefits from using a large number of putatively independent nuclear loci and their combination with other sources of information, such as the plastid and mitochondrial genomes. To facilitate the selection of orthologous low-copy nuclear (LCN) loci for phylogenetics in nonmodel organisms, we created an automated and interactive script to select hundreds of LCN loci by a comparison between transcriptome and genome skim data. We used our script to obtain LCN genes for southern African Oxalis (Oxalidaceae), a speciose plant lineage in the Greater Cape Floristic Region. This resulted in 1164 LCN genes greater than 600 bp. Using target enrichment combined with genome skimming (Hyb-Seq), we obtained on average 1141 LCN loci, nearly the whole plastid genome and the nrDNA cistron from 23 southern African Oxalis species. Despite a wide range of gene trees, the phylogeny based on the LCN genes was very robust, as retrieved through various gene and species tree reconstruction methods as well as concatenation. Cytonuclear discordance was strong. This indicates that organellar phylogenies alone are unlikely to represent the species tree and stresses the utility of Hyb-Seq in phylogenetics. PMID:26577756

  17. Varicella-zoster virus (VZV) transcription during latency in human ganglia: detection of transcripts mapping to genes 21, 29, 62, and 63 in a cDNA library enriched for VZV RNA.

    PubMed Central

    Cohrs, R J; Barbour, M; Gilden, D H

    1996-01-01

    Information on the extent of virus DNA transcription and translation in infected tissue is crucial to an understanding of herpesvirus latency. To detect low-abundance latent varicella-zoster virus (VZV) transcripts, poly(A)+ RNA extracted from latently infected human trigeminal ganglia was enriched for VZV transcripts by hybridization to biotinylated VZV DNA. After hybridization, the RNA-DNA hybrid was isolated by binding to avidin-coated beads and extensively washed, and the RNA was released by heat denaturation. A lambda-based cDNA library was then constructed from the enriched RNA. PCR and DNA sequencing of DNA extracted from the cDNA library revealed the presence of VZV genes 21, 29, 62, and 63, but not VZV genes 4, 10, 40, 51, and 61, in the enriched cDNA library. These findings confirm the detection of VZV gene 29 and 62 transcripts on Northern (RNA) blots prepared from latently infected human ganglia (J.L. Meier, R.P. Holman, K.D. Croen, J.E. Smialek, and S.E. Straus, Virology 193:193-200, 1993) and the presence of VZV gene 21 transcripts in a cDNA library from mRNA of latently infected ganglia (R.J. Cohrs, K. Srock, M.B. Barbour, G. Owens, R. Mahalingam, M.E. Devlin, M. Wellish and D.H. Gilden, J. Virol. 68:7900-7908,1994) and also reveal, for the first time, the presence of VZV gene 63 RNA in latently infected human ganglia. PMID:8627753

  18. Draft Genome Sequence of Ruminoclostridium sp. Ne3, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus.

    PubMed

    Wang, Han; Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J

    2015-01-01

    The draft genome sequence of Ruminoclostridium sp. Ne3 was reconstructed from the metagenome of a hydrogenogenic microbial consortium growing on xylan. The organism is likely the primary hemicellulose degrader within the consortium. PMID:25908130

  19. Draft Genome Sequence of Ruminoclostridium sp. Ne3, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus

    PubMed Central

    Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J.

    2015-01-01

    The draft genome sequence of Ruminoclostridium sp. Ne3 was reconstructed from the metagenome of a hydrogenogenic microbial consortium growing on xylan. The organism is likely the primary hemicellulose degrader within the consortium. PMID:25908130

  20. Genome-wide analysis of primary CD4+ and CD8+ T cell transcriptomes shows evidence for a network of enriched pathways associated with HIV disease

    PubMed Central

    2011-01-01

    Background HIV preferentially infects CD4+ T cells, and the functional impairment and numerical decline of CD4+ and CD8+ T cells characterize HIV disease. The numerical decline of CD4+ and CD8+ T cells affects the optimal ratio between the two cell types necessary for immune regulation. Therefore, this work aimed to define the genomic basis of HIV interactions with the cellular transcriptome of both CD4+ and CD8+ T cells. Results Genome-wide transcriptomes of primary CD4+ and CD8+ T cells from HIV+ patients were analyzed at different stages of HIV disease using Illumina microarray. For each cell subset, pairwise comparisons were performed and differentially expressed (DE) genes were identified (fold change >2 and B-statistic >0) followed by quantitative PCR validation. Gene ontology (GO) analysis of DE genes revealed enriched categories of complement activation, actin filament, proteasome core and proton-transporting ATPase complex. By gene set enrichment analysis (GSEA), a network of enriched pathways functionally connected by mitochondria was identified in both T cell subsets as a transcriptional signature of HIV disease progression. These pathways ranged from metabolism and energy production (TCA cycle and OXPHOS) to mitochondria-mediated cell apoptosis and cell cycle dysregulation. The most unique and significant feature of our work was that the non-progressing status in HIV+ long-term non-progressors was associated with MAPK, WNT, and AKT pathways contributing to cell survival and anti-viral responses. Conclusions These data offer new comparative insights into HIV disease progression from the aspect of HIV-host interactions at the transcriptomic level, which will facilitate the understanding of the genetic basis of transcriptomic interaction of HIV in vivo and how HIV subverts the human gene machinery at the individual cell type level. PMID:21410942
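
    The differential-expression thresholds quoted in this abstract (fold change >2 and B-statistic >0) can be illustrated with a minimal Python sketch. The gene names and values below are invented for illustration, and treating a fold change below 1 as a decrease of the reciprocal magnitude is an assumption; the abstract does not say how down-regulated genes were handled.

```python
# Hypothetical records (gene, fold_change, b_statistic); values are made up.
records = [
    ("GENE_A", 3.1, 1.4),
    ("GENE_B", 1.5, 2.0),
    ("GENE_C", 0.4, 0.7),
    ("GENE_D", 2.6, -0.3),
]


def is_differentially_expressed(fold_change, b_stat, fc_cut=2.0, b_cut=0.0):
    """Apply the two thresholds quoted in the abstract.

    Assumption: fold changes are positive ratios, and a ratio below 1 is
    read as a decrease of magnitude 1 / ratio (0.4 -> 2.5-fold decrease).
    """
    magnitude = fold_change if fold_change >= 1 else 1 / fold_change
    return magnitude > fc_cut and b_stat > b_cut


de_genes = [gene for gene, fc, b in records if is_differentially_expressed(fc, b)]
print(de_genes)
```

    Both conditions must hold, so a gene with a large fold change but a B-statistic at or below zero is excluded.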

  1. A novel methyl-binding domain protein enrichment method for identifying genome-wide tissue-specific DNA methylation from nanogram DNA samples

    PubMed Central

    2013-01-01

    Background Growing evidence suggests that DNA methylation plays a role in tissue-specific differentiation. Current approaches to methylome analysis using enrichment with the methyl-binding domain protein (MBD) are restricted to large (≥1 μg) DNA samples, limiting the analysis of small tissue samples. Here we present a technique that enables characterization of genome-wide tissue-specific methylation patterns from nanogram quantities of DNA. Results We have developed a methodology utilizing MBD2b/MBD3L1 enrichment for methylated DNA, kinase pre-treated ligation-mediated PCR amplification (MeKL) and hybridization to the comprehensive high-throughput array for relative methylation (CHARM) customized tiling arrays, which we termed MeKL-chip. Kinase modification in combination with the addition of PEG has increased ligation-mediated PCR amplification over 20-fold, enabling >400-fold amplification of starting DNA. We have shown that MeKL-chip can be applied to as little as 20 ng of DNA, enabling comprehensive analysis of small DNA samples. Applying MeKL-chip to the mouse retina (a limited tissue source) and brain, 2,498 tissue-specific differentially methylated regions (T-DMRs) were characterized. The top five T-DMRs (Rgs20, Hes2, Nfic, Cckbr and Six3os1) were validated by pyrosequencing. Conclusions MeKL-chip enables genome-wide methylation analysis of nanogram quantities of DNA with a wide range of observed-to-expected CpG ratios due to the binding properties of the MBD2b/MBD3L1 protein complex. This methodology enabled the first analysis of genome-wide methylation in the mouse retina, characterizing novel T-DMRs. PMID:23759032

  2. Mining new crystal protein genes from Bacillus thuringiensis on the basis of mixed plasmid-enriched genome sequencing and a computational pipeline.

    PubMed

    Ye, Weixing; Zhu, Lei; Liu, Yingying; Crickmore, Neil; Peng, Donghai; Ruan, Lifang; Sun, Ming

    2012-07-01

    We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes. PMID:22544259

  3. Lessons from Library Power: Enriching Teaching and Learning. Final Report of the Evaluation of the National Library Power Initiative, an Initiative of the DeWitt Wallace-Reader's Digest Fund.

    ERIC Educational Resources Information Center

    Zweizig, Douglas L.; Hopkins, Dianne McAfee

    This book presents the results of an evaluation of Library Power, an initiative of the DeWitt Wallace-Reader's Digest Fund that provided support for school library development in 19 communities. Following an introductory chapter, the chapters are organized around key questions of the evaluation. Chapters 2 through 4 address the implementation of…

  4. Whole Genome Duplication and Enrichment of Metal Cation Transporters Revealed by De Novo Genome Sequencing of Extremely Halotolerant Black Yeast Hortaea werneckii

    PubMed Central

    Jackman, Shaun; Turk, Martina; Sadowski, Ivan; Nislow, Corey; Jones, Steven; Birol, Inanc; Cimerman, Nina Gunde; Plemenitaš, Ana

    2013-01-01

    Hortaea werneckii, ascomycetous yeast from the order Capnodiales, shows an exceptional adaptability to osmotically stressful conditions. To investigate this unusual phenotype we obtained a draft genomic sequence of a H. werneckii strain isolated from hypersaline water of solar saltern. Two of its most striking characteristics that may be associated with a halotolerant lifestyle are the large genetic redundancy and the expansion of genes encoding metal cation transporters. Although no sexual state of H. werneckii has yet been described, a mating locus with characteristics of heterothallic fungi was found. The total assembly size of the genome is 51.6 Mb, larger than most phylogenetically related fungi, coding for almost twice the usual number of predicted genes (23333). The genome appears to have experienced a relatively recent whole genome duplication, and contains two highly identical gene copies of almost every protein. This is consistent with some previous studies that reported increases in genomic DNA content triggered by exposure to salt stress. In hypersaline conditions transmembrane ion transport is of utmost importance. The analysis of predicted metal cation transporters showed that most types of transporters experienced several gene duplications at various points during their evolution. Consequently they are present in much higher numbers than expected. The resulting diversity of transporters presents interesting biotechnological opportunities for improvement of halotolerance of salt-sensitive species. The involvement of plasma P-type H+ ATPases in adaptation to different concentrations of salt was indicated by their salt dependent transcription. This was not the case with vacuolar H+ ATPases, which were transcribed constitutively. The availability of this genomic sequence is expected to promote the research of H. werneckii. Studying its extreme halotolerance will not only contribute to our understanding of life in hypersaline environments, but should also

  5. Whole genome duplication and enrichment of metal cation transporters revealed by de novo genome sequencing of extremely halotolerant black yeast Hortaea werneckii.

    PubMed

    Lenassi, Metka; Gostinčar, Cene; Jackman, Shaun; Turk, Martina; Sadowski, Ivan; Nislow, Corey; Jones, Steven; Birol, Inanc; Cimerman, Nina Gunde; Plemenitaš, Ana

    2013-01-01

    Hortaea werneckii, ascomycetous yeast from the order Capnodiales, shows an exceptional adaptability to osmotically stressful conditions. To investigate this unusual phenotype we obtained a draft genomic sequence of a H. werneckii strain isolated from hypersaline water of solar saltern. Two of its most striking characteristics that may be associated with a halotolerant lifestyle are the large genetic redundancy and the expansion of genes encoding metal cation transporters. Although no sexual state of H. werneckii has yet been described, a mating locus with characteristics of heterothallic fungi was found. The total assembly size of the genome is 51.6 Mb, larger than most phylogenetically related fungi, coding for almost twice the usual number of predicted genes (23333). The genome appears to have experienced a relatively recent whole genome duplication, and contains two highly identical gene copies of almost every protein. This is consistent with some previous studies that reported increases in genomic DNA content triggered by exposure to salt stress. In hypersaline conditions transmembrane ion transport is of utmost importance. The analysis of predicted metal cation transporters showed that most types of transporters experienced several gene duplications at various points during their evolution. Consequently they are present in much higher numbers than expected. The resulting diversity of transporters presents interesting biotechnological opportunities for improvement of halotolerance of salt-sensitive species. The involvement of plasma P-type H⁺ ATPases in adaptation to different concentrations of salt was indicated by their salt dependent transcription. This was not the case with vacuolar H⁺ ATPases, which were transcribed constitutively. The availability of this genomic sequence is expected to promote the research of H. werneckii. Studying its extreme halotolerance will not only contribute to our understanding of life in hypersaline environments, but should

  6. Generation of a Genome Scale Lentiviral Vector Library for EF1α Promoter-Driven Expression of Human ORFs and Identification of Human Genes Affecting Viral Titer

    PubMed Central

    Škalamera, Dubravka; Dahmer, Mareike; Purdon, Amy S.; Wilson, Benjamin M.; Ranall, Max V.; Blumenthal, Antje; Gabrielli, Brian; Gonda, Thomas J.

    2012-01-01

    The bottleneck in elucidating gene function through high-throughput gain-of-function genome screening is the limited availability of comprehensive libraries for gene overexpression. Lentiviral vectors are the most versatile and widely used vehicles for gene expression in mammalian cells. Lentiviral supernatant libraries for genome screening are commonly generated in the HEK293T cell line, yet very little is known about the effect of introduced sequences on the produced viral titer, which we have shown to be gene dependent. We have generated an arrayed lentiviral vector library for the expression of 17,030 human proteins by using the GATEWAY® cloning system to transfer ORFs from the Mammalian Gene Collection into an EF1alpha promoter-dependent lentiviral expression vector. This promoter was chosen instead of the more potent and widely used CMV promoter, because it is less prone to silencing and provides more stable long term expression. The arrayed lentiviral clones were used to generate viral supernatant by packaging in the HEK293T cell line. The efficiency of transfection and virus production was estimated by measuring the fluorescence of IRES driven GFP, co-expressed with the ORFs. More than 90% of cloned ORFs produced sufficient virus for downstream screening applications. We identified genes which consistently produced very high or very low viral titer. Supernatants from select clones that were either high or low virus producers were tested on a range of cell lines. Some of the low virus producers, including two previously uncharacterized proteins were cytotoxic to HEK293T cells. The library we have constructed presents a powerful resource for high-throughput gain-of-function screening of the human genome and drug-target discovery. Identification of human genes that affect lentivirus production may lead to improved technology for gene expression using lentiviral vectors. PMID:23251614

  7. Genomic library screening for viruses from the human dental plaque revealed pathogen-specific lytic phage sequences.

    PubMed

    Al-Jarbou, Ahmed Nasser

    2012-01-01

    Bacterial pathogens possess an astounding arsenal of virulence factors that allow them to conquer many different niches throughout the course of infection. Particularly fascinating is the fact that some bacterial species are able to induce different diseases by expression of different combinations of virulence factors. Nevertheless, studies aiming at screening for the presence of bacteriophages in humans have been limited. Such screening procedures would eventually lead to identification of phage-encoded properties that impart increased bacterial fitness and/or virulence in a particular niche, and hence could potentially be used to reverse the course of bacterial infections. The human oral cavity represents a rich and dynamic ecosystem for several upper respiratory tract pathogens; however, little is known about virus diversity in human dental plaque, which is an important reservoir. We applied a culture-independent approach to characterize virus diversity in human dental plaque, making a library from a virus DNA fraction amplified using a multiple displacement method, and sequenced 80 clones. Of the resulting sequences, 44% showed significant identities to GenBank databases by TBLASTX analysis. TBLASTX homology comparisons showed that 66% were viral, 18% eukaryal, 10% bacterial and 6% mobile elements. These sequences were sorted into 6 contigs and 45 single sequences, of which 4 contigs and a single sequence showed significant identity to a small region of a putative prophage in the Corynebacterium diphtheriae genome. These findings highlight the uniqueness of over half of the sequences, whilst the dominance of pathogen-specific prophage sequences implies a role in virulence. PMID:21969025

  8. Draft Genome Sequence of Clostridium beijerinckii Ne1, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus.

    PubMed

    Wang, Han; Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J

    2015-01-01

    The draft genome of Clostridium beijerinckii strain Ne1 was reconstructed from the metagenomic sequence of a mixed-microbial consortium that produced commercially significant quantities of hydrogen from xylan as a sole feedstock. The organism possesses relatively limited hemicellulolytic capacity and likely requires the action of other organisms to completely degrade xylan. PMID:25908128

  9. Draft Genome Sequence of Clostridium sp. Ne2, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus.

    PubMed

    Wang, Han; Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J

    2015-01-01

    The draft genome sequence of Clostridium sp. Ne2 was reconstructed from a metagenome of a hydrogenogenic microbial consortium. The organism is most closely related to Clostridium magnum and is a strict anaerobe that is predicted to ferment a range of simple sugars. PMID:25908129

  10. Quick genome sequencing of “Candidatus Liberibacter” strains by use of Enrichment-Enlargement-Next generation sequencing (EEN)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Members of “Candidatus Liberibacter” are associated with several important plant diseases such as citrus Huanglongbing (HLB) and potato zebra chip (ZC) disease. Inability to culture and low titers in infected hosts have been major obstacles for research on these bacteria. The use of whole genome seq...

  11. The Genome of the Generalist Plant Pathogen Fusarium avenaceum Is Enriched with Genes Involved in Redox, Signaling and Secondary Metabolism

    PubMed Central

    Lysøe, Erik; Harris, Linda J.; Walkowiak, Sean; Subramaniam, Rajagopal; Divon, Hege H.; Riiser, Even S.; Llorens, Carlos; Gabaldón, Toni; Kistler, H. Corby; Jonkers, Wilfried; Kolseth, Anna-Karin; Nielsen, Kristian F.; Thrane, Ulf; Frandsen, Rasmus J. N.

    2014-01-01

    Fusarium avenaceum is a fungus commonly isolated from soil and associated with a wide range of host plants. We present here three genome sequences of F. avenaceum, one isolated from barley in Finland and two from spring and winter wheat in Canada. The sizes of the three genomes range from 41.6 to 43.1 Mb, with 13217–13445 predicted protein-coding genes. Whole-genome analysis showed that the three genomes are highly syntenic, and share >95% gene orthologs. Comparative analysis to other sequenced Fusaria shows that F. avenaceum has a very large potential for producing secondary metabolites, with between 75 and 80 key enzymes belonging to the polyketide, non-ribosomal peptide, terpene, alkaloid and indole-diterpene synthase classes. In addition to known metabolites from F. avenaceum, fuscofusarin and JM-47 were detected for the first time in this species. Many protein families are expanded in F. avenaceum, such as transcription factors, and proteins involved in redox reactions and signal transduction, suggesting evolutionary adaptation to a diverse and cosmopolitan ecology. We found that 20% of all predicted proteins were considered to be secreted, supporting a life in the extracellular space during interaction with plant hosts. PMID:25409087

  12. Draft Genome Sequence of Clostridium sp. Ne2, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus

    PubMed Central

    Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J.

    2015-01-01

    The draft genome sequence of Clostridium sp. Ne2 was reconstructed from a metagenome of a hydrogenogenic microbial consortium. The organism is most closely related to Clostridium magnum and is a strict anaerobe that is predicted to ferment a range of simple sugars. PMID:25908129

  13. Draft Genome Sequence of Clostridium beijerinckii Ne1, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus

    PubMed Central

    Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J.

    2015-01-01

    The draft genome of Clostridium beijerinckii strain Ne1 was reconstructed from the metagenomic sequence of a mixed-microbial consortium that produced commercially significant quantities of hydrogen from xylan as a sole feedstock. The organism possesses relatively limited hemicellulolytic capacity and likely requires the action of other organisms to completely degrade xylan. PMID:25908128

  14. Lung cancer diagnosed in the young is associated with enrichment for targetable genomic alterations and poor prognosis

    PubMed Central

    Sacher, Adrian G.; Dahlberg, Suzanne E.; Heng, Jennifer; Mach, Stacy; Jänne, Pasi A.; Oxnard, Geoffrey R.

    2016-01-01

    Importance NSCLC in the young is a rare entity and the genomics and clinical characteristics of this disease are poorly understood. In contrast, young age at diagnosis has been demonstrated to define unique disease biology in other cancers. Here we report on the association of young age with targetable genomic alterations and prognosis in a large cohort of NSCLC patients. Objective To determine the relationship between young age at diagnosis and both the presence of a potentially targetable genomic alteration as well as prognosis and natural history. Design All patients with NSCLC genotyped at the Dana-Farber Cancer Institute between 2002 and 2014 were identified. Tumor genotype, patient characteristics and clinical outcomes were collected. Multivariate logistic regression was used to analyze the relationship between age and mutation status. Multivariate Cox proportional hazard models were fitted for survival analysis. Setting A National Cancer Institute (NCI) designated comprehensive cancer center. Participants All patients with NSCLC seen at the Dana-Farber Cancer Institute between 2002 and 2014 who underwent tumor genotyping. Main Outcome Measure The frequency of targetable genomic alterations by defined age categories as well as the association of these age groups with survival. Results 2237 patients with NSCLC were studied. EGFR (p=0.02) and ALK (p<0.01) were associated with younger age, and a similar trend existed for HER2 (p=0.15) and ROS1 (p=0.1) but not BRAF V600E (p=0.43). Amongst patients tested for all 5 targetable genomic alterations, younger age was associated with an increased frequency of a targetable genotype (p<0.01). Those diagnosed at age 50 or younger have a 59% increased likelihood of harboring a targetable genotype. While presence of a potentially targetable genomic alteration treated with a targeted agent was associated with improved survival, the youngest and oldest age groupings had similarly poor outcomes even when a targetable genotype was

  15. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    PubMed

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072
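
    The coverage figures in this abstract follow from simple library arithmetic, which the sketch below works through. It assumes the mean insert is exactly 120 kb (the abstract gives this only as a lower bound), so the genome size it derives is an implied value, not one reported by the authors. The function names are illustrative, and the representation probability uses the standard Clarke-Carbon relation P = 1 - (1 - insert/genome)^N.

```python
# Library coverage arithmetic for a clone library (illustrative sketch).
# Assumption: mean insert size is taken as exactly 120 kb, although the
# abstract reports it only as a lower bound (>= 120 kb).

def library_coverage(n_clones, mean_insert_bp, genome_size_bp):
    """Fold coverage = total cloned bases / genome size."""
    return n_clones * mean_insert_bp / genome_size_bp


def implied_genome_size(n_clones, mean_insert_bp, coverage):
    """Genome size implied by a reported fold coverage."""
    return n_clones * mean_insert_bp / coverage


def representation_probability(n_clones, mean_insert_bp, genome_size_bp):
    """Clarke-Carbon probability that a given locus is in at least one clone."""
    fraction = mean_insert_bp / genome_size_bp
    return 1.0 - (1.0 - fraction) ** n_clones


N_CLONES = 36_864      # from the abstract
INSERT_BP = 120_000    # assumed equal to the reported lower bound
COVERAGE = 8.8         # from the abstract

genome_bp = implied_genome_size(N_CLONES, INSERT_BP, COVERAGE)
p_locus = representation_probability(N_CLONES, INSERT_BP, genome_bp)

print(f"implied genome size: {genome_bp / 1e6:.1f} Mb")
print(f"P(locus represented in full library): {p_locus:.5f}")
```

    Under that assumption the implied genome size is roughly 500 Mb. A larger true mean insert would raise this estimate proportionally.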

  16. Whitefly (Bemisia tabaci) genome project: analysis of sequenced clones from egg, instar, and adult (viruliferous and non-viruliferous) cDNA libraries

    PubMed Central

    Leshkowitz, Dena; Gazit, Shirley; Reuveni, Eli; Ghanim, Murad; Czosnek, Henryk; McKenzie, Cindy; Shatters, Robert L; Brown, Judith K

    2006-01-01

    Background The past three decades have witnessed a dramatic increase in interest in the whitefly Bemisia tabaci, owing to its nature as a taxonomically cryptic species, the damage it causes to a large number of herbaceous plants because of its specialized feeding in the phloem, and to its ability to serve as a vector of plant viruses. Among the most important plant viruses to be transmitted by B. tabaci are those in the genus Begomovirus (family, Geminiviridae). Surprisingly, little is known about the genome of this whitefly. The haploid genome size for male B. tabaci has been estimated to be approximately one billion bp by flow cytometry analysis, about five times the size of the fruitfly Drosophila melanogaster. The genes involved in whitefly development, in host range plasticity, and in begomovirus vector specificity and competency, are unknown. Results To address this general shortage of genomic sequence information, we have constructed three cDNA libraries from non-viruliferous whiteflies (eggs, immature instars, and adults) and two from adult insects that fed on tomato plants infected by two geminiviruses: Tomato yellow leaf curl virus (TYLCV) and Tomato mottle virus (ToMoV). In total, the sequence of 18,976 clones was determined. After quality control and removal of 5,542 clones of mitochondrial origin, 9,110 sequences remained, which included 3,843 singletons and 1,017 contigs. Comparisons with public databases indicated that the libraries contained genes involved in cellular and developmental processes. In addition, approximately 1,000 bases aligned with the genome of the B. tabaci endosymbiotic bacterium Candidatus Portiera aleyrodidarum, originating primarily from the egg and instar libraries. Apart from the mitochondrial sequences, the longest and most abundant sequence encodes vitellogenin, which originated from whitefly adult libraries, indicating that much of the gene expression in this insect is directed toward the production of eggs.
Conclusion This

  17. Combining genetic mapping with genome-wide expression in experimental autoimmune encephalomyelitis highlights a gene network enriched for T cell functions and candidate genes regulating autoimmunity

    PubMed Central

    Thessen Hedreul, Melanie; Möller, Steffen; Stridh, Pernilla; Gupta, Yask; Gillett, Alan; Daniel Beyeen, Amennai; Öckinger, Johan; Flytzani, Sevasti; Diez, Margarita; Olsson, Tomas; Jagodic, Maja

    2013-01-01

    Experimental autoimmune encephalomyelitis (EAE) is an autoimmune disease of the central nervous system commonly used to study multiple sclerosis (MS). We combined clinical EAE phenotypes with genome-wide expression profiling in spleens from 150 backcross rats between susceptible DA and resistant PVG rat strains during the chronic EAE phase. This enabled correlation of transcripts with genotypes, other transcripts and clinical EAE phenotypes and implicated potential genetic causes and pathways in EAE. We detected 2285 expression quantitative trait loci (eQTLs). Sixty out of 599 cis-eQTLs overlapped well-known EAE QTLs and constitute positional candidate genes, including Ifit1 (Eae7), Atg7 (Eae20-22), Klrc3 (eEae22) and Mfsd4 (Eae17). A trans-eQTL that overlaps Eae23a regulated a large number of small RNAs and implicates a master regulator of transcription. We defined several disease-correlated networks enriched for pathways involved in cell-mediated immunity. They include C-type lectins, G protein coupled receptors, mitogen-activated protein kinases, transmembrane proteins, suppressors of transcription (Jundp2 and Nr1d1) and STAT transcription factors (Stat4) involved in interferon signaling. The most significant network was enriched for T cell functions, similar to genetic findings in MS, and revealed both established and novel gene interactions. Transcripts in the network have been associated with T cell proliferation and differentiation, TCR signaling, and regulation of regulatory T cells. A number of network genes and their family members have been associated with MS and/or other autoimmune diseases. Combining disease and genome-wide expression phenotypes provides a link between disease risk genes and distinct molecular pathways that are dysregulated during chronic autoimmune inflammation. PMID:23900079

  18. Genomic Survey and Biochemical Analysis of Recombinant Candidate Cyanobacteriochromes Reveals Enrichment for Near UV/Violet Sensors in the Halotolerant and Alkaliphilic Cyanobacterium Microcoleus IPPAS B353.

    PubMed

    Cho, Sung Mi; Jeoung, Sae Chae; Song, Ji-Young; Kupriyanova, Elena V; Pronina, Natalia A; Lee, Bong-Woo; Jo, Seong-Whan; Park, Beom-Seok; Choi, Sang-Bong; Song, Ji-Joon; Park, Youn-Il

    2015-11-20

    Cyanobacteriochromes (CBCRs), which are exclusive to and widespread among cyanobacteria, are photoproteins that sense the entire range of near-UV and visible light. CBCRs are related to the red/far-red phytochromes that utilize linear tetrapyrrole (bilin) chromophores. Best characterized from the unicellular cyanobacterium Synechocystis sp. PCC 6803 and the multicellular heterocyst-forming filamentous cyanobacteria Nostoc punctiforme ATCC 29133 and Anabaena sp. PCC 7120, CBCRs have been poorly investigated in mat-forming, nonheterocystous cyanobacteria. In this study, we sequenced the genome of one such species, Microcoleus IPPAS B353 (Microcoleus B353), and identified two phytochromes and seven CBCRs with one or more bilin-binding cGMP-specific phosphodiesterase, adenylyl cyclase and FhlA (GAF) domains. Biochemical and spectroscopic measurements of 23 purified GAF proteins from phycocyanobilin (PCB) producing recombinant Escherichia coli indicated that 13 of these proteins formed near-UV and visible light-absorbing covalent adducts: 10 GAFs contained PCB chromophores, whereas three contained the PCB isomer, phycoviolobilin (PVB). Furthermore, the complement of Microcoleus B353 CBCRs is enriched in near-UV and violet sensors, but lacks red/green and green/red CBCRs that are widely distributed in other cyanobacteria. We hypothesize that enrichment in short wavelength-absorbing CBCRs is critical for acclimation to high-light environments where this organism is found. PMID:26405033

  19. Identification of a Mitochondrial DNA Polymerase Affecting Cardiotoxicity of Sunitinib Using a Genome-Wide Screening on S. pombe Deletion Library.

    PubMed

    Kim, Dong-Myung; Kim, Hanna; Yeon, Ji-Hyun; Lee, Ju-Hee; Park, Han-Oh

    2016-01-01

    Drug toxicity is a key issue in drug R&D, and a fundamental challenge is to screen for toxicity targets genome-wide. The anticancer tyrosine kinase inhibitor sunitinib is known to induce cardiotoxicity. Here, to understand the molecular basis of sunitinib cardiotoxicity at the genome level, we used a genome-wide drug target screening technology (GPScreen) that measures drug-induced haploinsufficiency (DIH) in the fission yeast Schizosaccharomyces pombe genome-wide deletion library and found a mitochondrial DNA polymerase (POG1). Sunitinib induced more severe cytotoxicity and mitochondrial damage in POG1-deleted heterozygous mutants than in wild-type (WT) S. pombe. Furthermore, knockdown of POLG, the human ortholog of S. pombe POG1, in human cells significantly increased the cytotoxicity of sunitinib. Notably, sunitinib dramatically decreased the levels of POLG mRNA and protein, downregulation of which was already known to induce mitochondrial damage in cardiomyocytes, causing cardiotoxicity. These results indicate that POLG might play a crucial role in mitochondrial damage as a gene whose expression pathway is targeted by sunitinib, leading to cardiotoxicity, and that genome-wide drug target screening with GPScreen can be applied to drug toxicity target discovery to understand the molecular basis of drug toxicity. PMID:26385865

  20. Cas-Database: web-based genome-wide guide RNA library design for gene knockout screens using CRISPR-Cas9

    PubMed Central

    Park, Jeongbin; Kim, Jin-Soo; Bae, Sangsu

    2016-01-01

    Motivation: CRISPR-derived RNA guided endonucleases (RGENs) have been widely used for both gene knockout and knock-in at the level of single or multiple genes. RGENs are now available for forward genetic screens at genome scale, but single guide RNA (sgRNA) selection at this scale is difficult. Results: We develop an online tool, Cas-Database, a genome-wide gRNA library design tool for Cas9 nucleases from Streptococcus pyogenes (SpCas9). With an easy-to-use web interface, Cas-Database allows users to select optimal target sequences simply by changing the filtering conditions. Furthermore, it provides a powerful way to select multiple optimal target sequences from thousands of genes at once for the creation of a genome-wide library. Cas-Database also provides a web application programming interface (web API) for advanced bioinformatics users. Availability and implementation: Free access at http://www.rgenome.net/cas-database/. Contact: sangsubae@hanyang.ac.kr or jskim01@snu.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153724
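
    As a rough illustration of what sgRNA target selection for SpCas9 involves, the sketch below scans a DNA sequence for 20-nt protospacers followed by an NGG PAM and applies a simple GC-content filter. It is not Cas-Database's algorithm or its web API: the function names, the GC thresholds, and the forward-strand-only scan are all assumptions made for this example.

```python
import re

# Illustrative SpCas9 target scan (NOT the Cas-Database implementation).
# Assumptions: forward strand only, 20-nt protospacer, NGG PAM, and a
# hypothetical GC-content filter standing in for "filtering conditions".


def find_spcas9_targets(seq, protospacer_len=20):
    """Return (position, protospacer, pam) for every site followed by NGG."""
    seq = seq.upper()
    # A zero-width lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def filter_by_gc(targets, low=0.2, high=0.8):
    """Keep targets whose protospacer GC content lies within [low, high]."""
    return [t for t in targets if low <= gc_content(t[1]) <= high]


example = "ACGTACGTACGTACGTACGT" + "TGG"  # one protospacer plus its PAM
hits = filter_by_gc(find_spcas9_targets(example))
for position, protospacer, pam in hits:
    print(position, protospacer, pam)
```

    A real design tool would also scan the reverse strand and score off-target matches across the genome, which this sketch omits.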

  1. ve-SEQ: Robust, unbiased enrichment for streamlined detection and whole-genome sequencing of HCV and other highly diverse pathogens

    PubMed Central

    Trebes, Amy; Brown, Anthony; Klenerman, Paul; Buck, David; Piazza, Paolo; Barnes, Eleanor; Bowden, Rory

    2015-01-01

    The routine availability of high-depth virus sequence data would allow the sensitive detection of resistance-associated variants that can jeopardize HIV or hepatitis C virus (HCV) treatment. We introduce ve-SEQ, a high-throughput method for sequence-specific enrichment and characterization of whole-virus genomes at up to 20% divergence from a reference sequence and 1,000-fold greater sensitivity than direct sequencing. The extreme genetic diversity of HCV led us to implement an algorithm for the efficient design of panels of oligonucleotide probes to capture any sequence among a defined set of targets without detectable bias. ve-SEQ enables efficient detection and sequencing of any HCV genome, including mixtures and intra-host variants, in a single experiment, with greater tolerance of sequence diversity than standard amplification methods and greater sensitivity than metagenomic sequencing, features that are directly applicable to other pathogens or arbitrary groups of target organisms, allowing the combination of sensitive detection with sequencing in many settings. PMID:27092241

  2. Enrichments in Biology

    ERIC Educational Resources Information Center

    Richard, Paul W.

    1969-01-01

    Emphasizes the need for enrichment materials in addition to laboratory and textbook work, particularly for interested and able students. Discusses the use of filmstrips, loop films, Innovations to Inquiry, BSCS pamphlets, newspapers, magazines and scientific periodicals, television, field trips, library resources, and programed units. (EB)

  3. Enriching the Catalog

    ERIC Educational Resources Information Center

    Tennant, Roy

    2004-01-01

    After decades of costly and time-consuming effort, nearly all libraries have completed the retrospective conversion of their card catalogs to electronic form. However, bibliographic systems still are really not much more than card catalogs on wheels. Enriched content that Amazon.com takes for granted--such as digitized tables of contents, cover…

  4. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton.
These data will also facilitate future whole-genome sequence assembly and annotation

  5. Recurrent Rare Genomic Copy Number Variants and Bicuspid Aortic Valve Are Enriched in Early Onset Thoracic Aortic Aneurysms and Dissections

    PubMed Central

    Prakash, Siddharth; Kuang, Shao-Qing; Regalado, Ellen; Guo, Dongchuan; Milewicz, Dianna

    2016-01-01

    Thoracic Aortic Aneurysms and Dissections (TAAD) are a major cause of death in the United States. The spectrum of TAAD ranges from genetic disorders, such as Marfan syndrome, to sporadic isolated disease of unknown cause. We hypothesized that genomic copy number variants (CNVs) contribute causally to early onset TAAD (ETAAD). We conducted a genome-wide SNP array analysis of ETAAD patients of European descent who were enrolled in the National Registry of Genetically Triggered Thoracic Aortic Aneurysms and Cardiovascular Conditions (GenTAC). Genotyping was performed on the Illumina Omni-Express platform, using PennCNV, Nexus and CNVPartition for CNV detection. ETAAD patients (n = 108, 100% European American, 28% female, average age 20 years, 55% with bicuspid aortic valves) were compared to 7013 dbGAP controls without a history of vascular disease using downsampled Omni 2.5 data. For comparison, 805 sporadic TAAD patients with late onset aortic disease (STAAD cohort) and 192 affected probands from families with at least two affected relatives (FTAAD cohort) from our institution were screened for additional CNVs at these loci with SNP arrays. We identified 47 recurrent CNV regions in the ETAAD, FTAAD and STAAD groups that were absent or extremely rare in controls. Nine rare CNVs that were either very large (>1 Mb) or shared by ETAAD and STAAD or FTAAD patients were also identified. Four rare CNVs involved genes that cause arterial aneurysms when mutated. The largest and most prevalent of the recurrent CNVs were at Xq28 (two duplications and two deletions) and 17q25.1 (three duplications). The percentage of individuals harboring rare CNVs was significantly greater in the ETAAD cohort (32%) than in the FTAAD (23%) or STAAD (17%) cohorts. We identified multiple loci affected by rare CNVs in one-third of ETAAD patients, confirming the genetic heterogeneity of TAAD. Alterations of candidate genes at these loci may contribute to the pathogenesis of TAAD. PMID:27092555

  6. Chromosome region-specific libraries for human genome analysis. Progress report, September 1, 1991--August 31, 1992

    SciTech Connect

    Kao, Fa-Ten

    1992-08-01

    During the grant period, progress has been made in the successful demonstration of regional mapping of microclones derived from microdissection libraries; successful demonstration of the feasibility of converting microclones with short inserts into yeast artificial chromosome clones with very large inserts for high resolution physical mapping of the dissected region; successful demonstration of the usefulness of region-specific microclones to isolate region-specific cDNA clones as candidate genes to facilitate the search for the crucial genes underlying genetic diseases assigned to the dissected region; and the successful construction of four region-specific microdissection libraries for human chromosome 2, including 2q35-q37, 2q33-q35, 2p23-p25 and 2p21-p23. The 2q35-q37 library has been characterized in detail. The characterization of the other three libraries is in progress. These region-specific microdissection libraries and the unique sequence microclones derived from the libraries will be valuable resources for investigators engaged in high resolution physical mapping and isolation of disease-related genes residing in these chromosomal regions.

  7. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries.

    PubMed

    Saski, Christopher A; Bhattacharjee, Ranjana; Scheffler, Brian E; Asiedu, Robert

    2015-01-01

    The falling cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries, but research efforts to understand the genetics and generate genomic information for the crop have been limited. The availability of a large number of genomic resources including genome-wide molecular markers will accelerate the breeding efforts and application of genomic selection in yams. In the present study, several methods including expressed sequence tags (EST)-sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiles on two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) were performed to generate genomic resources for use in its improvement programs. This includes a comprehensive set of EST-SSRs, genomic SSRs, whole genome SNPs, and reduced representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST-sequences generated from the two genotypes. A set of 388 EST-SSRs were validated as polymorphic, showing a polymorphism rate of 34% when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom-made pipeline resulted in the selection of 573 genomic SSRs common across the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic, indicating a polymorphic rate of 83.5%. Additionally, 288,505 high quality SNPs were also identified between these two genotypes. Genotyping by sequencing reads on these two genotypes also revealed 36,790 overlapping SNP positions that are distributed throughout the genome.
Our efforts in using different approaches

  8. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries

    PubMed Central

    Saski, Christopher A.; Bhattacharjee, Ranjana; Scheffler, Brian E.; Asiedu, Robert

    2015-01-01

    The falling cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries, but research efforts to understand the genetics and generate genomic information for the crop have been limited. The availability of a large number of genomic resources including genome-wide molecular markers will accelerate the breeding efforts and application of genomic selection in yams. In the present study, several methods including expressed sequence tags (EST)-sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiles on two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) were performed to generate genomic resources for use in its improvement programs. This includes a comprehensive set of EST-SSRs, genomic SSRs, whole genome SNPs, and reduced representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST-sequences generated from the two genotypes. A set of 388 EST-SSRs were validated as polymorphic, showing a polymorphism rate of 34% when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom-made pipeline resulted in the selection of 573 genomic SSRs common across the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic, indicating a polymorphic rate of 83.5%. Additionally, 288,505 high quality SNPs were also identified between these two genotypes. Genotyping by sequencing reads on these two genotypes also revealed 36,790 overlapping SNP positions that are distributed throughout the genome.
Our efforts in using different approaches

  9. Genome-wide comparison of the transcriptomes of highly enriched normal and chronic myeloid leukemia stem and progenitor cell populations.

    PubMed

    Gerber, Jonathan M; Gucwa, Jessica L; Esopi, David; Gurel, Meltem; Haffner, Michael C; Vala, Milada; Nelson, William G; Jones, Richard J; Yegnasubramanian, Srinivasan

    2013-05-01

    The persistence of leukemia stem cells (LSCs) in chronic myeloid leukemia (CML) despite tyrosine kinase inhibition (TKI) may explain relapse after TKI withdrawal. Here we performed genome-wide transcriptome analysis of highly refined CML and normal stem and progenitor cell populations, using exon microarrays, to identify novel targets for the eradication of CML LSCs. We identified 97 genes that were differentially expressed in CML versus normal stem and progenitor cells. These included cell surface genes significantly upregulated in CML LSCs: DPP4 (CD26), IL2RA (CD25), PTPRD, CACNA1D, IL1RAP, SLC4A4, and KCNK5. Further analyses of the LSCs revealed dysregulation of normal cellular processes, evidenced by alternative splicing of genes in key cancer signaling pathways such as p53 signaling (e.g. PERP, CDKN1A), kinase binding (e.g. DUSP12, MARCKS), and cell proliferation (MYCN, TIMELESS); downregulation of pro-differentiation and TGF-β/BMP signaling pathways; upregulation of oxidative metabolism and DNA repair pathways; and activation of inflammatory cytokines, including CCL2, and multiple oncogenes (e.g., CCND1). These data represent an important resource for understanding the molecular changes in CML LSCs, which may be exploited to develop novel therapies to eradicate these cells and achieve cure. PMID:23651669

  10. Genomic Signatures of North American Soybean Improvement Inform Diversity Enrichment Strategies and Clarify the Impact of Hybridization

    PubMed Central

    Vaughn, Justin N.; Li, Zenglu

    2016-01-01

    Crop improvement represents a long-running experiment in artificial selection on a complex trait, namely yield. How such selection relates to natural populations is unclear, but the analysis of domesticated populations could offer insights into the relative role of selection, drift, and recombination in all species facing major shifts in selective regimes. Because of the extreme autogamy exhibited by soybean (Glycine max), many “immortalized” genotypes of elite varieties spanning the last century have been preserved and characterized using ∼50,000 single nucleotide polymorphic (SNP) markers. Also due to autogamy, the history of North American soybean breeding can be roughly divided into pre- and posthybridization eras, allowing for direct interrogation of the role of recombination in improvement and selection. Here, we report on genome-wide characterization of the structure and history of North American soybean populations and the signature of selection in these populations. Supporting previous work, we find that maturity defines population structure. Though the diversity of North American ancestors is comparable to available landraces, prehybridization line selections resulted in a clonal structure that dominated early breeding and explains many of the reductions in diversity found in the initial generations of soybean hybridization. The rate of allele frequency change does not deviate sharply from neutral expectation, yet some regions bear hallmarks of strong selection, suggesting a highly variable range of selection strengths biased toward weak effects. We also discuss the importance of haplotypes as units of analysis when complex traits fall under novel selection regimes. PMID:27402364

  11. Genome-Centric Analysis of Microbial Populations Enriched by Hydraulic Fracture Fluid Additives in a Coal Bed Methane Production Well.

    PubMed

    Robbins, Steven J; Evans, Paul N; Parks, Donovan H; Golding, Suzanne D; Tyson, Gene W

    2016-01-01

    Coal bed methane (CBM) is generated primarily through the microbial degradation of coal. Despite a limited understanding of the microorganisms responsible for this process, there is significant interest in developing methods to stimulate additional methane production from CBM wells. Physical techniques including hydraulic fracture stimulation are commonly applied to CBM wells; however, the effects of specific additives contained in hydraulic fracture fluids on native CBM microbial communities are poorly understood. Here, metagenomic sequencing was applied to the formation waters of a hydraulically fractured and several non-fractured CBM production wells to determine the effect of this stimulation technique on the in-situ microbial community. The hydraulically fractured well was dominated by two microbial populations belonging to the class Phycisphaerae (within phylum Planctomycetes) and candidate phylum Aminicenantes. Populations from these phyla were absent or present at extremely low abundance in non-fractured CBM wells. Detailed metabolic reconstruction of near-complete genomes from these populations showed that their high relative abundance in the hydraulically fractured CBM well could be explained by the introduction of additional carbon sources, electron acceptors, and biocides contained in the hydraulic fracture fluid. PMID:27375557

  12. Genome-Centric Analysis of Microbial Populations Enriched by Hydraulic Fracture Fluid Additives in a Coal Bed Methane Production Well

    PubMed Central

    Robbins, Steven J.; Evans, Paul N.; Parks, Donovan H.; Golding, Suzanne D.; Tyson, Gene W.

    2016-01-01

    Coal bed methane (CBM) is generated primarily through the microbial degradation of coal. Despite a limited understanding of the microorganisms responsible for this process, there is significant interest in developing methods to stimulate additional methane production from CBM wells. Physical techniques including hydraulic fracture stimulation are commonly applied to CBM wells; however, the effects of specific additives contained in hydraulic fracture fluids on native CBM microbial communities are poorly understood. Here, metagenomic sequencing was applied to the formation waters of a hydraulically fractured and several non-fractured CBM production wells to determine the effect of this stimulation technique on the in-situ microbial community. The hydraulically fractured well was dominated by two microbial populations belonging to the class Phycisphaerae (within phylum Planctomycetes) and candidate phylum Aminicenantes. Populations from these phyla were absent or present at extremely low abundance in non-fractured CBM wells. Detailed metabolic reconstruction of near-complete genomes from these populations showed that their high relative abundance in the hydraulically fractured CBM well could be explained by the introduction of additional carbon sources, electron acceptors, and biocides contained in the hydraulic fracture fluid. PMID:27375557

  13. Genomic Signatures of North American Soybean Improvement Inform Diversity Enrichment Strategies and Clarify the Impact of Hybridization.

    PubMed

    Vaughn, Justin N; Li, Zenglu

    2016-01-01

    Crop improvement represents a long-running experiment in artificial selection on a complex trait, namely yield. How such selection relates to natural populations is unclear, but the analysis of domesticated populations could offer insights into the relative role of selection, drift, and recombination in all species facing major shifts in selective regimes. Because of the extreme autogamy exhibited by soybean (Glycine max), many "immortalized" genotypes of elite varieties spanning the last century have been preserved and characterized using ∼50,000 single nucleotide polymorphic (SNP) markers. Also due to autogamy, the history of North American soybean breeding can be roughly divided into pre- and posthybridization eras, allowing for direct interrogation of the role of recombination in improvement and selection. Here, we report on genome-wide characterization of the structure and history of North American soybean populations and the signature of selection in these populations. Supporting previous work, we find that maturity defines population structure. Though the diversity of North American ancestors is comparable to available landraces, prehybridization line selections resulted in a clonal structure that dominated early breeding and explains many of the reductions in diversity found in the initial generations of soybean hybridization. The rate of allele frequency change does not deviate sharply from neutral expectation, yet some regions bear hallmarks of strong selection, suggesting a highly variable range of selection strengths biased toward weak effects. We also discuss the importance of haplotypes as units of analysis when complex traits fall under novel selection regimes. PMID:27402364

  14. A methylome-wide study of aging using massively parallel sequencing of the methyl-CpG-enriched genomic fraction from blood in over 700 subjects.

    PubMed

    McClay, Joseph L; Aberg, Karolina A; Clark, Shaunna L; Nerella, Srilaxmi; Kumar, Gaurav; Xie, Lin Y; Hudson, Alexandra D; Harada, Aki; Hultman, Christina M; Magnusson, Patrik K E; Sullivan, Patrick F; Van Den Oord, Edwin J C G

    2014-03-01

    The central importance of epigenetics to the aging process is increasingly being recognized. Here we perform a methylome-wide association study (MWAS) of aging in whole blood DNA from 718 individuals, aged 25-92 years (mean = 55). We sequenced the methyl-CpG-enriched genomic DNA fraction, averaging 67.3 million reads per subject, to obtain methylation measurements for the ∼27 million autosomal CpGs in the human genome. Following extensive quality control, we adaptively combined methylation measures for neighboring, highly-correlated CpGs into 4 344 016 CpG blocks with which we performed association testing. Eleven age-associated differentially methylated regions (DMRs) passed Bonferroni correction (P-value < 1.15 × 10^-8). Top findings replicated in an independent sample set of 558 subjects using pyrosequencing of bisulfite-converted DNA (min P-value < 10^-30). To examine biological themes, we selected 70 DMRs with false discovery rate of <0.1. Of these, 42 showed hypomethylation and 28 showed hypermethylation with age. Hypermethylated DMRs were more likely to overlap with CpG islands and shores. Hypomethylated DMRs were more likely to be in regions associated with polycomb/regulatory proteins (e.g. EZH2) or histone modifications H3K27ac, H3K4me1, H3K4me2, H3K4me3 and H3K9ac. Among genes implicated by the top DMRs were protocadherins, homeobox genes, MAPKs and ryanodine receptors. Several of our DMRs are at genes with potential relevance for age-related disease. This study successfully demonstrates the application of next-generation sequencing to MWAS, by interrogating a large proportion of the methylome and returning potentially novel age DMRs, in addition to replicating several loci implicated in previous studies using microarrays. PMID:24135035
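
    The Bonferroni threshold quoted in this abstract can be checked with a short calculation. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state; only the number of CpG blocks comes from the record. The function name is illustrative.

```python
# Bonferroni correction: divide the family-wise alpha by the number of tests.
ALPHA = 0.05          # assumption: conventional family-wise error rate, not stated in the abstract
N_BLOCKS = 4_344_016  # CpG blocks tested, as reported in the abstract


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_BLOCKS)
print(f"Per-test threshold: {threshold:.3g}")
```

    Under that assumed alpha, the result agrees with the reported cutoff of about 1.15 × 10^-8.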

  15. A genomic-scale artificial microRNA library as a tool to investigate the functionally redundant gene space in Arabidopsis.

    PubMed

    Hauser, Felix; Chen, Wenxiao; Deinlein, Ulrich; Chang, Kenneth; Ossowski, Stephan; Fitz, Joffrey; Hannon, Gregory J; Schroeder, Julian I

    2013-08-01

    Traditional forward genetic screens are limited in the identification of homologous genes with overlapping functions. Here, we report the analyses and assembly of genome-wide protein family definitions that comprise the largest estimate for the potentially redundant gene space in Arabidopsis thaliana. On this basis, a computational design of genome-wide family-specific artificial microRNAs (amiRNAs) was performed using high-performance computing resources. The amiRNA designs are searchable online (http://phantomdb.ucsd.edu). A computationally derived library of 22,000 amiRNAs was synthesized in 10 sublibraries of 1505 to 4082 amiRNAs, each targeting defined functional protein classes. For example, 2964 amiRNAs target annotated DNA and RNA binding protein families and 1777 target transporter proteins, and another sublibrary targets proteins of unknown function. To evaluate the potential of an amiRNA-based screen, we tested 122 amiRNAs targeting transcription factor, protein kinase, and protein phosphatase families. Several amiRNA lines showed morphological phenotypes, either comparable to known phenotypes of single and double/triple mutants or caused by overexpression of microRNAs. Moreover, novel morphological and abscisic acid-insensitive seed germination mutants were identified for amiRNAs targeting zinc finger homeodomain transcription factors and mitogen-activated protein kinase kinase kinases, respectively. These resources provide an approach for genome-wide genetic screens of the functionally redundant gene space in Arabidopsis. PMID:23956262

  16. Draft Genome Sequence of Marinobacter sp. Strain P4B1, an Electrogenic Perchlorate-Reducing Strain Isolated from a Long-Term Mixed Enrichment Culture of Marine Bacteria.

    PubMed

    Stepanov, Victor G; Xiao, Yeyuan; Lopez, April J; Roberts, Deborah J; Fox, George E

    2016-01-01

    The perchlorate-reducing strain Marinobacter sp. strain P4B1 was isolated from a long-term perchlorate-degrading enrichment culture seeded with marine sediment. The draft genome of Marinobacter sp. P4B1 is comprised of the bacterial chromosome (3.60 Mbp, G+C 58.51%, 3,269 predicted genes) and its associated plasmid pMARS01 (0.14 Mbp, G+C 52.95%, 165 predicted genes). PMID:26798109

  17. Draft Genome Sequence of Marinobacter sp. Strain P4B1, an Electrogenic Perchlorate-Reducing Strain Isolated from a Long-Term Mixed Enrichment Culture of Marine Bacteria

    PubMed Central

    Stepanov, Victor G.; Xiao, Yeyuan; Lopez, April J.

    2016-01-01

    The perchlorate-reducing strain Marinobacter sp. strain P4B1 was isolated from a long-term perchlorate-degrading enrichment culture seeded with marine sediment. The draft genome of Marinobacter sp. P4B1 is comprised of the bacterial chromosome (3.60 Mbp, G+C 58.51%, 3,269 predicted genes) and its associated plasmid pMARS01 (0.14 Mbp, G+C 52.95%, 165 predicted genes). PMID:26798109

  18. Characterization of genome-wide ordered sequence-tagged Mycobacterium mutant libraries by Cartesian Pooling-Coordinate Sequencing

    PubMed Central

    Vandewalle, Kristof; Festjens, Nele; Plets, Evelyn; Vuylsteke, Marnik; Saeys, Yvan; Callewaert, Nico

    2015-01-01

    Reverse genetics research approaches require the availability of methods to rapidly generate specific mutants. Alternatively, where these methods are lacking, the construction of pre-characterized libraries of mutants can be extremely valuable. However, this can be complex, expensive and time consuming. Here, we describe a robust, easy-to-implement parallel sequencing-based method (Cartesian Pooling-Coordinate Sequencing or CP-CSeq) that reports on both the identity and the location of sequence-tagged biological entities in well-plate archived clone collections. We demonstrate this approach using a transposon insertion mutant library of the Mycobacterium bovis BCG vaccine strain, providing the largest resource of mutants in any strain of the M. tuberculosis complex. The method is applicable to any entity for which sequence-tagged identification is possible. PMID:25960123
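
    The abstract does not describe how CP-CSeq pools are laid out, so the following is only a generic illustration of coordinate pooling, not the authors' method. It assumes each well is addressed by plate, row and column, that one pool is sequenced per coordinate value, and that a tag seen in exactly one pool per axis can be assigned to a single well. All names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

AXES = ("plate", "row", "column")  # assumed coordinate axes of a well-plate archive


def locate(tag_pools: Dict[str, Set[str]]) -> Optional[Tuple[str, ...]]:
    """Return (plate, row, column) if the tag occurs in exactly one pool per axis.

    tag_pools maps each axis name to the set of pool labels in which the
    sequence tag was observed. Ambiguous or missing axes yield None.
    """
    coords = []
    for axis in AXES:
        pools = tag_pools.get(axis, set())
        if len(pools) != 1:
            return None  # tag absent from this axis, or present in several pools
        coords.append(next(iter(pools)))
    return tuple(coords)


# A tag seen in plate pool P3, row pool C and column pool 7 maps to one well.
print(locate({"plate": {"P3"}, "row": {"C"}, "column": {"7"}}))
```

    A tag found in two plate pools returns None here; a real pipeline would need an explicit rule for such cases, for example duplicate clones or sequencing noise.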

  19. A nanobuffer reporter library for fine-scale imaging and perturbation of endocytic organelles | Office of Cancer Genomics

    Cancer.gov

    Endosomes, lysosomes and related catabolic organelles are a dynamic continuum of vacuolar structures that impact a number of cell physiological processes such as protein/lipid metabolism, nutrient sensing and cell survival. Here we develop a library of ultra-pH-sensitive fluorescent nanoparticles with chemical properties that allow fine-scale, multiplexed, spatio-temporal perturbation and quantification of catabolic organelle maturation at single organelle resolution to support quantitative investigation of these processes in living cells.

  20. A genomic library-based amplification approach (GL-PCR) for the mapping of multiple IS6110 insertion sites and strain differentiation of Mycobacterium tuberculosis.

    PubMed

    Namouchi, Amine; Mardassi, Helmi

    2006-11-01

    Evidence suggests that insertion of the IS6110 element is not without consequence to the biology of Mycobacterium tuberculosis complex strains. Thus, mapping of multiple IS6110 insertion sites in the genome of biomedically relevant clinical isolates would result in a better understanding of the role of this mobile element, particularly with regard to transmission, adaptability and virulence. In the present paper, we describe a versatile strategy, referred to as GL-PCR, that amplifies IS6110-flanking sequences based on the construction of a genomic library. M. tuberculosis chromosomal DNA is fully digested with HincII and then ligated into a plasmid vector between T7 and T3 promoter sequences. The ligation reaction product is transformed into Escherichia coli and selective PCR amplification targeting both 5' and 3' IS6110-flanking sequences is performed on the plasmid library DNA. For this purpose, four separate PCR reactions are performed, each combining an outward primer specific for one IS6110 end with either T7 or T3 primer. Determination of the nucleotide sequence of the PCR products generated from a single ligation reaction allowed mapping of 21 out of the 24 IS6110 copies of two 12-banded M. tuberculosis strains, yielding an overall sensitivity of 87.5%. Furthermore, by simply comparing the migration pattern of GL-PCR-generated products, the strategy proved to be as valuable as IS6110 RFLP for molecular typing of M. tuberculosis complex strains. Importantly, GL-PCR was able to discriminate between strains differing by a single IS6110 band. PMID:16725220
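
    The four-reaction design and the reported sensitivity can be laid out in a few lines. The primer labels below are placeholders invented for illustration (the abstract gives no primer names); the counts 21 and 24 come from the record.

```python
from itertools import product

# Hypothetical labels for the two outward-facing IS6110 primers.
is6110_primers = ["IS6110_5prime_outward", "IS6110_3prime_outward"]
# Vector primers flanking the cloning site, as named in the abstract.
vector_primers = ["T7", "T3"]

# Each IS6110 end is paired with each vector primer: 2 x 2 = 4 reactions.
reactions = list(product(is6110_primers, vector_primers))
for number, (insertion_primer, vector_primer) in enumerate(reactions, start=1):
    print(f"Reaction {number}: {insertion_primer} + {vector_primer}")

# Sensitivity: insertion sites mapped out of the copies present in two 12-banded strains.
mapped, total = 21, 24
sensitivity = mapped / total * 100
print(f"Sensitivity: {sensitivity:.1f}%")
```

    Pairing each insertion-end primer with both vector primers plausibly covers either orientation of the cloned fragment, which would explain the use of four reactions rather than two.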

  1. Losing Libraries, Saving Libraries

    ERIC Educational Resources Information Center

    Miller, Rebecca

    2010-01-01

    This summer, as public libraries continued to get budget hit after budget hit across the country, several readers asked for a comprehensive picture of the ravages of the recession on library service. In partnership with 2010 Movers & Shakers Laura Solomon and Mandy Knapp, Ohio librarians who bought the Losing Libraries domain name, "LJ" launched…

  2. Identification and characterization of coenzyme B12-dependent glycerol dehydratase- and diol dehydratase-encoding genes from metagenomic DNA libraries derived from enrichment cultures.

    PubMed

    Knietsch, Anja; Bowien, Susanne; Whited, Gregg; Gottschalk, Gerhard; Daniel, Rolf

    2003-06-01

    To isolate genes encoding coenzyme B12-dependent glycerol and diol dehydratases, metagenomic libraries from three different environmental samples were constructed after allowing growth of the dehydratase-containing microorganisms present for 48 h with glycerol under anaerobic conditions. The libraries were searched for the targeted genes by an activity screen, which was based on complementation of a constructed dehydratase-negative Escherichia coli strain. In this way, two positive E. coli clones out of 560,000 tested clones were obtained. In addition, screening was performed by colony hybridization with dehydratase-specific DNA fragments as probes. The screening of 158,000 E. coli clones by this method yielded five positive clones. Two of the plasmids (pAK6 and pAK8) recovered from the seven positive clones contained genes identical to those encoding the glycerol dehydratase of Citrobacter freundii and were not studied further. The remaining five plasmids (pAK2 to -5 and pAK7) contained two complete and three incomplete dehydratase-encoding gene regions, which were similar to the corresponding regions of enteric bacteria. Three (pAK2, -3, and -7) coded for glycerol dehydratases and two (pAK4 and -5) coded for diol dehydratases. We were able to perform high-level production and purification of three of these dehydratases. The glycerol dehydratases purified from E. coli BL21/pAK2.1 and E. coli BL21/pAK7.1 and the complemented hybrid diol dehydratase purified from E. coli BL21/pAK5.1 were subject to suicide inactivation by glycerol and were cross-reactivated by the reactivation factor (DhaFG) for the glycerol dehydratase of C. freundii. The activities of the three environmentally derived dehydratases and that of glycerol dehydratase of C. freundii with glycerol or 1,2-propanediol as the substrate were inhibited in the presence of the glycerol fermentation product 1,3-propanediol. Taking the catalytic efficiency, stability against inactivation by glycerol, and

  3. New glucosidase activities identified by functional screening of a genomic DNA library from the gut microbiota of the termite Reticulitermes santonensis.

    PubMed

    Mattéotti, Christel; Thonart, Philippe; Francis, Frédéric; Haubruge, Eric; Destain, Jacqueline; Brasseur, Catherine; Bauwens, Julien; De Pauw, Edwin; Portetelle, Daniel; Vandenbol, Micheline

    2011-12-20

    β-Glucosidases are widely distributed in living organisms and play a major role in the degradation of wood, hydrolysing cellobiose or cello-oligosaccharides to glucose. Termites are among the rare animals capable of digesting wood, thanks to enzyme activities of their own and to enzymes produced by their gut microbiota. Many bacteria have been identified in the guts of lower termites, some of which possess cellulolytic or/and hemicellulolytic activity, required for digesting wood. Here, having isolated bacterial colonies from the gut of Reticulitermes santonensis, we constructed in Escherichia coli a genomic DNA library corresponding to all of the colonies obtained and screened the library for clones displaying β-glucosidase activity. This screen revealed 8 positive clones. Sequence analysis with the BLASTX program revealed putative enzymes belonging to three glycoside hydrolase families (GH1, GH3 and GH4). Agar-plate tests and enzymatic assays revealed differences between the GH1- and GH3-type enzymes (as regards substrate specificity and regulation) and a difference in substrate specificity within the GH3 group. The substrate specificities and characteristic activities of these enzymes suggest that they may intervene in the depolymerisation of cellulose and hemicellulose. PMID:21324659

  4. Sequencing and comparative genomics analysis in Senecio scandens Buch.-Ham. Ex D. Don, based on full-length cDNA library

    PubMed Central

    Qian, Gang; Ping, Junjiao; Zhang, Zhen; Xu, Delin

    2014-01-01

    Senecio scandens Buch.-Ham. ex D. Don, an important antibacterial source of Chinese traditional medicine, has a widespread distribution in a few ecological habitats of China. We generated a full-length complementary DNA (cDNA) library from a sample of elite individuals with superior antibacterial properties, with satisfactory parameters such as library storage (4.30 × 10^6 CFU), efficiency of titre (1.30 × 10^6 CFU/mL), transformation efficiency (96.35%), full-length ratio (64.00%) and redundancy ratio (3.28%). The BLASTN search revealed the facile formation of counterparts between the experimental sample and Arabidopsis thaliana in view of high-homology cDNA sequence (90.79%) with e-values <1e-50. Sequence similarities to known proteins indicate that the entire sequences of the full-length cDNA clones consist of the majority of functional genes identified by a large set of microarray data from the present experimental material. For other Compositae species, a large set of full-length cDNA clones reported in the present article will serve as a useful resource to facilitate further research on the transferability of expressed sequence tag-derived simple sequence repeats (EST-SSR) development, comparative genomics and novel transcript profiles. PMID:26740776

  5. Genomic resources for water yam (Dioscorea alata L.): analyses of EST-Sequences, De Novo sequencing and GBS libraries

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources such as SSRs, SNPs and InDels in several model and non-model plant species. Yam (Dioscorea spp.) i...

  6. A Prospective Virtual Screening Study: Enriching Hit Rates and Designing Focus Libraries To Find Inhibitors of PI3Kδ and PI3Kγ.

    PubMed

    Damm-Ganamet, Kelly L; Bembenek, Scott D; Venable, Jennifer W; Castro, Glenda G; Mangelschots, Lieve; Peeters, Daniëlle C G; Mcallister, Heather M; Edwards, James P; Disepio, Daniel; Mirzadegan, Taraneh

    2016-05-12

    Here, we report a high-throughput virtual screening (HTVS) study using phosphoinositide 3-kinase (both PI3Kγ and PI3Kδ). Our initial HTVS results of the Janssen corporate database identified small focused libraries with hit rates at 50% inhibition showing a 50-fold increase over those from a HTS (high-throughput screen). Further, applying constraints based on "chemically intuitive" hydrogen bonds and/or positional requirements resulted in a substantial improvement in the hit rates (versus no constraints) and reduced docking time. While we find that docking scoring functions are not capable of providing a reliable relative ranking of a set of compounds, a prioritization of groups of compounds (e.g., low, medium, and high) does emerge, which allows for the chemistry efforts to be quickly focused on the most viable candidates. Thus, this illustrates that it is not always necessary to have a high correlation between a computational score and the experimental data to impact the drug discovery process. PMID:27043133

  7. Construction and Characterization of a Repetitive DNA Library in Parodontidae (Actinopterygii: Characiformes): A Genomic and Evolutionary Approach to the Degeneration of the W Sex Chromosome

    PubMed Central

    Oliveira, Jordana Inácio Nascimento; Nogaroto, Viviane; Almeida, Mara Cristina; Artoni, Roberto Ferreira; Cestari, Marta Margarete; Moreira-Filho, Orlando; Vicari, Marcelo Ricardo

    2014-01-01

    Repetitive DNA sequences, including tandem and dispersed repeats, comprise a large portion of eukaryotic genomes and are important for gene regulation, sex chromosome differentiation, and karyotype evolution. In Parodontidae, only the repetitive DNAs WAp and pPh2004 and rDNAs were previously studied using fluorescence in situ hybridization. This study aimed to build a library of repetitive DNA in Parodontidae. We isolated 40 clones using Cot-1; 17 of these clones exhibited similarity to repetitive DNA sequences, including satellites, minisatellites, microsatellites, and class I and class II transposable elements (TEs), from Danio rerio and other organisms. The physical mapping of the clones to chromosomes revealed the presence of a satellite DNA, a Helitron element, and degenerate short interspersed element (SINE), long interspersed element (LINE), and tc1-mariner elements on the sex chromosomes. Some clones exhibited dispersed signals; other sequences were not detected. The 5S rDNA was detected on an autosomal pair. These elements likely function in the molecular degeneration of the W chromosome in Parodontidae. Thus, the location of these elements on the chromosomes is important for understanding the function of these repetitive DNAs and for integrative studies with genome sequencing. The presented data demonstrate that an intensive invasion of TEs occurred during W sex chromosome differentiation in the Parodontidae. PMID:25122415

  8. Elucidation of the Photorhabdus temperata Genome and Generation of a Transposon Mutant Library To Identify Motility Mutants Altered in Pathogenesis

    PubMed Central

    Hurst, Sheldon; Rowedder, Holli; Michaels, Brandye; Bullock, Hannah; Jackobeck, Ryan; Abebe-Akele, Feseha; Durakovic, Umjia; Gately, Jon; Janicki, Erik

    2015-01-01

    The entomopathogenic nematode Heterorhabditis bacteriophora forms a specific mutualistic association with its bacterial partner Photorhabdus temperata. The microbial symbiont is required for nematode growth and development, and symbiont recognition is strain specific. The aim of this study was to sequence the genome of P. temperata and identify genes that play a role in the pathogenesis of the Photorhabdus-Heterorhabditis symbiosis. A draft genome sequence of P. temperata strain NC19 was generated. The 5.2-Mb genome was organized into 17 scaffolds and contained 4,808 coding sequences (CDS). A genetic approach was also pursued to identify mutants with altered motility. A bank of 10,000 P. temperata transposon mutants was generated and screened for altered motility patterns. Five classes of motility mutants were identified: (i) nonmotile mutants, (ii) mutants with defective or aberrant swimming motility, (iii) mutant swimmers that do not require NaCl or KCl, (iv) hyperswimmer mutants that swim at an accelerated rate, and (v) hyperswarmer mutants that are able to swarm on the surface of 1.25% agar. The transposon insertion sites for these mutants were identified and used to investigate other physiological properties, including insect pathogenesis. The motility-defective mutant P13-7 had an insertion in the RNase II gene and showed reduced virulence and production of extracellular factors. Genetic complementation of this mutant restored wild-type activity. These results demonstrate a role for RNA turnover in insect pathogenesis and other physiological functions. IMPORTANCE The relationship between Photorhabdus and entomopathogenic nematode Heterorhabditis represents a well-known mutualistic system that has potential as a biological control agent. The elucidation of the genome of the bacterial partner and role that RNase II plays in its life cycle has provided a greater understanding of Photorhabdus as both an insect pathogen and a nematode symbiont. PMID

  9. Aquaculture Genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomics chapter covers the basics of genome mapping and sequencing and the current status of several relevant species. The chapter briefly describes the development and use of (cDNA, BAC, etc.) libraries for mapping and obtaining specific sequence information. Other topics include comparative ...

  10. Resistance gene enrichment sequencing (RenSeq) enables reannotation of the NB-LRR gene family from sequenced plant genomes and rapid mapping of resistance loci in segregating populations.

    PubMed

    Jupe, Florian; Witek, Kamil; Verweij, Walter; Sliwka, Jadwiga; Pritchard, Leighton; Etherington, Graham J; Maclean, Dan; Cock, Peter J; Leggett, Richard M; Bryan, Glenn J; Cardle, Linda; Hein, Ingo; Jones, Jonathan D G

    2013-11-01

    RenSeq is an NB-LRR (nucleotide binding-site leucine-rich repeat) gene-targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB-LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB-LRRs and can be accessed through a genome browser that we provide. We compared these NB-LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ~80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum 'Heinz 1706' extended the NB-LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co-segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi-ber2) and S. ruiz-ceballosii (Rpi-rzc1), we were able to apply RenSeq successfully to identify markers that co-segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy-to-adapt Galaxy pipelines. PMID:23937694
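
    The bait-to-target comparison mentioned in this abstract rests on percent sequence identity. The sketch below is a simplified illustration, not the RenSeq pipeline: it assumes the two sequences are already aligned, ungapped and of equal length, and it treats the abstract's approximate 80% figure as a hard cutoff purely for demonstration. The function names and example sequences are made up.

```python
IDENTITY_CUTOFF = 80.0  # assumption: the abstract's "~80%" used as a fixed threshold


def percent_identity(bait: str, target: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not bait or len(bait) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(b == t for b, t in zip(bait.upper(), target.upper()))
    return 100.0 * matches / len(bait)


def likely_enriched(bait: str, target: str) -> bool:
    """Apply the assumed identity cutoff to a bait/target pair."""
    return percent_identity(bait, target) >= IDENTITY_CUTOFF


# Toy 10-base example with 8 matching positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
print(identity, likely_enriched("ACGTACGTAC", "ACGTACGTTT"))
```

    Real bait comparisons would normally start from a proper alignment (with gaps and much longer sequences), so this position-by-position count should be read only as the definition of the quantity, not as a substitute for an aligner.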