Sample records for reference sequence library

  1. Accelerating plant DNA barcode reference library construction using herbarium specimens: improved experimental techniques.

    PubMed

    Xu, Chao; Dong, Wenpan; Shi, Shuo; Cheng, Tao; Li, Changhao; Liu, Yanlei; Wu, Ping; Wu, Hongkun; Gao, Peng; Zhou, Shiliang

    2015-11-01

    A well-covered reference library is crucial for successful identification of species by DNA barcoding. The biggest difficulty in building such a reference library is the lack of material from the organisms. Herbarium collections are potentially an enormous resource of materials. In this study, we demonstrate that it is feasible to build such reference libraries using the reconstructed (self-primed PCR amplified) DNA from the herbarium specimens. We used 179 rosaceous specimens to test the effects of DNA reconstruction, 420 randomly sampled specimens to estimate the usable percentage and another 223 specimens of true cherries (Cerasus, Rosaceae) to test the coverage of usable specimens to the species. The barcodes rbcLb (the central four-sevenths of the rbcL gene) and matK were each amplified in two halves and sequenced on Roche GS 454 FLX+. DNA from the herbarium specimens was typically shorter than 300 bp. DNA reconstruction enabled amplification of fragments of 400-500 bp without introducing any sequence errors. About one-third of specimens in the national herbarium of China (PE) were proven usable after DNA reconstruction. The specimens in PE cover all Chinese true cherry species and 91.5% of vascular species listed in Flora of China. It is very possible to build well-covered reference libraries for DNA barcoding of vascular species in China. As exemplified in this study, DNA reconstruction and DNA-labelled next-generation sequencing can accelerate the construction of local reference libraries. By putting the local reference libraries together, a global library for DNA barcoding becomes closer to reality. © 2015 John Wiley & Sons Ltd.

  2. A new strategy for genome assembly using short sequence reads and reduced representation libraries.

    PubMed

    Young, Andrew L; Abaan, Hatice Ozel; Zerbino, Daniel; Mullikin, James C; Birney, Ewan; Margulies, Elliott H

    2010-02-01

    We have developed a novel approach for using massively parallel short-read sequencing to generate fast and inexpensive de novo genomic assemblies comparable to those generated by capillary-based methods. The ultrashort (<100 base) sequences generated by this technology pose specific biological and computational challenges for de novo assembly of large genomes. To account for this, we devised a method for experimentally partitioning the genome using reduced representation (RR) libraries prior to assembly. We use two restriction enzymes independently to create a series of overlapping fragment libraries, each containing a tractable subset of the genome. Together, these libraries allow us to reassemble the entire genome without the need of a reference sequence. As proof of concept, we applied this approach to sequence and assemble the majority of the 125-Mb Drosophila melanogaster genome. We subsequently demonstrate the accuracy of our assembly method with meaningful comparisons against the currently available D. melanogaster reference genome (dm3). The ease of assembly and accuracy for comparative genomics suggest that our approach will scale to future mammalian genome-sequencing efforts, saving both time and money without sacrificing quality.
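
As a rough illustration of the partitioning idea in this abstract, the sketch below performs an in silico digest with two enzymes applied independently and then keeps only fragments in a chosen size window. The recognition sites used (GAATTC for EcoRI, GGATCC for BamHI), the size window, and the function names are assumptions made for the example; the abstract does not name the enzymes or the size ranges used.

```python
import re


def digest(genome: str, site: str) -> list[str]:
    """Split a sequence at every occurrence of a recognition site.

    Simplification (assumption): the cut is placed at the start of the
    site, so every fragment after the first begins with the site itself.
    """
    # A lookahead finds every site position, including overlapping ones.
    cuts = [m.start() for m in re.finditer(f"(?={site})", genome)]
    bounds = [0] + [c for c in cuts if c > 0] + [len(genome)]
    return [genome[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments: list[str], lo: int, hi: int) -> list[str]:
    """Keep fragments whose length lies in [lo, hi], one 'library' per window."""
    return [f for f in fragments if lo <= len(f) <= hi]


def reduced_representation(genome: str, sites: dict[str, str],
                           lo: int, hi: int) -> dict[str, list[str]]:
    """Digest the genome with each enzyme independently and size-select."""
    return {name: size_select(digest(genome, site), lo, hi)
            for name, site in sites.items()}


if __name__ == "__main__":
    toy_genome = "AAGAATTCTTTTGGATCCAAAAGAATTCCC"
    enzymes = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}  # illustrative choice
    for enzyme, library in reduced_representation(toy_genome, enzymes, 5, 20).items():
        print(enzyme, library)
```

Because the two digests cut at different positions, fragments from one enzyme span the cut sites of the other, which is what lets the separate libraries overlap.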

  3. Plans and progress for building a Great Lakes fauna DNA barcode reference library

    EPA Science Inventory

    DNA reference libraries provide researchers with an important tool for assessing regional biodiversity by allowing unknown genetic sequences to be assigned identities, while also providing a means for taxonomists to validate identifications. Expanding the representation of Great...

  4. Using PATIMDB to Create Bacterial Transposon Insertion Mutant Libraries

    PubMed Central

    Urbach, Jonathan M.; Wei, Tao; Liberati, Nicole; Grenfell-Lee, Daniel; Villanueva, Jacinto; Wu, Gang; Ausubel, Frederick M.

    2015-01-01

    PATIMDB is a software package for facilitating the generation of transposon mutant insertion libraries. The software has two main functions: process tracking and automated sequence analysis. The process tracking function specifically includes recording the status and fates of multiwell plates and samples in various stages of library construction. Automated sequence analysis refers specifically to the pipeline of sequence analysis starting with ABI files from a sequencing facility and ending with insertion location identifications. The protocols in this unit describe installation and use of PATIMDB software. PMID:19343706

  5. Targeted RNA-Sequencing with Competitive Multiplex-PCR Amplicon Libraries

    PubMed Central

    Blomquist, Thomas M.; Crawford, Erin L.; Lovett, Jennie L.; Yeo, Jiyoun; Stanoszek, Lauren M.; Levin, Albert; Li, Jia; Lu, Mei; Shi, Leming; Muldrew, Kenneth; Willey, James C.

    2013-01-01

    Whole transcriptome RNA-sequencing is a powerful tool, but is costly and yields complex data sets that limit its utility in molecular diagnostic testing. A targeted quantitative RNA-sequencing method that is reproducible and reduces the number of sequencing reads required to measure transcripts over the full range of expression would be better suited to diagnostic testing. Toward this goal, we developed a competitive multiplex PCR-based amplicon sequencing library preparation method that a) targets only the sequences of interest and b) controls for inter-target variation in PCR amplification during library preparation by measuring each transcript native template relative to a known number of synthetic competitive template internal standard copies. To determine the utility of this method, we intentionally selected PCR conditions that would cause transcript amplification products (amplicons) to converge toward equimolar concentrations (normalization) during library preparation. We then tested whether this approach would enable accurate and reproducible quantification of each transcript across multiple library preparations, and at the same time reduce (through normalization) total sequencing reads required for quantification of transcript targets across a large range of expression. We demonstrate excellent reproducibility (R² = 0.997) with 97% accuracy to detect 2-fold change using External RNA Controls Consortium (ERCC) reference materials; high inter-day, inter-site and inter-library concordance (R² = 0.97–0.99) using FDA Sequencing Quality Control (SEQC) reference materials; and cross-platform concordance with both TaqMan qPCR (R² = 0.96) and whole transcriptome RNA-sequencing following “traditional” library preparation using Illumina NGS kits (R² = 0.94). Using this method, sequencing reads required to accurately quantify more than 100 targeted transcripts expressed over a 10⁷-fold range were reduced more than 10,000-fold, from 2.3×10⁹ to 1.4×10⁵ sequencing reads. These studies demonstrate that the competitive multiplex-PCR amplicon library preparation method presented here provides the quality control, reproducibility, and reduced sequencing reads necessary for development and implementation of targeted quantitative RNA-sequencing biomarkers in molecular diagnostic testing. PMID:24236095
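
The internal-standard idea described in this abstract can be sketched as a simple ratio: native template copies are estimated from the ratio of native reads to internal-standard reads, scaled by the known number of internal-standard copies. The function names and the example read counts are hypothetical. The second helper only checks the arithmetic of the reported read reduction, taking the read counts as 2.3×10⁹ and 1.4×10⁵.

```python
def native_copies(native_reads: int, standard_reads: int,
                  standard_copies: float) -> float:
    """Estimate native template copies from a competitive internal standard.

    Assumes native and standard templates amplify with equal efficiency,
    so their read ratio reflects their starting copy ratio.
    """
    if standard_reads <= 0:
        raise ValueError("internal standard must have at least one read")
    return native_reads / standard_reads * standard_copies


def fold_reduction(reads_before: float, reads_after: float) -> float:
    """Ratio of sequencing reads required before and after normalization."""
    if reads_after <= 0:
        raise ValueError("reads_after must be positive")
    return reads_before / reads_after


if __name__ == "__main__":
    # Hypothetical target: 200 native reads, 100 standard reads, 1000 standard copies.
    print(native_copies(200, 100, 1000))
    # Read counts quoted in the abstract.
    print(round(fold_reduction(2.3e9, 1.4e5)))
```

The ratio 2.3×10⁹ / 1.4×10⁵ is roughly 1.6×10⁴, consistent with the stated reduction of more than 10,000-fold.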

  6. Potential for DNA-based identification of Great Lakes fauna: match and mismatch between taxa inventories and DNA barcode libraries.

    PubMed

    Trebitz, Anett S; Hoffman, Joel C; Grant, George W; Billehus, Tyler M; Pilgrim, Erik M

    2015-07-22

    DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.

  7. Potential for DNA-based identification of Great Lakes fauna: match and mismatch between taxa inventories and DNA barcode libraries

    NASA Astrophysics Data System (ADS)

    Trebitz, Anett S.; Hoffman, Joel C.; Grant, George W.; Billehus, Tyler M.; Pilgrim, Erik M.

    2015-07-01

    DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.

  8. Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

    PubMed Central

    Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

    2012-01-01

    Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365

  9. Geoseq: a tool for dissecting deep-sequencing datasets.

    PubMed

    Gurtowski, James; Cancio, Anthony; Shah, Hardik; Levovitz, Chaya; George, Ajish; Homann, Robert; Sachidanandam, Ravi

    2010-10-12

    Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to a) identify differential isoform expression in mRNA-seq datasets, b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
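
The tile-based analysis described in this abstract can be sketched as follows: cut the reference sequence into fixed-length tiles and count how often each tile occurs in a library of reads. This is a minimal sketch, not Geoseq's implementation: it uses exact matching with a hash set instead of suffix arrays, and the tile length, step size, and function names are assumptions.

```python
from collections import Counter


def make_tiles(reference: str, tile_len: int, step: int = 1) -> list[tuple[int, str]]:
    """Return (offset, tile) pairs covering the reference sequence."""
    last_start = len(reference) - tile_len
    return [(i, reference[i:i + tile_len]) for i in range(0, last_start + 1, step)]


def tile_coverage(reference: str, reads: list[str],
                  tile_len: int, step: int = 1) -> list[tuple[int, int]]:
    """Count exact occurrences of each reference tile in a read library.

    Returns (offset, count) pairs in reference order. A suffix array over
    the library would answer the same queries without rescanning reads;
    a dictionary lookup is used here to keep the sketch short.
    """
    tiles = make_tiles(reference, tile_len, step)
    wanted = {tile for _, tile in tiles}
    counts: Counter[str] = Counter()
    for read in reads:
        for i in range(len(read) - tile_len + 1):
            kmer = read[i:i + tile_len]
            if kmer in wanted:
                counts[kmer] += 1
    return [(offset, counts[tile]) for offset, tile in tiles]


if __name__ == "__main__":
    library = ["TTACGTAC", "ACGTTT", "GGGG"]  # toy reads
    for offset, count in tile_coverage("ACGTACGA", library, tile_len=4):
        print(offset, count)
```

Plotting the counts against tile offset gives a per-position coverage profile of the reference within the chosen library.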

  10. BOKP: A DNA Barcode Reference Library for Monitoring Herbal Drugs in the Korean Pharmacopeia

    PubMed Central

    Liu, Jinxin; Shi, Linchun; Song, Jingyuan; Sun, Wei; Han, Jianping; Liu, Xia; Hou, Dianyun; Yao, Hui; Li, Mingyue; Chen, Shilin

    2017-01-01

    Herbal drug authentication is an important task in traditional medicine; however, it is challenged by the limitations of traditional authentication methods and the lack of trained experts. DNA barcoding is prominent in almost all areas of the biological sciences and has already been added to the British pharmacopeia and Chinese pharmacopeia for routine herbal drug authentication. However, DNA barcoding for the Korean pharmacopeia still requires significant improvements. Here, we present a DNA barcode reference library for herbal drugs in the Korean pharmacopeia and develop a species identification engine named KP-IDE to facilitate the adoption of this DNA reference library for herbal drug authentication. Using taxonomy records, specimen records, sequence records, and reference records, KP-IDE can identify an unknown specimen. Currently, there are 6,777 taxonomy records, 1,054 specimen records, 30,744 sequence records (ITS2 and psbA-trnH) and 285 reference records. Moreover, 27 herbal drug materials were collected from the Seoul Yangnyeongsi herbal medicine market to provide an example of real-world herbal drug authentication. Our study demonstrates the prospects of the DNA barcode reference library for the Korean pharmacopeia and provides future directions for the use of DNA barcoding for authenticating herbal drugs listed in other modern pharmacopeias. PMID:29326593

  11. Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries

    PubMed Central

    Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.

    2013-01-01

    Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
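
Fold enrichment, as used in this abstract, is the fraction of reads mapping to the target genome after capture divided by the same fraction before capture, computed per library. The sketch below shows that calculation; the function name and the read counts in the example are hypothetical and are not taken from the study.

```python
def fold_enrichment(pre_mapped: int, pre_total: int,
                    post_mapped: int, post_total: int) -> float:
    """Fold enrichment of the endogenous fraction for a single library.

    Equivalent to (post_mapped / post_total) / (pre_mapped / pre_total),
    rearranged so that integer inputs are divided only once.
    """
    if pre_mapped <= 0 or pre_total <= 0 or post_total <= 0:
        raise ValueError("read counts must be positive")
    return (post_mapped * pre_total) / (post_total * pre_mapped)


if __name__ == "__main__":
    # Hypothetical library: 0.1% endogenous reads before capture, 6% after.
    print(fold_enrichment(1_000, 1_000_000, 60_000, 1_000_000))
```

Note that the average pre-capture fraction (1.2%) and the maximum post-capture fraction (59%) quoted in the abstract come from different summaries, so they should not be divided directly to reproduce the reported 6- to 159-fold range.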

  12. The use of PacBio and Hi-C data in de novo assembly of the goat genome

    USDA-ARS's Scientific Manuscript database

    Generating de novo reference genome assemblies for non-model organisms is a laborious task that often requires a large amount of data from several sequencing platforms and cytogenetic surveys. By using PacBio sequence data and new library creation techniques, we present a de novo, high quality refer...

  13. An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

    PubMed

    Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

    2016-05-26

    Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, a comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform has not yet been carried out. Unlike reads from Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, this can result in reads of varying length. Whether commonly used software handles such data satisfactorily is unknown. Using universal human reference RNA as the starting material, RNaseIII and chemical fragmentation methods in library construction gave similar results in the number of genes and junctions discovered and in the accuracy of expression level estimates. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27% to genome, 86.46% to transcriptome), though 47.83% of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation as a guide, the mapping rate of TopHat2 significantly increased from 75.79% to 92.09%, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. We provide, for the first time, reference statistics on library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq on the Ion Proton platform. Chemical fragmentation performed as well as enzyme-based fragmentation. The optimal Ion Proton sequencing options and analysis software have been evaluated.

  14. Using herbarium-derived DNAs to assemble a large-scale DNA barcode library for the vascular plants of Canada.

    PubMed

    Kuzmina, Maria L; Braukmann, Thomas W A; Fazekas, Aron J; Graham, Sean W; Dewaard, Stephanie L; Rodrigues, Anuar; Bennett, Bruce A; Dickinson, Timothy A; Saarela, Jeffery M; Catling, Paul M; Newmaster, Steven G; Percy, Diana M; Fenneman, Erin; Lauron-Moreau, Aurélien; Ford, Bruce; Gillespie, Lynn; Subramanyam, Ragupathy; Whitton, Jeannette; Jennings, Linda; Metsger, Deborah; Warne, Connor P; Brown, Allison; Sears, Elizabeth; Dewaard, Jeremy R; Zakharov, Evgeny V; Hebert, Paul D N

    2017-12-01

    Constructing complete, accurate plant DNA barcode reference libraries can be logistically challenging for large-scale floras. Here we demonstrate the promise and challenges of using herbarium collections for building a DNA barcode reference library for the vascular plant flora of Canada. Our study examined 20,816 specimens representing 5076 of 5190 vascular plant species in Canada (98%). For 98% of the specimens, at least one of the DNA barcode regions was recovered from the plastid loci rbcL and matK and from the nuclear ITS2 region. We used beta regression to quantify the effects of age, type of preservation, and taxonomic affiliation (family) on DNA sequence recovery. Specimen age and method of preservation had significant effects on sequence recovery for all markers, but influenced some families more (e.g., Boraginaceae) than others (e.g., Asteraceae). Our DNA barcode library represents an unparalleled resource for metagenomic and ecological genetic research working on temperate and arctic biomes. An observed decline in sequence recovery with specimen age may be associated with poor primer matches, intragenomic variation (for ITS2), or inhibitory secondary compounds in some taxa.

  15. Using herbarium-derived DNAs to assemble a large-scale DNA barcode library for the vascular plants of Canada

    PubMed Central

    Kuzmina, Maria L.; Braukmann, Thomas W. A.; Fazekas, Aron J.; Graham, Sean W.; Dewaard, Stephanie L.; Rodrigues, Anuar; Bennett, Bruce A.; Dickinson, Timothy A.; Saarela, Jeffery M.; Catling, Paul M.; Newmaster, Steven G.; Percy, Diana M.; Fenneman, Erin; Lauron-Moreau, Aurélien; Ford, Bruce; Gillespie, Lynn; Subramanyam, Ragupathy; Whitton, Jeannette; Jennings, Linda; Metsger, Deborah; Warne, Connor P.; Brown, Allison; Sears, Elizabeth; Dewaard, Jeremy R.; Zakharov, Evgeny V.; Hebert, Paul D. N.

    2017-01-01

    Premise of the study: Constructing complete, accurate plant DNA barcode reference libraries can be logistically challenging for large-scale floras. Here we demonstrate the promise and challenges of using herbarium collections for building a DNA barcode reference library for the vascular plant flora of Canada. Methods: Our study examined 20,816 specimens representing 5076 of 5190 vascular plant species in Canada (98%). For 98% of the specimens, at least one of the DNA barcode regions was recovered from the plastid loci rbcL and matK and from the nuclear ITS2 region. We used beta regression to quantify the effects of age, type of preservation, and taxonomic affiliation (family) on DNA sequence recovery. Results: Specimen age and method of preservation had significant effects on sequence recovery for all markers, but influenced some families more (e.g., Boraginaceae) than others (e.g., Asteraceae). Discussion: Our DNA barcode library represents an unparalleled resource for metagenomic and ecological genetic research working on temperate and arctic biomes. An observed decline in sequence recovery with specimen age may be associated with poor primer matches, intragenomic variation (for ITS2), or inhibitory secondary compounds in some taxa. PMID:29299394

  16. Sequenced sorghum mutant library- an efficient platform for discovery of causal gene mutations

    USDA-ARS's Scientific Manuscript database

    Ethyl methanesulfonate (EMS) efficiently generates high-density mutations in genomes. We applied whole-genome sequencing to 256 phenotyped mutant lines of sorghum (Sorghum bicolor L. Moench) to 16x coverage. Comparisons with the reference sequence revealed >1.8 million canonical EMS-induced G/C to A...

  17. Moorea BIOCODE barcode library as a tool for understanding predator-prey interactions: insights into the diet of common predatory coral reef fishes

    NASA Astrophysics Data System (ADS)

    Leray, M.; Boehm, J. T.; Mills, S. C.; Meyer, C. P.

    2012-06-01

    Identifying species involved in consumer-resource interactions is one of the main limitations in the construction of food webs. DNA barcoding of prey items in predator guts provides a valuable tool for characterizing trophic interactions, but the method relies on the availability of reference sequences to which prey sequences can be matched. In this study, we demonstrate that the COI sequence library of the Moorea BIOCODE project, an ecosystem-level barcode initiative, enables the identification of a large proportion of semi-digested fish, crustacean and mollusks found in the guts of three Hawkfish and two Squirrelfish species. While most prey remains lacked diagnostic morphological characters, 94% of the prey found in 67 fishes had >98% sequence similarity with BIOCODE reference sequences. Using this species-level prey identification, we demonstrate how DNA barcoding can provide insights into resource partitioning, predator feeding behaviors and the consequences of predation on ecosystem function.

  18. BAC-end sequence-based SNP mining in Allotetraploid Cotton (Gossypium) utilizing re-sequencing data, phylogenetic inferences and perspectives for genetic mapping

    USDA-ARS's Scientific Manuscript database

    A bacterial artificial chromosome (BAC) library and BAC-end sequences for Gossypium hirsutum L. have recently been developed. Here we report on genomic-based genome-wide SNP mining utilizing re-sequencing data with a BAC-end sequence reference for twelve G. hirsutum L. lines, one G. barbadense L. li...

  19. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

    PubMed

    Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

    2015-11-18

    RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains considerable room for optimizing its workflow, in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO referring to CVG demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as well as whole genome analyses.

  20. First DNA Barcode Reference Library for the Identification of South American Freshwater Fish from the Lower Paraná River

    PubMed Central

    Brancolini, Florencia; del Pazo, Felipe; Posner, Victoria Maria; Grimberg, Alexis; Arranz, Silvia Eda

    2016-01-01

    Valid fish species identification is essential for biodiversity conservation and fisheries management. Here, we provide a sequence reference library based on mitochondrial cytochrome c oxidase subunit I for a valid identification of 79 freshwater fish species from the Lower Paraná River. Neighbour-joining analysis based on K2P genetic distances formed non-overlapping clusters for almost all species with a ≥99% bootstrap support each. Identification was successful for 97.8% of species as the minimum genetic distance to the nearest neighbour exceeded the maximum intraspecific distance in all these cases. A barcoding gap of 2.5% was apparent for the whole data set with the exception of four cases. Within-species distances ranged from 0.00% to 7.59%, while interspecific distances varied between 4.06% and 19.98%, without considering Odontesthes species with a minimum genetic distance of 0%. Sequence library validation was performed by applying BOLD's BIN analysis tool, Poisson Tree Processes model and Automatic Barcode Gap Discovery, along with a reliable taxonomic assignment by experts. Exhaustive revision of vouchers was performed when a conflicting assignment was detected after sequence analysis and BIN discordance evaluation. Thus, the sequence library presented here can be confidently used as a benchmark for identification of half of the fish species recorded for the Lower Paraná River. PMID:27442116
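
The identification criterion in this abstract (minimum distance to the nearest neighbouring species exceeds the maximum intraspecific distance) can be sketched with the Kimura two-parameter (K2P) distance, d = -½ ln((1 − 2P − Q)·√(1 − 2Q)), where P and Q are the proportions of transitions and transversions. The sketch assumes pre-aligned sequences of equal length. The function names and toy data are hypothetical and this is not the authors' pipeline.

```python
import math

PURINES = {"A", "G"}


def k2p(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    n = transitions = transversions = 0
    for x, y in zip(a, b):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / n, transversions / n
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if term <= 0:
        raise ValueError("sequences too divergent for K2P")
    return -0.5 * math.log(term)


def has_barcode_gap(seqs: dict[str, list[str]], species: str) -> bool:
    """True if the nearest other species is farther than the widest intraspecific pair."""
    own = seqs[species]
    intra = [k2p(a, b) for i, a in enumerate(own) for b in own[i + 1:]]
    inter = [k2p(a, b) for a in own
             for other, group in seqs.items() if other != species
             for b in group]
    if not inter:
        raise ValueError("need at least one other species for comparison")
    return min(inter) > max(intra, default=0.0)
```

Applying `has_barcode_gap` to every species in a library and taking the proportion that return True gives a figure comparable to the identification success rate quoted above.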

  1. Utility of GenBank and the Barcode of Life Data Systems (BOLD) for the identification of forensically important Diptera from Belgium and France

    PubMed Central

    Sonet, Gontran; Jordaens, Kurt; Braet, Yves; Bourguignon, Luc; Dupont, Eréna; Backeljau, Thierry; De Meyer, Marc; Desmyter, Stijn

    2013-01-01

    Fly larvae living on corpses can be used to estimate post-mortem intervals. The identification of these flies is decisive in forensic casework and can be facilitated by using DNA barcodes provided that a representative and comprehensive reference library of DNA barcodes is available. We constructed a local (Belgium and France) reference library of 85 sequences of the COI DNA barcode fragment (mitochondrial cytochrome c oxidase subunit I gene), from 16 fly species of forensic interest (Calliphoridae, Muscidae, Fanniidae). This library was then used to evaluate the ability of two public libraries (GenBank and the Barcode of Life Data Systems – BOLD) to identify specimens from Belgian and French forensic cases. The public libraries indeed allow a correct identification of most specimens. Yet, some of the identifications remain ambiguous and some forensically important fly species are not, or insufficiently, represented in the reference libraries. Several search options offered by GenBank and BOLD can be used to further improve the identifications obtained from both libraries using DNA barcodes. PMID:24453564

  2. SCARF: maximizing next-generation EST assemblies for evolutionary and population genomic analyses.

    PubMed

    Barker, Michael S; Dlugosch, Katrina M; Reddy, A Chaitanya C; Amyotte, Sarah N; Rieseberg, Loren H

    2009-02-15

    Scaffolded and Corrected Assembly of Roche 454 (SCARF) is a next-generation sequence assembly tool for evolutionary genomics that is designed especially for assembling 454 EST sequences against high-quality reference sequences from related species. The program was created to knit together 454 contigs that do not assemble during traditional de novo assembly, using a reference sequence library to orient the 454 sequences. SCARF is freely available at http://msbarker.com/software.htm, and is released under the open source GPLv3 license (http://www.opensource.org/licenses/gpl-3.0.html).

  3. Accelerated construction of a regional DNA-barcode reference library: Caddisflies (Trichoptera) in the Great Smoky Mountains National Park

    USGS Publications Warehouse

    Zhou, X.; Robinson, J.L.; Geraci, C.J.; Parker, C.R.; Flint, O.S.; Etnier, D.A.; Ruiter, D.; DeWalt, R.E.; Jacobus, L.M.; Hebert, P.D.N.

    2011-01-01

    Deoxyribonucleic acid (DNA) barcoding is an effective tool for species identification and lifestage association in a wide range of animal taxa. We developed a strategy for rapid construction of a regional DNA-barcode reference library and used the caddisflies (Trichoptera) of the Great Smoky Mountains National Park (GSMNP) as a model. Nearly 1000 cytochrome c oxidase subunit I (COI) sequences, representing 209 caddisfly species previously recorded from GSMNP, were obtained from the global Trichoptera Barcode of Life campaign. Most of these sequences were collected from outside the GSMNP area. Another 645 COI sequences, representing 80 species, were obtained from specimens collected in a 3-d bioblitz (short-term, intense sampling program) in GSMNP. The joint collections provided barcode coverage for 212 species, 91% of the GSMNP fauna. Inclusion of samples from other localities greatly expedited construction of the regional DNA-barcode reference library. This strategy increased intraspecific divergence and decreased average distances to nearest neighboring species, but the DNA-barcode library was able to differentiate 93% of the GSMNP Trichoptera species examined. Global barcoding projects will aid construction of regional DNA-barcode libraries, but local surveys make crucial contributions to progress by contributing rare or endemic species and full-length barcodes generated from high-quality DNA. DNA taxonomy is not a goal of our present work, but the investigation of COI divergence patterns in caddisflies is providing new insights into broader biodiversity patterns in this group and has directed attention to various issues, ranging from the need to re-evaluate species taxonomy with integrated morphological and molecular evidence to the necessity of an appropriate interpretation of barcode analyses and its implications in understanding species diversity (in contrast to a simple claim for barcoding failure).

  4. Combination of Multiple Spectral Libraries Improves the Current Search Methods Used to Identify Missing Proteins in the Chromosome-Centric Human Proteome Project.

    PubMed

    Cho, Jin-Young; Lee, Hyoung-Joo; Jeong, Seul-Ki; Kim, Kwang-Youl; Kwon, Kyung-Hoon; Yoo, Jong Shin; Omenn, Gilbert S; Baker, Mark S; Hancock, William S; Paik, Young-Ki

    2015-12-04

    Approximately 2.9 billion long base-pair human reference genome sequences are known to encode some 20 000 representative proteins. However, 3000 proteins, that is, ~15% of all proteins, have no or very weak proteomic evidence and are still missing. Missing proteins may be present in rare samples in very low abundance or be only temporarily expressed, causing problems in their detection and protein profiling. In particular, some technical limitations cause missing proteins to remain unassigned. For example, current mass spectrometry techniques have high limits and error rates for the detection of complex biological samples. An insufficient proteome coverage in a reference sequence database and spectral library also raises major issues. Thus, the development of a better strategy that results in greater sensitivity and accuracy in the search for missing proteins is necessary. To this end, we used a new strategy, which combines a reference spectral library search and a simulated spectral library search, to identify missing proteins. We built the human iRefSPL, which contains the original human reference spectral library and additional peptide sequence-spectrum match entries from other species. We also constructed the human simSPL, which contains the simulated spectra of 173 907 human tryptic peptides determined by MassAnalyzer (version 2.3.1). To prove the enhanced analytical performance of the combination of the human iRefSPL and simSPL methods for the identification of missing proteins, we attempted to reanalyze the placental tissue data set (PXD000754). The data from each experiment were analyzed using PeptideProphet, and the results were combined using iProphet. For the quality control, we applied the class-specific false-discovery rate filtering method. All of the results were filtered at a false-discovery rate of <1% at the peptide and protein levels. The quality-controlled results were then cross-checked with the neXtProt DB (2014-09-19 release). 
    The two spectral libraries, iRefSPL and simSPL, were designed to ensure no overlap of the proteome coverage. They were shown to be complementary to spectral library searching and significantly increased the number of matches. From this trial, 12 new missing proteins were identified that passed the following criterion: at least 2 peptides of 7 or more amino acids in length or one of 9 or more amino acids in length with one or more unique sequences. Thus, the iRefSPL and simSPL combination can be used to help identify peptides that have not been detected by conventional sequence database searches with improved sensitivity and a low error rate.

  5. Modeling genome coverage in single-cell sequencing

    PubMed Central

    Daley, Timothy; Smith, Andrew D.

    2014-01-01

    Motivation: Single-cell DNA sequencing is necessary for examining genetic variation at the cellular level, which remains hidden in bulk sequencing experiments. But because they begin with such small amounts of starting material, the amount of information that is obtained from single-cell sequencing experiment is highly sensitive to the choice of protocol employed and variability in library preparation. In particular, the fraction of the genome represented in single-cell sequencing libraries exhibits extreme variability due to quantitative biases in amplification and loss of genetic material. Results: We propose a method to predict the genome coverage of a deep sequencing experiment using information from an initial shallow sequencing experiment mapped to a reference genome. The observed coverage statistics are used in a non-parametric empirical Bayes Poisson model to estimate the gain in coverage from deeper sequencing. This approach allows researchers to know statistical features of deep sequencing experiments without actually sequencing deeply, providing a basis for optimizing and comparing single-cell sequencing protocols or screening libraries. Availability and implementation: The method is available as part of the preseq software package. Source code is available at http://smithlabresearch.org/preseq. Contact: andrewds@usc.edu Supplementary information: Supplementary material is available at Bioinformatics online. PMID:25107873

  6. DNA barcode data accurately assign higher spider taxa

    PubMed Central

    Coddington, Jonathan A.; Agnarsson, Ingi; Cheng, Ren-Chung; Čandek, Klemen; Driskell, Amy; Frick, Holger; Gregorič, Matjaž; Kostanjšek, Rok; Kropf, Christian; Kweskin, Matthew; Lokovšek, Tjaša; Pipan, Miha; Vidergar, Nina

    2016-01-01

    The use of unique DNA sequences as a method for taxonomic identification is no longer fundamentally controversial, even though debate continues on the best markers, methods, and technology to use. Although both existing databanks such as GenBank and BOLD, as well as reference taxonomies, are imperfect, in best case scenarios “barcodes” (whether single or multiple, organelle or nuclear, loci) clearly are an increasingly fast and inexpensive method of identification, especially as compared to manual identification of unknowns by increasingly rare expert taxonomists. Because most species on Earth are undescribed, a complete reference database at the species level is impractical in the near term. The question therefore arises whether unidentified species can, using DNA barcodes, be accurately assigned to more inclusive groups such as genera and families—taxonomic ranks of putatively monophyletic groups for which the global inventory is more complete and stable. We used a carefully chosen test library of CO1 sequences from 49 families, 313 genera, and 816 species of spiders to assess the accuracy of genus and family-level assignment. We used BLAST queries of each sequence against the entire library and got the top ten hits. The percent sequence identity was reported from these hits (PIdent, range 75–100%). Accurate assignment of higher taxa (PIdent above which errors totaled less than 5%) occurred for genera at PIdent values >95 and families at PIdent values ≥ 91, suggesting these as heuristic thresholds for accurate generic and familial identifications in spiders. Accuracy of identification increases with numbers of species/genus and genera/family in the library; above five genera per family and fifteen species per genus all higher taxon assignments were correct. We propose that using percent sequence identity between conventional barcode sequences may be a feasible and reasonably accurate method to identify animals to family/genus. 
    However, the quality of the underlying database impacts accuracy of results; many outliers in our dataset could be attributed to taxonomic and/or sequencing errors in BOLD and GenBank. It seems that an accurate and complete reference library of families and genera of life could provide accurate higher level taxonomic identifications cheaply and accessibly, within years rather than decades. PMID:27547527
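
    The heuristic thresholds reported in this abstract (genus at PIdent >95, family at PIdent ≥91) can be expressed as a small decision rule. The sketch below is an illustration only: the function names and the dictionary keys `pident`, `genus` and `family` are hypothetical and do not correspond to any particular BLAST output parser.

```python
# Thresholds taken from the abstract; they were proposed for spiders only.
GENUS_MIN_PIDENT = 95.0   # genus accepted when PIdent is strictly above this
FAMILY_MIN_PIDENT = 91.0  # family accepted when PIdent is at or above this


def assignable_rank(pident: float):
    """Return the most specific higher rank supported by a percent identity."""
    if pident > GENUS_MIN_PIDENT:
        return "genus"
    if pident >= FAMILY_MIN_PIDENT:
        return "family"
    return None


def assign_higher_taxon(top_hit: dict):
    """Map a best BLAST hit to (rank, taxon name), or (None, None).

    `top_hit` is a hypothetical record with keys 'pident', 'genus', 'family'.
    """
    rank = assignable_rank(top_hit["pident"])
    if rank is None:
        return (None, None)
    return (rank, top_hit[rank])
```

    For example, a hit at 93.2% identity would be reported at family level only, while a hit at 90% would be left unassigned.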

  7. Spectrum-to-Spectrum Searching Using a Proteome-wide Spectral Library

    PubMed Central

    Yen, Chia-Yu; Houel, Stephane; Ahn, Natalie G.; Old, William M.

    2011-01-01

    The unambiguous assignment of tandem mass spectra (MS/MS) to peptide sequences remains a key unsolved problem in proteomics. Spectral library search strategies have emerged as a promising alternative for peptide identification, in which MS/MS spectra are directly compared against a reference library of confidently assigned spectra. Two problems relate to library size. First, reference spectral libraries are limited to rediscovery of previously identified peptides and are not applicable to new peptides, because of their incomplete coverage of the human proteome. Second, problems arise when searching a spectral library the size of the entire human proteome. We observed that traditional dot product scoring methods do not scale well with spectral library size, showing reduction in sensitivity when library size is increased. We show that this problem can be addressed by optimizing scoring metrics for spectrum-to-spectrum searches with large spectral libraries. MS/MS spectra for the 1.3 million predicted tryptic peptides in the human proteome are simulated using a kinetic fragmentation model (MassAnalyzer version 2.1) to create a proteome-wide simulated spectral library. Searches of the simulated library increase MS/MS assignments by 24% compared with Mascot, when using probabilistic and rank based scoring methods. The proteome-wide coverage of the simulated library leads to 11% increase in unique peptide assignments, compared with parallel searches of a reference spectral library. Further improvement is attained when reference spectra and simulated spectra are combined into a hybrid spectral library, yielding 52% increased MS/MS assignments compared with Mascot searches. Our study demonstrates the advantages of using probabilistic and rank based scores to improve performance of spectrum-to-spectrum search strategies. PMID:21532008
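
    For orientation, the "traditional dot product scoring" that this abstract uses as its baseline can be sketched as a normalized dot product (cosine similarity) between two binned spectra. This is a generic illustration, not the scoring code of the study: the 1.0 m/z bin width, the function names and the peak-list representation as (m/z, intensity) pairs are all assumptions made here.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def dot_product_score(query, reference, bin_width=1.0):
    """Normalized dot product between two peak lists; ranges from 0.0 to 1.0."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    # Only bins present in both spectra contribute to the numerator
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return dot / norm if norm else 0.0
```

    A library search in this style would score the query against every candidate library spectrum and keep the highest-scoring match; the abstract's point is that this ranking loses sensitivity as the library grows.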

  8. Comparison of methods for library construction and short read annotation of shellfish viral metagenomes.

    PubMed

    Wei, Hong-Ying; Huang, Sheng; Wang, Jiang-Yong; Gao, Fang; Jiang, Jing-Zhe

    2018-03-01

    The emergence and widespread use of high-throughput sequencing technologies have promoted metagenomic studies on environmental or animal samples. Library construction for metagenome sequencing and annotation of the produced sequence reads are important steps in such studies and influence the quality of metagenomic data. In this study, we collected some marine mollusk samples, such as Crassostrea hongkongensis, Chlamys farreri, and Ruditapes philippinarum, from coastal areas in South China. These samples were divided into two batches to compare two library construction methods for shellfish viral metagenome. Our analysis showed that reverse-transcribing RNA into cDNA and then amplifying it simultaneously with DNA by whole genome amplification (WGA) yielded a larger amount of DNA compared to using only WGA or WTA (whole transcriptome amplification). Moreover, higher quality libraries were obtained by agarose gel extraction rather than with AMPure bead size selection. However, the latter can also provide good results if combined with the adjustment of the filter parameters. This, together with its simplicity, makes it a viable alternative. Finally, we compared three annotation tools (BLAST, DIAMOND, and Taxonomer) and two reference databases (NCBI's NR and Uniprot's Uniref). Considering the limitations of computing resources and data transfer speed, we propose the use of DIAMOND with Uniref for annotating metagenomic short reads as its running speed can guarantee a good annotation rate. This study may serve as a useful reference for selecting methods for shellfish viral metagenome library construction and read annotation.

  9. Aptaligner: automated software for aligning pseudorandom DNA X-aptamers from next-generation sequencing data.

    PubMed

    Lu, Emily; Elizondo-Riojas, Miguel-Angel; Chang, Jeffrey T; Volk, David E

    2014-06-10

    Next-generation sequencing results from bead-based aptamer libraries have demonstrated that traditional DNA/RNA alignment software is insufficient. This is particularly true for X-aptamers containing specialty bases (W, X, Y, Z, ...) that are identified by special encoding. Thus, we sought an automated program that uses the inherent design scheme of bead-based X-aptamers to create a hypothetical reference library and Markov modeling techniques to provide improved alignments. Aptaligner provides this feature as well as length error and noise level cutoff features, is parallelized to run on multiple central processing units (cores), and sorts sequences from a single chip into projects and subprojects.

  10. Clinical Application of Picodroplet Digital PCR Technology for Rapid Detection of EGFR T790M in Next-Generation Sequencing Libraries and DNA from Limited Tumor Samples.

    PubMed

    Borsu, Laetitia; Intrieri, Julie; Thampi, Linta; Yu, Helena; Riely, Gregory; Nafa, Khedoudja; Chandramohan, Raghu; Ladanyi, Marc; Arcila, Maria E

    2016-11-01

    Although next-generation sequencing (NGS) is a robust technology for comprehensive assessment of EGFR-mutant lung adenocarcinomas with acquired resistance to tyrosine kinase inhibitors, it may not provide sufficiently rapid and sensitive detection of the EGFR T790M mutation, the most clinically relevant resistance biomarker. Here, we describe a digital PCR (dPCR) assay for rapid T790M detection on aliquots of NGS libraries prepared for comprehensive profiling, fully maximizing broad genomic analysis on limited samples. Tumor DNAs from patients with EGFR-mutant lung adenocarcinomas and acquired resistance to epidermal growth factor receptor inhibitors were prepared for Memorial Sloan-Kettering-Integrated Mutation Profiling of Actionable Cancer Targets sequencing, a hybrid capture-based assay interrogating 410 cancer-related genes. Precapture library aliquots were used for rapid EGFR T790M testing by dPCR, and results were compared with NGS and locked nucleic acid-PCR Sanger sequencing (reference high sensitivity method). Seventy resistance samples showed 99% concordance with the reference high sensitivity method in accuracy studies. Input as low as 2.5 ng provided a sensitivity of 1% and improved further with increasing DNA input. dPCR on libraries required less DNA and showed better performance than direct genomic DNA. dPCR on NGS libraries is a robust and rapid approach to EGFR T790M testing, allowing most economical utilization of limited material for comprehensive assessment. The same assay can also be performed directly on any limited DNA source and cell-free DNA. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  11. A reliable DNA barcode reference library for the identification of the North European shelf fish fauna.

    PubMed

    Knebelsberger, Thomas; Landi, Monica; Neumann, Hermann; Kloppmann, Matthias; Sell, Anne F; Campbell, Patrick D; Laakmann, Silke; Raupach, Michael J; Carvalho, Gary R; Costa, Filipe O

    2014-09-01

    Valid fish species identification is an essential step both for fundamental science and fisheries management. The traditional identification is mainly based on external morphological diagnostic characters, leading to inconsistent results in many cases. Here, we provide a sequence reference library based on mitochondrial cytochrome c oxidase subunit I (COI) for a valid identification of 93 North Atlantic fish species originating from the North Sea and adjacent waters, including many commercially exploited species. Neighbour-joining analysis based on K2P genetic distances formed nonoverlapping clusters for all species with a ≥99% bootstrap support each. Identification was successful for 100% of the species as the minimum genetic distance to the nearest neighbour always exceeded the maximum intraspecific distance. A barcoding gap was apparent for the whole data set. Within-species distances ranged from 0 to 2.35%, while interspecific distances varied between 3.15 and 28.09%. Distances between congeners were on average 51-fold higher than those within species. The validation of the sequence library by applying BOLD's barcode index number (BIN) analysis tool and a ranking system demonstrated high taxonomic reliability of the DNA barcodes for 85% of the investigated fish species. Thus, the sequence library presented here can be confidently used as a benchmark for identification of at least two-thirds of the typical fish species recorded for the North Sea. © 2014 John Wiley & Sons Ltd.
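
    The success criterion stated in this abstract (the minimum distance to the nearest neighbour exceeds the maximum intraspecific distance) can be checked per species from a pairwise distance matrix. The following is a minimal sketch under stated assumptions: the function name `barcoding_gap`, the list-of-lists matrix layout and the return format are hypothetical choices, and the distances are assumed to be precomputed (for example as K2P distances).

```python
from itertools import combinations


def barcoding_gap(labels, dist):
    """Per-species barcoding-gap check.

    labels: species name for each specimen (hypothetical input format).
    dist:   square symmetric matrix of pairwise distances, same order as labels.
    Returns {species: (max_intraspecific, min_interspecific, has_gap)}.
    """
    max_intra = {s: 0.0 for s in labels}
    min_inter = {s: float("inf") for s in labels}
    for i, j in combinations(range(len(labels)), 2):
        d = dist[i][j]
        if labels[i] == labels[j]:
            # Same species: track the largest within-species distance
            max_intra[labels[i]] = max(max_intra[labels[i]], d)
        else:
            # Different species: d is a nearest-neighbour candidate for both
            for s in (labels[i], labels[j]):
                min_inter[s] = min(min_inter[s], d)
    return {
        s: (max_intra[s], min_inter[s], min_inter[s] > max_intra[s])
        for s in max_intra
    }
```

    A species with `has_gap` set to false is one where a specimen of another species is at least as close as its most divergent conspecific, which is the situation in which distance-based identification becomes ambiguous.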

  12. Improved serial analysis of V1 ribosomal sequence tags (SARST-V1) provides a rapid, comprehensive, sequence-based characterization of bacterial diversity and community composition.

    PubMed

    Yu, Zhongtang; Yu, Marie; Morrison, Mark

    2006-04-01

    Serial analysis of ribosomal sequence tags (SARST) is a recently developed technology that can generate large 16S rRNA gene (rrs) sequence data sets from microbiomes, but there are numerous enzymatic and purification steps required to construct the ribosomal sequence tag (RST) clone libraries. We report here an improved SARST method, which still targets the V1 hypervariable region of rrs genes, but reduces the number of enzymes, oligonucleotides, reagents, and technical steps needed to produce the RST clone libraries. The new method, hereafter referred to as SARST-V1, was used to examine the eubacterial diversity present in community DNA recovered from the microbiome resident in the ovine rumen. The 190 sequenced clones contained 1055 RSTs and no less than 236 unique phylotypes (based on ≥95% sequence identity) that were assigned to eight different eubacterial phyla. Rarefaction and monomolecular curve analyses predicted that the complete RST clone library contains 99% of the 353 unique phylotypes predicted to exist in this microbiome. When compared with ribosomal intergenic spacer analysis (RISA) of the same community DNA sample, as well as a compilation of nine previously published conventional rrs clone libraries prepared from the same type of samples, the RST clone library provided a more comprehensive characterization of the eubacterial diversity present in rumen microbiomes. As such, SARST-V1 should be a useful tool applicable to comprehensive examination of diversity and composition in microbiomes and offers an affordable, sequence-based method for diversity analysis.

  13. Sequencing and assembly of the 22-Gb loblolly pine genome.

    PubMed

    Zimin, Aleksey; Stevens, Kristian A; Crepeau, Marc W; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L; de Jong, Pieter J; Neale, David B; Salzberg, Steven L; Yorke, James A; Langley, Charles H

    2014-03-01

    Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer "super-reads," rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp.
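
    The N50 statistic quoted at the end of this abstract is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the computation follows; the function name `n50` and the plain list of scaffold lengths are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example: total is 32, and the two longest scaffolds (8 + 8) cover half of it
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # prints 8
```

    Read this way, an N50 scaffold size of 66.9 kbp means that half of the 20.15 billion assembled bases lie in scaffolds at least that long.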

  14. Impact of library preparation protocols and template quantity on the metagenomic reconstruction of a mock microbial community

    DOE PAGES

    Bowers, Robert M.; Clum, Alicia; Tice, Hope; ...

    2015-10-24

    Background: The rapid development of sequencing technologies has provided access to environments that were either once thought inhospitable to life altogether or that contain too few cells to be analyzed using genomics approaches. While 16S rRNA gene microbial community sequencing has revolutionized our understanding of community composition and diversity over time and space, it only provides a crude estimate of microbial functional and metabolic potential. Alternatively, shotgun metagenomics allows comprehensive sampling of all genetic material in an environment, without any underlying primer biases. Until recently, one of the major bottlenecks of shotgun metagenomics has been the requirement for large initial DNA template quantities during library preparation. Results: Here, we investigate the effects of varying template concentrations across three low biomass library preparation protocols on their ability to accurately reconstruct a mock microbial community of known composition. We analyze the effects of input DNA quantity and library preparation method on library insert size, GC content, community composition, assembly quality and metagenomic binning. We found that library preparation method and the amount of starting material had significant impacts on the mock community metagenomes. In particular, GC content shifted towards more GC rich sequences at the lower input quantities regardless of library prep method, the number of low quality reads that could not be mapped to the reference genomes increased with decreasing input quantities, and the different library preparation methods had an impact on overall metagenomic community composition. Conclusions: This benchmark study provides recommendations for library creation of representative and minimally biased metagenome shotgun sequencing, enabling insights into functional attributes of low biomass ecosystem microbial communities.

  15. Impact of library preparation protocols and template quantity on the metagenomic reconstruction of a mock microbial community

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowers, Robert M.; Clum, Alicia; Tice, Hope

    Background: The rapid development of sequencing technologies has provided access to environments that were either once thought inhospitable to life altogether or that contain too few cells to be analyzed using genomics approaches. While 16S rRNA gene microbial community sequencing has revolutionized our understanding of community composition and diversity over time and space, it only provides a crude estimate of microbial functional and metabolic potential. Alternatively, shotgun metagenomics allows comprehensive sampling of all genetic material in an environment, without any underlying primer biases. Until recently, one of the major bottlenecks of shotgun metagenomics has been the requirement for large initial DNA template quantities during library preparation. Results: Here, we investigate the effects of varying template concentrations across three low biomass library preparation protocols on their ability to accurately reconstruct a mock microbial community of known composition. We analyze the effects of input DNA quantity and library preparation method on library insert size, GC content, community composition, assembly quality and metagenomic binning. We found that library preparation method and the amount of starting material had significant impacts on the mock community metagenomes. In particular, GC content shifted towards more GC rich sequences at the lower input quantities regardless of library prep method, the number of low quality reads that could not be mapped to the reference genomes increased with decreasing input quantities, and the different library preparation methods had an impact on overall metagenomic community composition. Conclusions: This benchmark study provides recommendations for library creation of representative and minimally biased metagenome shotgun sequencing, enabling insights into functional attributes of low biomass ecosystem microbial communities.

  16. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs.

    PubMed

    Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M

    2017-06-01

    The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange events and exploring cellular heterogeneity. Strand-seq is a single-cell sequencing technology that resolves the individual homologs within a cell by restricting sequence analysis to the DNA template strands used during DNA replication. This protocol, which takes up to 4 d to complete, relies on the directionality of DNA, in which each single strand of a DNA molecule is distinguished based on its 5'-3' orientation. Culturing cells in a thymidine analog for one round of cell division labels nascent DNA strands, allowing for their selective removal during genomic library construction. To preserve directionality of template strands, genomic preamplification is bypassed and labeled nascent strands are nicked and not amplified during library preparation. Each single-cell library is multiplexed for pooling and sequencing, and the resulting sequence data are aligned, mapping to either the minus or plus strand of the reference genome, to assign template strand states for each chromosome in the cell. The major adaptations to conventional single-cell sequencing protocols include harvesting of daughter cells after a single round of BrdU incorporation, bypassing of whole-genome amplification, and removal of the BrdU + strand during Strand-seq library preparation. By sequencing just template strands, the structure and identity of each homolog are preserved.
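
    The final analysis step described in this abstract, assigning a template strand state to each chromosome from reads that map to the plus or minus strand, can be illustrated with a simple read-count rule. This sketch is not the published Strand-seq analysis: the function name, the state labels and the 0.8 purity cut-off are hypothetical choices made here for illustration.

```python
def strand_state(plus_reads: int, minus_reads: int, purity: float = 0.8) -> str:
    """Classify one chromosome in one cell by the strand its reads map to.

    purity is a hypothetical cut-off: the fraction of reads that must map
    to a single strand before both homologs are called for that strand.
    """
    total = plus_reads + minus_reads
    if total == 0:
        return "no-data"
    plus_fraction = plus_reads / total
    if plus_fraction >= purity:
        return "plus-only"   # both homologs inherited as plus-strand templates
    if plus_fraction <= 1 - purity:
        return "minus-only"  # both homologs inherited as minus-strand templates
    return "mixed"           # one template strand of each orientation
```

    Chromosomes called "mixed" are the informative ones for separating homologs, because reads from the two parental copies then fall on opposite strands.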

  17. Optimization of the genotyping-by-sequencing strategy for population genomic analysis in conifers.

    PubMed

    Pan, Jin; Wang, Baosheng; Pei, Zhi-Yong; Zhao, Wei; Gao, Jie; Mao, Jian-Feng; Wang, Xiao-Ru

    2015-07-01

    Flexibility and low cost make genotyping-by-sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI-MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference-free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000-11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking. © 2014 John Wiley & Sons Ltd.

  18. Reference quality assembly of the 3.5 Gb genome of Capsicum annuum from a single linked-read library

    USDA-ARS's Scientific Manuscript database

    Linked-Read sequencing technology has recently been employed successfully for de novo assembly of multiple human genomes, however the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5 gigabase (Gb) diploid pepper (Cap...

  19. Plans and progress for building a Great Lakes fauna DNA ...

    EPA Pesticide Factsheets

    DNA reference libraries provide researchers with an important tool for assessing regional biodiversity by allowing unknown genetic sequences to be assigned identities, while also providing a means for taxonomists to validate identifications. Expanding the representation of Great Lakes species in such reference libraries is an explicit component of research at EPA’s Mid-Continent Ecology Division. Our DNA reference library building efforts began in 2012 with the goal of providing barcodes for at least 5 specimens of each native and nonindigenous fish and aquatic invertebrate species currently present in the Great Lakes. The approach is to pull taxonomically validated specimens for sequencing from EPA-led sampling efforts of adult/juvenile fish, larval fish, benthic macroinvertebrates, and zooplankton, while also soliciting aid from state and federal agencies for tissue from “shopping list” organisms. The barcodes we generate are made available through the publicly accessible BOLD (Barcode of Life) database, and help inform a planned Great Lakes biodiversity inventory. To date, our submissions to BOLD are limited to fishes; of the 88 fish species listed as being present within Lake Superior, roughly half were successfully barcoded, while only 22 species met the desired quota of 5 barcoded specimens per species. As we continue to generate genomic information from our collections and the taxonomic representations become more complete, we will continue to

  20. Characterization of skin ulceration syndrome associated microRNAs in sea cucumber Apostichopus japonicus by deep sequencing.

    PubMed

    Li, Chenghua; Feng, Weida; Qiu, Lihua; Xia, Changge; Su, Xiurong; Jin, Chunhua; Zhou, Tingting; Zeng, Yuan; Li, Taiwu

    2012-08-01

    MicroRNAs (miRNAs) constitute a family of small RNA species that have been shown to be key effectors in mediating host-pathogen interactions. In this study, two haemocyte miRNA libraries, from healthy (L1) and skin ulceration syndrome-affected (L2) Apostichopus japonicus, were constructed and deep sequenced on the Illumina HiSeq 2000. The high-throughput Solexa sequencing yielded 9,579,038 and 7,742,558 clean reads from L1 and L2, respectively. Sequence analysis revealed 40 conserved miRNAs in both libraries, among which let-7 and mir-125 were speculated to be clustered together and expressed accordingly. Eighty-six miRNA candidates were also identified by reference genome search and stem-loop structure prediction. Importantly, mir-31 and mir-2008 displayed significant differential expression between the two libraries according to the FPKM model, and might be considered promising targets for elucidating the intrinsic mechanism of skin ulceration syndrome outbreaks in the species. Copyright © 2012 Elsevier Ltd. All rights reserved.

  1. De novo Transcriptome Assembly of Common Wild Rice (Oryza rufipogon Griff.) and Discovery of Drought-Response Genes in Root Tissue Based on Transcriptomic Data.

    PubMed

    Tian, Xin-Jie; Long, Yan; Wang, Jiao; Zhang, Jing-Wen; Wang, Yan-Yan; Li, Wei-Min; Peng, Yu-Fa; Yuan, Qian-Hua; Pei, Xin-Wu

    2015-01-01

    The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants.

  2. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    PubMed

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.

  3. Signatures from Tissue-specific MPSS Libraries Identify Transcripts Preferentially Expressed in the Mouse Inner Ear

    PubMed Central

    Peters, Linda M.; Belyantseva, Inna A.; Lagziel, Ayala; Battey, James F.; Friedman, Thomas B.; Morell, Robert J.

    2007-01-01

    Specialization in cell function and morphology is influenced by the differential expression of mRNAs, many of which are expressed at low abundance and restricted to certain cell types. Detecting such transcripts in cDNA libraries may require sequencing millions of clones. Massively parallel signature sequencing (MPSS) is well-suited for identifying transcripts that are expressed in discrete cell types and in low abundance. We have made MPSS libraries from microdissections of three inner ear tissues. By comparing these MPSS libraries to those of 87 other tissues included in the Mouse Reference Transcriptome (MRT) online resource, we have identified genes that are highly enriched in, or specific to, the inner ear. We show by RT-PCR and in situ hybridization that signatures unique to the inner ear libraries identify transcripts with highly specific cell-type localizations. These transcripts serve to illustrate the utility of a resource that is available to the research community. Utilization of these resources will increase the number of known transcription units and expand our knowledge of the tissue-specific regulation of the transcriptome. PMID:17049805

  4. Reference quality assembly of the 3.5-Gb genome of Capsicum annuum from a single linked-read library.

    PubMed

    Hulse-Kemp, Amanda M; Maheshwari, Shamoni; Stoffel, Kevin; Hill, Theresa A; Jaffe, David; Williams, Stephen R; Weisenfeld, Neil; Ramakrishnan, Srividya; Kumar, Vijay; Shah, Preyas; Schatz, Michael C; Church, Deanna M; Van Deynze, Allen

    2018-01-01

    Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes, however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Plant genomes, including pepper, are characterized by long, highly similar repetitive sequences. Accordingly, significant effort is used to ensure that the sequenced plant is highly homozygous and the resulting assembly is a haploid consensus. With a phased assembly approach, we targeted a heterozygous F1 derived from a wide cross to assess the ability to derive both haplotypes and characterize a pungency gene with a large insertion/deletion. The Supernova software generated a highly ordered, more contiguous sequence assembly than all currently available C. annuum reference genomes. Over 83% of the final assembly was anchored and oriented using four publicly available de novo linkage maps. A comparison of the annotation of conserved eukaryotic genes indicated the completeness of assembly. The validity of the phased assembly is further demonstrated with the complete recovery of both 2.5-Kb insertion/deletion haplotypes of the PUN1 locus in the F1 sample that represents pungent and nonpungent peppers, as well as nearly full recovery of the BUSCO2 gene set within each of the two haplotypes. The most contiguous pepper genome assembly to date has been generated which demonstrates that Linked-Read library technology provides a tool to de novo assemble complex highly repetitive heterozygous plant genomes. This technology can provide an opportunity to cost-effectively develop high-quality genome assemblies for other complex plants and compare structural and gene differences through accurate haplotype reconstruction.

  5. Towards a comprehensive barcode library for arctic life - Ephemeroptera, Plecoptera, and Trichoptera of Churchill, Manitoba, Canada

    PubMed Central

    2009-01-01

    Background: This study reports progress in assembling a DNA barcode reference library for Ephemeroptera, Plecoptera, and Trichoptera ("EPTs") from a Canadian subarctic site, which is the focus of a comprehensive biodiversity inventory using DNA barcoding. These three groups of aquatic insects exhibit a moderate level of species diversity, making them ideal for testing the feasibility of DNA barcoding for routine biotic surveys. We explore the correlation between the morphological species delineations, DNA barcode-based haplotype clusters delimited by a sequence threshold (2%), and a threshold-free approach to biodiversity quantification, phylogenetic diversity. Results: A DNA barcode reference library is built for 112 EPT species for the focal region, consisting of 2277 COI sequences. Close correspondence was found between EPT morphospecies and haplotype clusters as designated using a standard threshold value. Similarly, the shapes of taxon accumulation curves based upon haplotype clusters were very similar to those generated using phylogenetic diversity accumulation curves, but were much more computationally efficient. Conclusion: The results of this study will facilitate other lines of research on northern EPTs and also bode well for rapidly conducting initial biodiversity assessments in unknown EPT faunas. PMID:20003245

  6. Evaluation and Design of Genome-Wide CRISPR/SpCas9 Knockout Screens

    PubMed Central

    Hart, Traver; Tong, Amy Hin Yan; Chan, Katie; Van Leeuwen, Jolanda; Seetharaman, Ashwin; Aregger, Michael; Chandrashekhar, Megha; Hustedt, Nicole; Seth, Sahil; Noonan, Avery; Habsid, Andrea; Sizova, Olga; Nedyalkova, Lyudmila; Climie, Ryan; Tworzyanski, Leanne; Lawson, Keith; Sartori, Maria Augusta; Alibeh, Sabriyeh; Tieu, David; Masud, Sanna; Mero, Patricia; Weiss, Alexander; Brown, Kevin R.; Usaj, Matej; Billmann, Maximilian; Rahman, Mahfuzur; Costanzo, Michael; Myers, Chad L.; Andrews, Brenda J.; Boone, Charles; Durocher, Daniel; Moffat, Jason

    2017-01-01

    The adaptation of CRISPR/SpCas9 technology to mammalian cell lines is transforming the study of human functional genomics. Pooled libraries of CRISPR guide RNAs (gRNAs) targeting human protein-coding genes and encoded in viral vectors have been used to systematically create gene knockouts in a variety of human cancer and immortalized cell lines, in an effort to identify whether these knockouts cause cellular fitness defects. Previous work has shown that CRISPR screens are more sensitive and specific than pooled-library shRNA screens in similar assays, but currently there exists significant variability across CRISPR library designs and experimental protocols. In this study, we reanalyze 17 genome-scale knockout screens in human cell lines from three research groups, using three different genome-scale gRNA libraries. Using the Bayesian Analysis of Gene Essentiality algorithm to identify essential genes, we refine and expand our previously defined set of human core essential genes from 360 to 684 genes. We use this expanded set of reference core essential genes, CEG2, plus empirical data from six CRISPR knockout screens to guide the design of a sequence-optimized gRNA library, the Toronto KnockOut version 3.0 (TKOv3) library. We then demonstrate the high effectiveness of the library relative to reference sets of essential and nonessential genes, as well as other screens using similar approaches. The optimized TKOv3 library, combined with the CEG2 reference set, provides an efficient, highly optimized platform for performing and assessing gene knockout screens in human cell lines. PMID:28655737

  7. Pairwise Sequence Alignment Library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeff Daily, PNNL

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: reference implementations of all known vectorized sequence alignment approaches; implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms; implementations across all modern CPU instruction sets including AVX2 and KNC; and language interfaces for C/C++ and Python.
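
    As a rough illustration of the data dependencies the abstract refers to, the following is a minimal scalar sketch of Smith Waterman local alignment scoring. It is not parasail's vectorized code or its API; the function name and the scoring parameters (match, mismatch, linear gap penalty) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty).

    Scoring values are illustrative assumptions, not parasail defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Each cell depends on its upper-left, upper, and left neighbours.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

    The dependence of each cell on its left neighbour in the same row is what makes straightforward vectorization along a row difficult, which is the kind of dependency that striped and scan-based layouts are designed to work around.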

  8. The Hemiptera (Insecta) of Canada: Constructing a Reference Library of DNA Barcodes

    PubMed Central

    Gwiazdowski, Rodger A.; Foottit, Robert G.; Maw, H. Eric L.; Hebert, Paul D. N.

    2015-01-01

    DNA barcode reference libraries linked to voucher specimens create new opportunities for high-throughput identification and taxonomic re-evaluations. This study provides a DNA barcode library for about 45% of the recognized species of Canadian Hemiptera, and the publicly available R workflow used for its generation. The current library is based on the analysis of 20,851 specimens including 1849 species belonging to 628 genera and 64 families. These individuals were assigned to 1867 Barcode Index Numbers (BINs), sequence clusters that often coincide with species recognized through prior taxonomy. Museum collections were a key source for identified specimens, but we also employed high-throughput collection methods that generated large numbers of unidentified specimens. Many of these specimens represented novel BINs that were subsequently identified by taxonomists, adding barcode coverage for additional species. Our analyses based on both approaches include 94 species not listed in the most recent Canadian checklist, representing a potential 3% increase in the fauna. We discuss the development of our workflow in the context of prior DNA barcode library construction projects, emphasizing the importance of delineating a set of reference specimens to aid investigations in cases of nomenclatural and DNA barcode discordance. The identification for each specimen in the reference set can be annotated on the Barcode of Life Data System (BOLD), allowing experts to highlight questionable identifications; annotations can be added by any registered user of BOLD, and instructions for this are provided. PMID:25923328

  9. Improving draft genome contiguity with reference-derived in silico mate-pair libraries.

    PubMed

    Grau, José Horacio; Hackl, Thomas; Koepfli, Klaus-Peter; Hofreiter, Michael

    2018-05-01

    Contiguous genome assemblies are a highly valued biological resource because of the higher number of completely annotated genes and genomic elements that are usable compared to fragmented draft genomes. Nonetheless, contiguity is difficult to obtain if only low coverage data and/or only distantly related reference genome assemblies are available. In order to improve genome contiguity, we have developed Cross-Species Scaffolding, a new pipeline that imports long-range distance information directly into the de novo assembly process by constructing mate-pair libraries in silico. We show how genome assembly metrics and gene prediction dramatically improve with our pipeline by assembling two primate genomes solely based on ∼30x coverage of shotgun sequencing data.

  10. Epsilon-Q: An Automated Analyzer Interface for Mass Spectral Library Search and Label-Free Protein Quantification.

    PubMed

    Cho, Jin-Young; Lee, Hyoung-Joo; Jeong, Seul-Ki; Paik, Young-Ki

    2017-12-01

    Mass spectrometry (MS) is a widely used proteome analysis tool for biomedical science. In an MS-based bottom-up proteomic approach to protein identification, sequence database (DB) searching has been routinely used because of its simplicity and convenience. However, searching a sequence DB with multiple variable modification options can increase processing time and false-positive errors in large and complicated MS data sets. Spectral library searching is an alternative solution, avoiding the limitations of sequence DB searching and allowing the detection of more peptides with high sensitivity. Unfortunately, this technique has less proteome coverage, resulting in limitations in the detection of novel and whole peptide sequences in biological samples. To solve these problems, we previously developed the "Combo-Spec Search" method, which manually applies multiple reference and simulated spectral library searches to analyze whole proteomes in a biological sample. In this study, we have developed a new analytical interface tool called "Epsilon-Q" to enhance the functions of both the Combo-Spec Search method and label-free protein quantification. Epsilon-Q automatically performs multiple spectral library searches, class-specific false-discovery rate control, and result integration. It has a user-friendly graphical interface and demonstrates good performance in identifying and quantifying proteins by supporting standard MS data formats and spectrum-to-spectrum matching powered by SpectraST. Furthermore, when the Epsilon-Q interface is combined with the Combo-Spec Search method, a combination called the Epsilon-Q system, it shows a synergistic function by outperforming other sequence DB search engines for identifying and quantifying low-abundance proteins in biological samples. The Epsilon-Q system can be a versatile tool for comparative proteome analysis based on multiple spectral libraries and label-free quantification.

  11. DNA Barcoding for Species Identification of Insect Skins: A Test on Chironomidae (Diptera) Pupal Exuviae

    PubMed Central

    Ekrem, Torbjørn; Stur, Elisabeth

    2017-01-01

    Chironomidae (Diptera) pupal exuviae samples are commonly used for biological monitoring of aquatic habitats. DNA barcoding has proved useful for species identification of chironomid life stages containing cellular tissue, but the barcoding success of chironomid pupal exuviae is unknown. We assessed whether standard DNA barcoding could be efficiently used for species identification of chironomid pupal exuviae when compared with morphological techniques and if there were differences in performance between temperate and tropical ecosystems, subfamilies, and tribes. PCR, sequence, and identification success differed significantly between geographic regions and taxonomic groups. For Norway, 27 out of 190 (14.2%) pupal exuviae resulted in high-quality chironomid sequences that match species. For Costa Rica, 69 out of 190 (36.3%) pupal exuviae resulted in high-quality sequences, but none matched known species. Standard DNA barcoding of chironomid pupal exuviae had limited success in species identification of unknown specimens due to contaminations and lack of matching references in available barcode libraries, especially from Costa Rica. Therefore, we recommend that future biodiversity studies focusing their efforts on understudied regions simultaneously use morphological and molecular identification techniques to identify all life stages of chironomids and populate the barcode reference library with identified sequences.

  12. HMMER web server: 2018 update.

    PubMed

    Potter, Simon C; Luciani, Aurélien; Eddy, Sean R; Park, Youngmi; Lopez, Rodrigo; Finn, Robert D

    2018-06-14

    The HMMER webserver [http://www.ebi.ac.uk/Tools/hmmer] is a free-to-use service which provides fast searches against widely used sequence databases and profile hidden Markov model (HMM) libraries using the HMMER software suite (http://hmmer.org). The results of a sequence search may be summarized in a number of ways, allowing users to view and filter the significant hits by domain architecture or taxonomy. For large scale usage, we provide an application programmatic interface (API) which has been expanded in scope, such that all result presentations are available via both HTML and API. Furthermore, we have refactored our JavaScript visualization library to provide standalone components for different result representations. These consume the aforementioned API and can be integrated into third-party websites. The range of databases that can be searched against has been expanded, adding four sequence datasets (12 in total) and one profile HMM library (6 in total). To help users explore the biological context of their results, and to discover new data resources, search results are now supplemented with cross references to other EMBL-EBI databases.

  13. A comparative study of ChIP-seq sequencing library preparation methods.

    PubMed

    Sundaram, Arvind Y M; Hughes, Timothy; Biondi, Shea; Bolduc, Nathalie; Bowman, Sarah K; Camilli, Andrew; Chew, Yap C; Couture, Catherine; Farmer, Andrew; Jerome, John P; Lazinski, David W; McUsic, Andrew; Peng, Xu; Shazand, Kamran; Xu, Feng; Lyle, Robert; Gilfillan, Gregor D

    2016-10-21

    ChIP-seq is the primary technique used to investigate genome-wide protein-DNA interactions. As part of this procedure, immunoprecipitated DNA must undergo "library preparation" to enable subsequent high-throughput sequencing. To facilitate the analysis of biopsy samples and rare cell populations, there has been a recent proliferation of methods allowing sequencing library preparation from low-input DNA amounts. However, little information exists on the relative merits, performance, comparability and biases inherent to these procedures. Notably, recently developed single-cell ChIP procedures employing microfluidics must also employ library preparation reagents to allow downstream sequencing. In this study, seven methods designed for low-input DNA/ChIP-seq sample preparation (Accel-NGS® 2S, Bowman-method, HTML-PCR, SeqPlex™, DNA SMART™, TELP and ThruPLEX®) were performed on five replicates of 1 ng and 0.1 ng input H3K4me3 ChIP material, and compared to a "gold standard" reference PCR-free dataset. The performance of each method was examined for the prevalence of unmappable reads, amplification-derived duplicate reads, reproducibility, and for the sensitivity and specificity of peak calling. We identified consistent high performance in a subset of the tested reagents, which should aid researchers in choosing the most appropriate reagents for their studies. Furthermore, we expect this work to drive future advances by identifying and encouraging use of the most promising methods and reagents. The results may also aid judgements on how comparable are existing datasets that have been prepared with different sample library preparation reagents.

  14. Palindromic Sequence Artifacts Generated during Next Generation Sequencing Library Preparation from Historic and Ancient DNA

    PubMed Central

    Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel

    2014-01-01

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties: the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single-stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104

  15. 16S rRNA partial gene sequencing for the differentiation and molecular subtyping of Listeria species.

    PubMed

    Hellberg, Rosalee S; Martin, Keely G; Keys, Ashley L; Haney, Christopher J; Shen, Yuelian; Smiley, R Derike

    2013-12-01

    Use of 16S rRNA partial gene sequencing within the regulatory workflow could greatly reduce the time and labor needed for confirmation and subtyping of Listeria monocytogenes. The goal of this study was to build a 16S rRNA partial gene reference library for Listeria spp. and investigate the potential for 16S rRNA molecular subtyping. A total of 86 isolates of Listeria representing L. innocua, L. seeligeri, L. welshimeri, and L. monocytogenes were obtained for use in building the custom library. Seven non-Listeria species and three additional strains of Listeria were obtained for use in exclusivity and food spiking tests. Isolates were sequenced for the partial 16S rRNA gene using the MicroSeq ID 500 Bacterial Identification Kit (Applied Biosystems). High-quality sequences were obtained for 84 of the custom library isolates and 23 unique 16S sequence types were discovered for use in molecular subtyping. All of the exclusivity strains were negative for Listeria and the three Listeria strains used in food spiking were consistently recovered and correctly identified at the species level. The spiking results also allowed for differentiation beyond the species level, as 87% of replicates for one strain and 100% of replicates for the other two strains consistently matched the same 16S type. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. A simple method for semi-random DNA amplicon fragmentation using the methylation-dependent restriction enzyme MspJI.

    PubMed

    Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W

    2015-04-11

    Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments could be controlled by optimising the 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
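
    To make the CNNR recognition motif concrete, here is a small sketch that lists candidate motif positions on one strand of a sequence. The function name is hypothetical, and the sketch deliberately ignores methylation state and the enzyme's actual cleavage offsets, which the abstract does not specify.

```python
def cnnr_sites(seq):
    """Return 0-based start positions of every CNNR motif (R = A or G).

    Only the given strand is scanned; methylation of the leading C and
    the position of the cut relative to the motif are not modelled.
    """
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - 3)
        if seq[i] == "C" and seq[i + 3] in ("A", "G")
    ]

# Example: three overlapping or adjacent candidate sites in a short sequence.
sites = cnnr_sites("ACGTACCAGG")
```

    Under the simplifying assumption of uniform base composition, a given position starts a CNNR motif with probability 1/4 x 1/2 = 1/8, which is consistent with the abstract's description of the fragmentation as relatively sequence-independent.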

  17. Clone DB: an integrated NCBI resource for clone-associated data

    PubMed Central

    Schneider, Valerie A.; Chen, Hsiu-Chuan; Clausen, Cliff; Meric, Peter A.; Zhou, Zhigang; Bouk, Nathan; Husain, Nora; Maglott, Donna R.; Church, Deanna M.

    2013-01-01

    The National Center for Biotechnology Information (NCBI) Clone DB (http://www.ncbi.nlm.nih.gov/clone/) is an integrated resource providing information about and facilitating access to clones, which serve as valuable research reagents in many fields, including genome sequencing and variation analysis. Clone DB represents an expansion and replacement of the former NCBI Clone Registry and has records for genomic and cell-based libraries and clones representing more than 100 different eukaryotic taxa. Records provide details of library construction, associated sequences, map positions and information about resource distribution. Clone DB is indexed in the NCBI Entrez system and can be queried by fields that include organism, clone name, gene name and sequence identifier. Whenever possible, genomic clones are mapped to reference assemblies and their map positions provided in clone records. Clones mapping to specific genomic regions can also be searched for using the NCBI Clone Finder tool, which accepts queries based on sequence coordinates or features such as gene or transcript names. Clone DB makes reports of library, clone and placement data on its FTP site available for download. With Clone DB, users now have available to them a centralized resource that provides them with the tools they will need to make use of these important research reagents. PMID:23193260

  18. Droplet Digital™ PCR Next-Generation Sequencing Library QC Assay.

    PubMed

    Heredia, Nicholas J

    2018-01-01

    Digital PCR is a valuable tool to quantify next-generation sequencing (NGS) libraries precisely and accurately. Accurately quantifying NGS libraries enables accurate loading of the libraries onto the sequencer and thus improves sequencing performance by reducing under- and overloading errors. Accurate quantification also benefits users by enabling uniform loading of indexed/barcoded libraries, which in turn greatly improves sequencing uniformity of the indexed/barcoded samples. The advantages gained by employing the Droplet Digital PCR (ddPCR™) library QC assay include precise and accurate quantification in addition to size quality assessment, enabling users to QC their sequencing libraries with confidence.

  19. Chat reference service in medical libraries: part 2--Trends in medical school libraries.

    PubMed

    Dee, Cheryl R

    2003-01-01

    An increasing number of medical school libraries offer chat service to provide immediate, high quality information at the time and point of need to students, faculty, staff, and health care professionals. Part 2 of Chat Reference Service in Medical Libraries presents a snapshot of the current trends in chat reference service in medical school libraries. In late 2002, 25 (21%) medical school libraries provided chat reference. Trends in chat reference services in medical school libraries were compiled from an exploration of medical school library Web sites and informal correspondence from medical school library personnel. Many medical libraries are actively investigating and planning new chat reference services, while others have decided not to pursue chat reference at this time. Anecdotal comments from medical school library staff provide insights into chat reference service.

  20. A high-speed on-chip pseudo-random binary sequence generator for multi-tone phase calibration

    NASA Astrophysics Data System (ADS)

    Gommé, Liesbeth; Vandersteen, Gerd; Rolain, Yves

    2011-07-01

    An on-chip reference generator is conceived by adopting the technique of decimating a pseudo-random binary sequence (PRBS) signal in parallel sequences. This is of great benefit when high-speed generation of PRBS and PRBS-derived signals is the objective. The design is implemented in standard CMOS logic, available in commercial libraries, which provides the logic functions for the generator. The design allows the user to select the periodicity of the PRBS and the PRBS-derived signals. The characterization of the on-chip generator marks its performance and reveals promising specifications.
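
    The decimation idea can be illustrated in software. The sketch below is only an illustration and not the authors' circuit: it assumes a 7-stage PRBS (polynomial x^7 + x^6 + 1, a common choice that the abstract does not specify) and hypothetical helper names `prbs7` and `decimate`. It shows that a full-rate PRBS can be split into two half-rate streams which, when interleaved again, reproduce the original sequence.

```python
def prbs7(n_bits, state=0x02):
    """Generate n_bits of a PRBS7 sequence with a Fibonacci LFSR.

    Assumption: polynomial x^7 + x^6 + 1; the abstract names no polynomial.
    """
    if state & 0x7F == 0:
        raise ValueError("LFSR state must be non-zero")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two taps
        state = ((state << 1) | new_bit) & 0x7F      # shift left, keep 7 bits
        bits.append(new_bit)
    return bits


def decimate(bits, factor, phase=0):
    """Keep every `factor`-th bit, starting at offset `phase`."""
    return bits[phase::factor]


PERIOD = 2**7 - 1                  # 127 bits for a 7-stage maximal-length LFSR
full_rate = prbs7(2 * PERIOD)      # two periods of the full-rate sequence
even = decimate(full_rate, 2, 0)   # first half-rate stream
odd = decimate(full_rate, 2, 1)    # second half-rate stream

# Interleaving the two half-rate streams restores the full-rate sequence.
interleaved = [bit for pair in zip(even, odd) for bit in pair]
```

    In hardware the appeal is presumably that each parallel stream runs at a fraction of the output rate and only the final multiplexer has to run at full speed; the sketch above only checks the sequence bookkeeping.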

  1. Analysis of a diverse assemblage of diazotrophic bacteria from Spartina alterniflora using DGGE and clone library screening.

    PubMed

    Lovell, Charles R; Decker, Peter V; Bagwell, Christopher E; Thompson, Shelly; Matsui, George Y

    2008-05-01

    Methods to assess the diversity of the diazotroph assemblage in the rhizosphere of the salt marsh cordgrass, Spartina alterniflora were examined. The effectiveness of nifH PCR-denaturing gradient gel electrophoresis (DGGE) was compared to that of nifH clone library analysis. Seventeen DGGE gel bands were sequenced and yielded 58 nonidentical nifH sequences from a total of 67 sequences determined. A clone library constructed using the GC-clamp nifH primers that were employed in the PCR-DGGE (designated the GC-Library) yielded 83 nonidentical sequences from a total of 257 nifH sequences. A second library constructed using an alternate set of nifH primers (N-Library) yielded 83 nonidentical sequences from a total of 138 nifH sequences. Rarefaction curves for the libraries did not reach saturation, although the GC-Library curve was substantially dampened and appeared to be closer to saturation than the N-Library curve. Phylogenetic analyses showed that DGGE gel band sequencing recovered nifH sequences that were frequently sampled in the GC-Library, as well as sequences that were infrequently sampled, and provided a species composition assessment that was robust, efficient, and relatively inexpensive to obtain. Further, the DGGE method permits a large number of samples to be examined for differences in banding patterns, after which bands of interest can be sampled for sequence determination.

  2. Creation of reference DNA barcode library and authentication of medicinal plant raw drugs used in Ayurvedic medicine.

    PubMed

    Vassou, Sophie Lorraine; Nithaniyal, Stalin; Raju, Balaji; Parani, Madasamy

    2016-07-18

    Ayurveda is a system of traditional medicine that originated in ancient India, and it is still in practice. Medicinal plants are the backbone of Ayurveda, which heavily relies on the plant-derived therapeutics. While Ayurveda is becoming more popular in several countries throughout the World, lack of authenticated medicinal plant raw drugs is a growing concern. Our aim was to DNA barcode the medicinal plants that are listed in the Ayurvedic Pharmacopoeia of India (API) to create a reference DNA barcode library, and to use the same to authenticate the raw drugs that are sold in markets. We have DNA barcoded 347 medicinal plants using rbcL marker, and curated rbcL DNA barcodes for 27 medicinal plants from public databases. These sequences were used to create Ayurvedic Pharmacopoeia of India - Reference DNA Barcode Library (API-RDBL). This library was used to authenticate 100 medicinal plant raw drugs, which were in the form of powders (82) and seeds (18). Ayurvedic Pharmacopoeia of India - Reference DNA Barcode Library (API-RDBL) was created with high quality and authentic rbcL barcodes for 374 out of the 395 medicinal plants that are included in the API. The rbcL DNA barcode differentiated 319 species (85 %) with the pairwise divergence ranging between 0.2 and 29.9 %. PCR amplification and DNA sequencing success rate of rbcL marker was 100 % even for the poorly preserved medicinal plant raw drugs that were collected from local markets. DNA barcoding revealed that only 79 % raw drugs were authentic, and the remaining 21 % samples were adulterated. Further, adulteration was found to be much higher with powders (ca. 25 %) when compared to seeds (ca. 5 %). The present study demonstrated the utility of DNA barcoding in authenticating medicinal plant raw drugs, and found that approximately one fifth of the market samples were adulterated. 
    Powdered raw drugs, which are very difficult for taxonomists and laypeople alike to identify, seem to be an easy target for adulteration. Developing a quality control protocol for medicinal plant raw drugs that incorporates DNA barcoding as a component is essential to ensure safety for consumers.

  3. Development of a dedicated peptide tandem mass spectral library for conservation science.

    PubMed

    Fremout, Wim; Dhaenens, Maarten; Saverwyns, Steven; Sanyova, Jana; Vandenabeele, Peter; Deforce, Dieter; Moens, Luc

    2012-05-30

    In recent years, the use of liquid chromatography tandem mass spectrometry (LC-MS/MS) on tryptic digests of cultural heritage objects has attracted much attention. It allows for unambiguous identification of peptides and proteins, and even in complex mixtures species-specific identification becomes feasible with minimal sample consumption. Determination of the peptides is commonly based on theoretical cleavage of known protein sequences and on comparison of the expected peptide fragments with those found in the MS/MS spectra. In this approach, complex computer programs, such as Mascot, perform well identifying known proteins, but fail when protein sequences are unknown or incomplete. Often, when trying to distinguish evolutionarily well preserved collagens of different species, Mascot lacks the required specificity. Complementary and often more accurate information on the proteins can be obtained using a reference library of MS/MS spectra of species-specific peptides. Therefore, a library dedicated to various sources of proteins in works of art was set up, with an initial focus on collagen rich materials. This paper discusses the construction and the advantages of this spectral library for conservation science, and its application on a number of samples from historical works of art. Copyright © 2012 Elsevier B.V. All rights reserved.

  4. Classifying the bacterial gut microbiota of termites and cockroaches: A curated phylogenetic reference database (DictDb).

    PubMed

    Mikaelyan, Aram; Köhler, Tim; Lampert, Niclas; Rohland, Jeffrey; Boga, Hamadi; Meuser, Katja; Brune, Andreas

    2015-10-01

    Recent developments in sequencing technology have given rise to a large number of studies that assess bacterial diversity and community structure in termite and cockroach guts based on large amplicon libraries of 16S rRNA genes. Although these studies have revealed important ecological and evolutionary patterns in the gut microbiota, classification of the short sequence reads is limited by the taxonomic depth and resolution of the reference databases used in the respective studies. Here, we present a curated reference database for accurate taxonomic analysis of the bacterial gut microbiota of dictyopteran insects. The Dictyopteran gut microbiota reference Database (DictDb) is based on the Silva database but was significantly expanded by the addition of clones from 11 mostly unexplored termite and cockroach groups, which increased the inventory of bacterial sequences from dictyopteran guts by 26%. The taxonomic depth and resolution of DictDb was significantly improved by a general revision of the taxonomic guide tree for all important lineages, including a detailed phylogenetic analysis of the Treponema and Alistipes complexes, the Fibrobacteres, and the TG3 phylum. The performance of this first documented version of DictDb (v. 3.0) using the revised taxonomic guide tree in the classification of short-read libraries obtained from termites and cockroaches was highly superior to that of the current Silva and RDP databases. DictDb uses an informative nomenclature that is consistent with the literature also for clades of uncultured bacteria and provides an invaluable tool for anyone exploring the gut community structure of termites and cockroaches. Copyright © 2015 Elsevier GmbH. All rights reserved.

  5. Selection of cholera toxin specific IgNAR single-domain antibodies from a naïve shark library.

    PubMed

    Liu, Jinny L; Anderson, George P; Delehanty, James B; Baumann, Richard; Hayhurst, Andrew; Goldman, Ellen R

    2007-03-01

    Shark immunoglobulin new antigen receptor (IgNAR, also referred to as NAR) variable domains (Vs) are single-domain antibody (sdAb) fragments containing only two hypervariable loop structures forming 3D topologies for a wide range of antigen recognition and binding. Their small size (approximately 12 kDa) and high solubility, thermostability and binding specificity make IgNARs an exceptional alternative source of engineered antibodies for sensor applications. Here, two new shark NAR V display libraries containing >10^7 unique clones from non-immunized (naïve) adult spiny dogfish (Squalus acanthias) and smooth dogfish (Mustelus canis) sharks were constructed. The most conserved consensus sequences derived from random clone sequences were compared with published nurse shark (Ginglymostoma cirratum) sequences. Cholera toxin (CT) was chosen for panning one of the naïve display libraries due to its severe pathogenicity and commercial availability. Three very similar CT binders were selected, and purified soluble monomeric anti-CT sdAbs were characterized using Luminex 100 and traditional ELISA assays. These novel anti-CT sdAbs selected from our newly constructed shark NAR V sdAb library specifically bound to soluble antigen, without cross-reacting with other irrelevant antigens. They also showed superior heat stability, exhibiting slow loss of activity over the course of one hour at high temperature (95 °C), while conventional antibodies lost all activity in the first 5-10 min. The successful isolation of target-specific sdAbs from one of our non-biased NAR libraries demonstrates their ability to provide binders against an unacquainted antigen of interest.

  6. PuLSE: Quality control and quantification of peptide sequences explored by phage display libraries.

    PubMed

    Shave, Steven; Mann, Stefan; Koszela, Joanna; Kerr, Alastair; Auer, Manfred

    2018-01-01

    The design of highly diverse phage display libraries is based on the assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and expected copy numbers of unique sequences decrease, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing randomness and therefore diversity of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences from base to residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage both at the DNA base and translated protein residue level, which has been missing from toolsets and literature. The open source program PuLSE is available in two formats: a C++ source code package for compilation and integration into existing bioinformatics pipelines, and precompiled binaries for ease of use.
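
    The kind of profiling described here can be sketched in a few lines. The following is not PuLSE itself (which is a C++ program); it is a minimal illustration with hypothetical function names, assuming four-line FASTQ records, the standard genetic code, and fully random (NNN) codons for the expected residue fractions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def read_fastq_sequences(text):
    """Return the sequence line of each record (assumes 4 lines per record)."""
    lines = text.strip().splitlines()
    return [lines[i].strip().upper() for i in range(1, len(lines), 4)]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def expected_residue_fraction():
    """Expected residue frequencies if all 64 codons were equally likely."""
    codons_per_residue = Counter(CODON_TABLE.values())
    return {aa: n / 64 for aa, n in codons_per_residue.items()}


# Made-up reads for illustration only.
EXAMPLE_FASTQ = """@read1
ATGGCTTGG
+
IIIIIIIII
@read2
ATGGCTTGG
+
IIIIIIIII
@read3
ATGGCATGG
+
IIIIIIIII
"""

reads = read_fastq_sequences(EXAMPLE_FASTQ)
dna_counts = Counter(reads)                             # unique DNA sequences
peptide_counts = Counter(translate(r) for r in reads)   # unique peptides
expected = expected_residue_fraction()
```

    Comparing observed residue frequencies against `expected` is one simple way to spot codon bias; a real library built with a restricted codon scheme (for example NNK) would need a correspondingly restricted expectation.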

  7. A DNA Barcode Library for North American Pyraustinae (Lepidoptera: Pyraloidea: Crambidae).

    PubMed

    Yang, Zhaofu; Landry, Jean-François; Hebert, Paul D N

    2016-01-01

    Although members of the crambid subfamily Pyraustinae are frequently important crop pests, their identification is often difficult because many species lack conspicuous diagnostic morphological characters. DNA barcoding employs sequence diversity in a short standardized gene region to facilitate specimen identifications and species discovery. This study provides a DNA barcode reference library for North American pyraustines based upon the analysis of 1589 sequences recovered from 137 nominal species, 87% of the fauna. Data from 125 species were barcode compliant (>500bp, <1% n), and 99 of these taxa formed a distinct cluster that was assigned to a single BIN. The other 26 species were assigned to 56 BINs, reflecting frequent cases of deep intraspecific sequence divergence and a few instances of barcode sharing, creating a total of 155 BINs. Two systems for OTU designation, ABGD and BIN, were examined to check the correspondence between current taxonomy and sequence clusters. The BIN system performed better than ABGD in delimiting closely related species, while OTU counts with ABGD were influenced by the value employed for relative gap width. Different species with low or no interspecific divergence may represent cases of unrecognized synonymy, whereas those with high intraspecific divergence require further taxonomic scrutiny as they may involve cryptic diversity. The barcode library developed in this study will also help to advance understanding of relationships among species of Pyraustinae.
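
    The compliance thresholds quoted in the abstract (>500 bp, <1% n) can be expressed as a small check. This is a sketch only: the function name is hypothetical, the treatment of alignment gap characters is an assumption, and formal barcode compliance may involve further criteria that the abstract does not mention.

```python
def is_barcode_compliant(sequence, min_length=500, max_n_fraction=0.01):
    """Check the two thresholds quoted in the abstract: >500 bp and <1% N.

    Assumption: alignment gap characters ('-') do not count towards length.
    """
    seq = sequence.upper().replace("-", "")
    if len(seq) <= min_length:
        return False
    n_fraction = seq.count("N") / len(seq)
    return n_fraction < max_n_fraction


# Made-up sequences for illustration only.
long_clean = "ACGT" * 150                 # 600 bp, no ambiguous bases
too_short = "ACGT" * 100                  # 400 bp
too_many_n = "A" * 594 + "N" * 6          # 600 bp, exactly 1% N
few_n = "A" * 595 + "N" * 5               # 600 bp, about 0.83% N
```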

  8. A new comprehensive method for detection of livestock-related pathogenic viruses using a target enrichment system.

    PubMed

    Oba, Mami; Tsuchiaka, Shinobu; Omatsu, Tsutomu; Katayama, Yukie; Otomaru, Konosuke; Hirata, Teppei; Aoki, Hiroshi; Murata, Yoshiteru; Makino, Shinji; Nagai, Makoto; Mizutani, Tetsuya

    2018-01-08

    We tested the usefulness of the SureSelect target enrichment system, a comprehensive viral nucleic acid detection method, for rapid identification of viral pathogens in feces samples of cattle, pigs and goats. This system enriches nucleic acids of target viruses in clinical/field samples by using a library of biotinylated RNAs with sequences complementary to the target viruses. The enriched nucleic acids are amplified by PCR and subjected to next-generation sequencing to identify the target viruses. In many samples, the SureSelect target enrichment method increased the efficiency of detection of the viruses listed in the biotinylated RNA library. Furthermore, this method enabled us to determine the nearly full-length genome sequence of porcine parainfluenza virus 1 and greatly increased Breadth, a value indicating the ratio of the mapping consensus length to the length of the reference genome, in pig samples. Our data showed the usefulness of the SureSelect target enrichment system for comprehensive analysis of genomic information of various viruses in field samples. Copyright © 2017 Elsevier Inc. All rights reserved.
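
    The abstract defines Breadth only briefly. One plausible reading, assumed here, is the fraction of reference positions covered by at least one mapped read. The sketch below computes that from mapped intervals; the function name and the half-open, 0-based interval convention are illustrative choices, not taken from the paper.

```python
def breadth_of_coverage(intervals, reference_length):
    """Fraction of the reference covered by at least one mapped interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip each interval to the reference boundaries.
        start, end = max(start, 0), min(end, reference_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # Close the previous merged block and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / reference_length


# Made-up mapping intervals on a 1000-base reference.
example_breadth = breadth_of_coverage([(0, 100), (50, 150), (300, 400)], 1000)
```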

  9. Reducing DNA context dependence in bacterial promoters

    PubMed Central

    Carr, Swati B.; Densmore, Douglas M.

    2017-01-01

    Variation in the DNA sequence upstream of bacterial promoters is known to affect the expression levels of the products they regulate, sometimes dramatically. While neutral synthetic insulator sequences have been found to buffer promoters from upstream DNA context, there are no established methods for designing effective insulator sequences with predictable effects on expression levels. We address this problem with Degenerate Insulation Screening (DIS), a novel method based on a randomized 36-nucleotide insulator library and a simple, high-throughput, flow-cytometry-based screen that randomly samples from a library of 4^36 potential insulated promoters. The results of this screen can then be compared against a reference uninsulated device to select a set of insulated promoters providing a precise level of expression. We verify this method by insulating the constitutive, inducible, and repressible promoters of a four transcriptional-unit inverter (NOT-gate) circuit, finding both that order dependence is largely eliminated by insulation and that circuit performance is also significantly improved, with a 5.8-fold mean improvement in on/off ratio. PMID:28422998

  10. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    PubMed Central

    Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

    2011-01-01

    Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358

  11. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

    PubMed

    Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

    2017-11-13

    A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

  12. Single haplotype assembly of the human genome from a hydatidiform mole.

    PubMed

    Steinberg, Karyn Meltz; Schneider, Valerie A; Graves-Lindsay, Tina A; Fulton, Robert S; Agarwala, Richa; Huddleston, John; Shiryev, Sergey A; Morgulis, Aleksandr; Surti, Urvashi; Warren, Wesley C; Church, Deanna M; Eichler, Evan E; Wilson, Richard K

    2014-12-01

    A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content shows this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicates misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly. © 2014 Steinberg et al.; Published by Cold Spring Harbor Laboratory Press.

  13. Single haplotype assembly of the human genome from a hydatidiform mole

    PubMed Central

    Steinberg, Karyn Meltz; Schneider, Valerie A.; Graves-Lindsay, Tina A.; Fulton, Robert S.; Agarwala, Richa; Huddleston, John; Shiryev, Sergey A.; Morgulis, Aleksandr; Surti, Urvashi; Warren, Wesley C.; Church, Deanna M.; Eichler, Evan E.; Wilson, Richard K.

    2014-01-01

    A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content shows this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicates misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly. PMID:25373144

  14. Construction of a scFv Library with Synthetic, Non-combinatorial CDR Diversity.

    PubMed

    Bai, Xuelian; Shim, Hyunbo

    2017-01-01

    Many large synthetic antibody libraries have been designed and constructed, and have successfully generated high-quality antibodies suitable for various demanding applications. While synthetic antibody libraries have many advantages, such as optimized framework sequences and a broader sequence landscape than natural antibodies, their sequence diversity is typically generated by random combinatorial synthetic processes, which cause the incorporation of many undesired CDR sequences. Here, we describe the construction of a synthetic scFv library using oligonucleotide mixtures that contain predefined, non-combinatorially synthesized CDR sequences. Each CDR is first inserted into a master scFv framework sequence and the resulting single-CDR libraries are subjected to a round of proofread panning. The proofread CDR sequences are assembled to produce the final scFv library with six diversified CDRs.

  15. Library construction for next-generation sequencing: Overviews and challenges

    PubMed Central

    Head, Steven R.; Komori, H. Kiyomi; LaMere, Sarah A.; Whisenant, Thomas; Van Nieuwerburgh, Filip; Salomon, Daniel R.; Ordoukhanian, Phillip

    2014-01-01

    High-throughput sequencing, also known as next-generation sequencing (NGS), has revolutionized genomic research. In recent years, NGS technology has steadily improved, with costs dropping and the number and range of sequencing applications increasing exponentially. Here, we examine the critical role of sequencing library quality and consider important challenges when preparing NGS libraries from DNA and RNA sources. Factors such as the quantity and physical characteristics of the RNA or DNA source material as well as the desired application (i.e., genome sequencing, targeted sequencing, RNA-seq, ChIP-seq, RIP-seq, and methylation) are addressed in the context of preparing high quality sequencing libraries. In addition, the current methods for preparing NGS libraries from single cells are also discussed. PMID:24502796

  16. Chicken microsatellite markers isolated from libraries enriched for simple tandem repeats.

    PubMed

    Gibbs, M; Dawson, D A; McCamley, C; Wardle, A F; Armour, J A; Burke, T

    1997-12-01

    The total number of microsatellite loci is considered to be at least 10-fold lower in avian species than in mammalian species. Therefore, efficient large-scale cloning of chicken microsatellites, as required for the construction of a high-resolution linkage map, is facilitated by the construction of libraries using an enrichment strategy. In this study, a plasmid library enriched for tandem repeats was constructed from chicken genomic DNA by hybridization selection. Using this technique the proportion of recombinant clones that cross-hybridized to probes containing simple tandem repeats was raised to 16%, compared with < 0.1% in a non-enriched library. Primers were designed from 121 different sequences. Polymerase chain reaction (PCR) analysis of two chicken reference pedigrees enabled 72 loci to be localized within the collaborative chicken genetic map, and at least 30 of the remaining loci have been shown to be informative in these or other crosses.

  17. Library orientation on videotape: production planning and administrative support.

    PubMed

    Shedlock, J; Tawyea, E W

    1989-01-01

    New student-faculty-staff orientation is an important public service in a medical library and demands creativity, imagination, teaching skill, coordination, and cooperation on the part of public services staff. The Northwestern University Medical Library (NUML) implemented a video production service in the spring of 1986 and used the new service to produce an orientation videotape for incoming students, new faculty, and medical center staff. Planning is an important function in video production, and the various phases of outlining topics, drafting scripts, matching video sequences, and actual taping of video, voice, and music are described. The NUML orientation videotape demonstrates how reference and audiovisual services merge talent and skills to benefit the library user. Videotape production, however, cannot happen in a vacuum of good intentions and high ideals. This paper also presents the management support and cost analysis needed to make video production services a reality for use by public service departments.

  18. libgapmis: extending short-read alignments

    PubMed Central

    2013-01-01

    Background: A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of mismatches in the alignment; however, their ability to allow for gaps varies greatly, with many performing poorly or not allowing them at all. The seed-and-extend strategy is applied in most short-read alignment programmes. After aligning a substring of the reference sequence against the high-quality prefix of a short read (the seed), an important problem is to find the best possible alignment between a succeeding substring of the reference sequence and the remaining low-quality suffix of the read (the extend step). The fact that the reads are rather short and that the gap occurrence frequency observed in various studies is rather low suggests that aligning (parts of) those reads with a single gap is in fact desirable. Results: In this article, we present libgapmis, a library for extending pairwise short-read alignments. Apart from the standard CPU version, it includes ultrafast SSE- and GPU-based implementations. libgapmis is based on an algorithm computing a modified version of the traditional dynamic-programming matrix for sequence alignment. Extensive experimental results demonstrate that the functions of the CPU version provided in this library accelerate the computations by a factor of 20 compared to other programmes. The analogous SSE- and GPU-based implementations accelerate the computations by a factor of 6 and 11, respectively, compared to the CPU version. The library also provides the user the flexibility to split the read into fragments, based on the observed gap occurrence frequency and the length of the read, thereby allowing for a variable, but bounded, number of gaps in the alignment. Conclusions: We present libgapmis, a library for extending pairwise short-read alignments. We show that libgapmis is better suited and more efficient than existing algorithms for this task. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any short-read alignment pipeline. The open-source code of libgapmis is available at http://www.exelixis-lab.org/gapmis. PMID:24564250
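
    The extend step with a single gap can be illustrated by brute force. The sketch below is not the libgapmis algorithm (which uses a modified dynamic-programming matrix) and does not use its API; it simply tries every position and length for one gap and counts mismatches. The function names, the absence of a gap penalty, and the tie-breaking rule (prefer no gap, then shorter gaps) are all assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions over the common length of a and b."""
    return sum(x != y for x, y in zip(a, b))


def best_single_gap_alignment(read, ref, max_gap=3):
    """Align `read` against the start of `ref`, allowing at most one gap.

    Returns (mismatches, gap_pos, gap_len, gap_in), where gap_in is
    "read" (the read lacks reference bases), "ref" (the read carries
    extra bases), or None for an ungapped alignment.
    """
    if len(ref) < len(read) + max_gap:
        raise ValueError("ref must be at least len(read) + max_gap long")
    best = (hamming(read, ref), None, 0, None)  # ungapped baseline
    for gap_len in range(1, max_gap + 1):
        for pos in range(1, len(read)):
            prefix_cost = hamming(read[:pos], ref[:pos])
            # Gap in the read: skip gap_len reference bases at pos.
            cost = prefix_cost + hamming(read[pos:], ref[pos + gap_len:])
            if cost < best[0]:
                best = (cost, pos, gap_len, "read")
            # Gap in the reference: skip gap_len read bases at pos,
            # keeping the gap strictly inside the read.
            if pos + gap_len < len(read):
                cost = prefix_cost + hamming(read[pos + gap_len:], ref[pos:])
                if cost < best[0]:
                    best = (cost, pos, gap_len, "ref")
    return best


# Made-up example: the read equals the reference with two bases removed.
reference = "ACGTTGCAAGGCTTAACC"
short_read = reference[:6] + reference[8:16]
result = best_single_gap_alignment(short_read, reference, max_gap=3)
```

    This brute-force search costs roughly O(max_gap × len(read)²) character comparisons, which is the kind of overhead that a dynamic-programming formulation is meant to avoid.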

  19. Exploring Writing Workshop in the K-2 Classroom: Discovering Our Voices. Bill Harp Professional Teachers Library Series.

    ERIC Educational Resources Information Center

    Rickards, Debbie; Hawes, Shirl

    This book is meant as a resource that can be used by experienced writing teachers as a practical reference when planning writing lessons. The book includes a sequence of instruction, lesson ideas, enrichment options, literature and poetry connections, student samples, and all the ready-to-use materials teachers need to implement the…

  20. The NISTmAb tryptic peptide spectral library for monoclonal antibody characterization.

    PubMed

    Dong, Qian; Liang, Yuxue; Yan, Xinjian; Markey, Sanford P; Mirokhin, Yuri A; Tchekhovskoi, Dmitrii V; Bukhari, Tallat H; Stein, Stephen E

    2018-04-01

    We describe the creation of a mass spectral library composed of all identifiable spectra derived from the tryptic digest of the NISTmAb IgG1κ. The library is a unique reference spectral collection developed from over six million peptide-spectrum matches acquired by liquid chromatography-mass spectrometry (LC-MS) over a wide range of collision energy. Conventional one-dimensional (1D) LC-MS was used for various digestion conditions and 20- and 24-fraction two-dimensional (2D) LC-MS studies permitted in-depth analyses of single digests. Computer methods were developed for automated analysis of LC-MS isotopic clusters to determine the attributes for all ions detected in the 1D and 2D studies. The library contains a selection of over 12,600 high-quality tandem spectra of more than 3,300 peptide ions identified and validated by accurate mass, differential elution pattern, and expected peptide classes in peptide map experiments. These include a variety of biologically modified peptide spectra involving glycosylated, oxidized, deamidated, glycated, and N/C-terminal modified peptides, as well as artifacts. A complete glycation profile was obtained for the NISTmAb with spectra for 58% and 100% of all possible glycation sites in the heavy and light chains, respectively. The site-specific quantification of methionine oxidation in the protein is described. The utility of this reference library is demonstrated by the analysis of a commercial monoclonal antibody (adalimumab, Humira®), where 691 peptide ion spectra are identifiable in the constant regions, accounting for 60% coverage for both heavy and light chains. The NIST reference library platform may be used as a tool for facile identification of the primary sequence and post-translational modifications, as well as the recognition of LC-MS method-induced artifacts for human and recombinant IgG antibodies. Its development also provides a general method for creating comprehensive peptide libraries of individual proteins.

  1. The NISTmAb tryptic peptide spectral library for monoclonal antibody characterization

    PubMed Central

    Dong, Qian; Liang, Yuxue; Yan, Xinjian; Markey, Sanford P.; Mirokhin, Yuri A.; Tchekhovskoi, Dmitrii V.; Bukhari, Tallat H.; Stein, Stephen E.

    2018-01-01

    ABSTRACT We describe the creation of a mass spectral library composed of all identifiable spectra derived from the tryptic digest of the NISTmAb IgG1κ. The library is a unique reference spectral collection developed from over six million peptide-spectrum matches acquired by liquid chromatography-mass spectrometry (LC-MS) over a wide range of collision energy. Conventional one-dimensional (1D) LC-MS was used for various digestion conditions and 20- and 24-fraction two-dimensional (2D) LC-MS studies permitted in-depth analyses of single digests. Computer methods were developed for automated analysis of LC-MS isotopic clusters to determine the attributes for all ions detected in the 1D and 2D studies. The library contains a selection of over 12,600 high-quality tandem spectra of more than 3,300 peptide ions identified and validated by accurate mass, differential elution pattern, and expected peptide classes in peptide map experiments. These include a variety of biologically modified peptide spectra involving glycosylated, oxidized, deamidated, glycated, and N/C-terminal modified peptides, as well as artifacts. A complete glycation profile was obtained for the NISTmAb with spectra for 58% and 100% of all possible glycation sites in the heavy and light chains, respectively. The site-specific quantification of methionine oxidation in the protein is described. The utility of this reference library is demonstrated by the analysis of a commercial monoclonal antibody (adalimumab, Humira®), where 691 peptide ion spectra are identifiable in the constant regions, accounting for 60% coverage for both heavy and light chains. The NIST reference library platform may be used as a tool for facile identification of the primary sequence and post-translational modifications, as well as the recognition of LC-MS method-induced artifacts for human and recombinant IgG antibodies. 
    Its development also provides a general method for creating comprehensive peptide libraries of individual proteins. PMID:29425077

  2. Improving prokaryotic transposable elements identification using a combination of de novo and profile HMM methods.

    PubMed

    Kamoun, Choumouss; Payen, Thibaut; Hua-Van, Aurélie; Filée, Jonathan

    2013-10-11

    Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes inducing duplication, deletion, rearrangement or lateral gene transfers. Although ISs and MITEs are relatively simple and basic genetic elements, their detection remains a difficult task due to their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, the development of fast, reliable and sensitive methods of IS and MITE detection has become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection methods against reference libraries. Here we introduce alternative methods of detection, either taking advantage of the structural properties of the elements (de novo methods) or using an additional library-based method using profile HMM searches. In this study, we have developed three different workflows dedicated to IS and MITE detection: the first two use de novo methods detecting either repeated sequences or the presence of Inverted Repeats; the third uses 28 in-house transposase alignment profiles with HMM search methods. We have compared the respective performances of each method using a reference dataset of 30 archaeal and 30 bacterial genomes in addition to simulated and real metagenomes. Compared to a BLAST-based method using ISFinder as library, de novo methods significantly improve IS and MITE detection. For example, in the 30 archaeal genomes, we discovered 30 new elements (+20%) in addition to the 141 multi-copy elements already detected by the BLAST approach. Many of the new elements correspond to ISs belonging to unknown or highly divergent families. The total number of MITEs has even doubled with the discovery of elements displaying very limited sequence similarities with their respective autonomous partners (mainly in the Inverted Repeats of the elements). Concerning metagenomes, with the exception of short-read data (<300 bp), for which both techniques seem equally limited, profile HMM searches considerably improve the detection of transposase-encoding genes (up to +50%) while generating a low level of false positives compared to BLAST-based methods. Compared to classical BLAST-based methods, the sensitivity of the de novo and profile HMM methods developed in this study allows a better and more reliable detection of transposons in prokaryotic genomes and metagenomes. We believe that future studies involving IS and MITE identification in genomic data should combine at least one de novo and one library-based method, with optimal results obtained by running the two de novo methods in addition to a library-based search. For metagenomic data, profile HMM search should be favored; a BLAST-based step is only useful for the final annotation into groups and families.

  3. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    PubMed Central

    Matochko, Wadim L.; Derda, Ratmir

    2013-01-01

    Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N × 1 frequency vector n = ||n_i||, where n_i is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation to the library is an operator acting on n. Selection, amplification, or sequencing could be described as a product of an N × N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = Sa I_N, where I_N is an N × N unity matrix. Any bias in sequencing changes I_N to a nonunity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process. PMID:24416071
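
    The operator picture in this abstract can be sketched in a few lines of Python. This is only one possible reading, not the authors' implementation: it assumes that the sampling operator draws a fixed number of reads in proportion to copy number, that its diagonal entries are the ratios of sampled to original counts, and that censorship is a diagonal of factors between 0 and 1 applied before sampling. All function names are hypothetical.

```python
import random


def apply_diagonal(diag, vec):
    """Multiply a diagonal matrix (stored as its diagonal) by a vector."""
    return [d * v for d, v in zip(diag, vec)]


def sampling_diagonal(vec, depth, rng):
    """Assumed form of the sampling operator: draw `depth` reads in
    proportion to `vec` and return the diagonal d such that d[i] * vec[i]
    equals the sampled count of sequence i."""
    counts = [0] * len(vec)
    for i in rng.choices(range(len(vec)), weights=vec, k=depth):
        counts[i] += 1
    return [c / v if v else 0.0 for c, v in zip(counts, vec)]


def sequence(n, censorship, depth, rng):
    """Apply censorship, then stochastic sampling, to the library vector n."""
    censored = apply_diagonal(censorship, n)
    sampling = sampling_diagonal(censored, depth, rng)
    return apply_diagonal(sampling, censored)


if __name__ == "__main__":
    rng = random.Random(0)
    library = [100, 50, 25, 0]          # copy numbers n_i
    unbiased = [1.0, 1.0, 1.0, 1.0]     # diagonal of the unity matrix
    censored = [1.0, 1.0, 0.0, 1.0]     # third sequence is never read
    print(sequence(library, unbiased, 1000, rng))
    print(sequence(library, censored, 1000, rng))
```

    Storing each operator as its diagonal keeps the sketch small; with the unity diagonal the sampled reads track the library composition, while a zero entry in the censorship diagonal removes a sequence regardless of its abundance.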

  4. Assessing DNA Barcodes for Species Identification in North American Reptiles and Amphibians in Natural History Collections.

    PubMed

    Chambers, E Anne; Hebert, Paul D N

    2016-01-01

    High rates of species discovery and loss have led to the urgent need for more rapid assessment of species diversity in the herpetofauna. DNA barcoding allows for the preliminary identification of species based on sequence divergence. Prior DNA barcoding work on reptiles and amphibians has revealed higher biodiversity counts than previously estimated due to cases of cryptic and undiscovered species. Past studies have provided DNA barcodes for just 14% of the North American herpetofauna, revealing the need for expanded coverage. This study extends the DNA barcode reference library for North American herpetofauna, assesses the utility of this approach in aiding species delimitation, and examines the correspondence between current species boundaries and sequence clusters designated by the BIN system. Sequences were obtained from 730 specimens, representing 274 species (43%) from the North American herpetofauna. Mean intraspecific divergences were 1% and 3%, while average congeneric sequence divergences were 16% and 14% in amphibians and reptiles, respectively. BIN assignments corresponded with current species boundaries in 79% of amphibians, 100% of turtles, and 60% of squamates. Deep divergences (>2%) were noted in 35% of squamate and 16% of amphibian species, and low divergences (<2%) occurred in 12% of reptiles and 23% of amphibians, patterns reflected in BIN assignments. Sequence recovery declined with specimen age, and variation in recovery success was noted among collections. Within collections, barcodes effectively flagged seven mislabeled tissues, and barcode fragments were recovered from five formalin-fixed specimens. This study demonstrates that DNA barcodes can effectively flag errors in museum collections, while BIN splits and merges reveal taxa belonging to deeply diverged or hybridizing lineages. This study is the first effort to compile a reference library of DNA barcodes for herpetofauna on a continental scale.

  5. Assessing DNA Barcodes for Species Identification in North American Reptiles and Amphibians in Natural History Collections

    PubMed Central

    Chambers, E. Anne; Hebert, Paul D. N.

    2016-01-01

    Background High rates of species discovery and loss have led to the urgent need for more rapid assessment of species diversity in the herpetofauna. DNA barcoding allows for the preliminary identification of species based on sequence divergence. Prior DNA barcoding work on reptiles and amphibians has revealed higher biodiversity counts than previously estimated due to cases of cryptic and undiscovered species. Past studies have provided DNA barcodes for just 14% of the North American herpetofauna, revealing the need for expanded coverage. Methodology/Principal Findings This study extends the DNA barcode reference library for North American herpetofauna, assesses the utility of this approach in aiding species delimitation, and examines the correspondence between current species boundaries and sequence clusters designated by the BIN system. Sequences were obtained from 730 specimens, representing 274 species (43%) from the North American herpetofauna. Mean intraspecific divergences were 1% and 3%, while average congeneric sequence divergences were 16% and 14% in amphibians and reptiles, respectively. BIN assignments corresponded with current species boundaries in 79% of amphibians, 100% of turtles, and 60% of squamates. Deep divergences (>2%) were noted in 35% of squamate and 16% of amphibian species, and low divergences (<2%) occurred in 12% of reptiles and 23% of amphibians, patterns reflected in BIN assignments. Sequence recovery declined with specimen age, and variation in recovery success was noted among collections. Within collections, barcodes effectively flagged seven mislabeled tissues, and barcode fragments were recovered from five formalin-fixed specimens. Conclusions/Significance This study demonstrates that DNA barcodes can effectively flag errors in museum collections, while BIN splits and merges reveal taxa belonging to deeply diverged or hybridizing lineages. 
    This study is the first effort to compile a reference library of DNA barcodes for herpetofauna on a continental scale. PMID:27116180

  6. Genome-Wide Analysis of Differentially Expressed Genes Relevant to Rhizome Formation in Lotus Root (Nelumbo nucifera Gaertn)

    PubMed Central

    Yin, Jingjing; Li, Liangjun; Chen, Xuehao

    2013-01-01

    Lotus root is a popular wetland vegetable that produces an edible rhizome. At the molecular level, the regulation of rhizome formation is very complex and has not been sufficiently addressed in research. In this study, to identify differentially expressed genes (DEGs) in lotus root, four libraries (L1: stolon stage; L2: initial swelling stage; L3: middle swelling stage; L4: later swelling stage) were constructed from the rhizome development stages. A high-throughput tag-sequencing technique based on the Solexa Genome Analyzer platform was used. Approximately 5.0 million tags were sequenced, and 4542104, 4474755, 4777919, and 4750348 clean tags, including 151282, 137476, 215872, and 166005 distinct tags, were obtained from the respective libraries after removal of low-quality tags. More than 43% of distinct tags were unambiguous tags mapping to the reference genes, and 40% were unambiguous tag-mapped genes. From L1, L2, L3, and L4, a total of 20471, 18785, 23448, and 21778 genes, respectively, were annotated after mapping their functions in existing databases. Gene expression profiles in the L1/L2, L2/L3, and L3/L4 libraries differed for most of the 20 selected DEGs. Most of the DEGs in the L1/L2 libraries were relevant to fiber development and stress response, while in the L2/L3 and L3/L4 libraries the majority of the DEGs were involved in energy metabolism and storage. All up-regulated transcription factors and 14 important rhizome formation-related genes in the four libraries were also identified. In addition, the expression of 9 genes from the identified DEGs was examined by qRT-PCR. In summary, this study provides a comprehensive understanding of gene expression during rhizome formation in lotus root. PMID:23840598

  7. Digital reference service: trends in academic health science libraries.

    PubMed

    Dee, Cheryl R

    2005-01-01

    Two years after the initial 2002 study, a greater number of academic health science libraries are offering digital reference chat services, and this number appears poised to grow in the coming years. This 2004 follow-up study found that 36 (27%) of the academic health science libraries examined provide digital chat reference services; this was an increase of approximately 6 percentage points over the 25 libraries (21%) located in 2002. Trends in digital reference services in academic health science libraries were derived from the exploration of academic health science library Web sites and from digital correspondence with academic health science library personnel using e-mail and chat. This article presents an overview of the current state of digital reference service in academic health science libraries.

  8. AmpliVar: mutation detection in high-throughput sequence from amplicon-based libraries.

    PubMed

    Hsu, Arthur L; Kondrashova, Olga; Lunke, Sebastian; Love, Clare J; Meldrum, Cliff; Marquis-Nicholson, Renate; Corboy, Greg; Pham, Kym; Wakefield, Matthew; Waring, Paul M; Taylor, Graham R

    2015-04-01

    Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data. © 2015 WILEY PERIODICALS, INC.
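
    The grouping step described here (hashing reads, assigning them to amplicons via a table of flanking sequences, and discarding low-abundance reads before alignment) might look roughly like the following Python sketch. It is not AmpliVar's code: the function name, the exact prefix/suffix matching of flanks, and the `min_count` parameter are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def group_reads_by_amplicon(reads, flanks, min_count=2):
    """Group identical reads into amplicons before any alignment.

    reads     -- iterable of read sequences
    flanks    -- {amplicon_name: (left_flank, right_flank)} (hypothetical table)
    min_count -- reads seen fewer times than this are dropped
    """
    # Key-value hash: read sequence -> number of copies observed.
    counts = Counter(reads)

    groups = defaultdict(dict)
    for seq, count in counts.items():
        if count < min_count:
            continue  # low-abundance read, likely sequencing noise
        for name, (left, right) in flanks.items():
            # Assumption: a read belongs to an amplicon when it starts and
            # ends with that amplicon's flanking sequences.
            if seq.startswith(left) and seq.endswith(right):
                groups[name][seq] = count
                break
    return dict(groups)


if __name__ == "__main__":
    reads = ["AAACCCGGG"] * 3 + ["AAACTCGGG"] * 2 + ["AAACCCGGT"] + ["TTTACGCCC"] * 2
    flanks = {"amp1": ("AAA", "GGG"), "amp2": ("TTT", "CCC")}
    print(group_reads_by_amplicon(reads, flanks))
```

    Because each distinct sequence is stored once with its count, only the unique sequences per amplicon would need to be passed to a slower, more sensitive aligner.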

  9. Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries.

    PubMed

    Lam, Kathy N; Charles, Trevor C

    2015-01-01

    Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role. To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ(70) promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ(70) consensus sequences in the genome than with simple GC content. The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ(70) consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. 
    Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.

  10. R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring.

    PubMed

    Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès

    2016-01-01

    Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which are time-consuming and prone to identification uncertainties. The use of DNA-barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are identified as certain diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematics supported by INRA (French National Institute for Agricultural Research), see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA-barcodes to their taxonomical identifications, and is dedicated to identifying barcodes from natural samples. The data come from two sources: a culture collection of freshwater algae maintained at INRA, in which new strains are regularly deposited and barcoded, and the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database: 18S (18S ribosomal RNA) and rbcL (Ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (Blast, classical phylogenies) and up-to-date taxonomy (catalogues and peer-reviewed papers). R-Syst::diatom is updated every 6 months. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/. © The Author(s) 2016. Published by Oxford University Press.

  11. R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring

    PubMed Central

    Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès

    2016-01-01

    Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which are time-consuming and prone to identification uncertainties. The use of DNA-barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are identified as certain diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematics supported by INRA (French National Institute for Agricultural Research), see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA-barcodes to their taxonomical identifications, and is dedicated to identifying barcodes from natural samples. The data come from two sources: a culture collection of freshwater algae maintained at INRA, in which new strains are regularly deposited and barcoded, and the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database: 18S (18S ribosomal RNA) and rbcL (Ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (Blast, classical phylogenies) and up-to-date taxonomy (catalogues and peer-reviewed papers). R-Syst::diatom is updated every 6 months. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/ PMID:26989149

  12. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
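
    For scale, the library sizes quoted above can be compared with the number of all possible primers of each length, which is 4 to the power of the primer length. The short Python sketch below does that arithmetic; the function name is hypothetical, and the comparison is an added illustration rather than a calculation taken from the patent.

```python
def library_fraction(primer_length, library_size):
    """Fraction of all 4**primer_length possible primers that a library of
    `library_size` primers contains."""
    return library_size / 4 ** primer_length


if __name__ == "__main__":
    # (primer length, library size) pairs quoted in the abstract
    for length, size in [(8, 10_000), (9, 14_200), (10, 44_000)]:
        total = 4 ** length
        print(f"{length}-mers: {size} of {total} possible "
              f"({library_fraction(length, size):.1%})")
```

    Under this count, the quoted libraries would cover roughly 15% of all octamers, 5.4% of all nonamers, and 4.2% of all decamers.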

  13. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.

  14. Spectra library assisted de novo peptide sequencing for HCD and ETD spectra pairs.

    PubMed

    Yan, Yan; Zhang, Kaizhong

    2016-12-23

    De novo peptide sequencing via tandem mass spectrometry (MS/MS) has developed rapidly in recent years. With the use of spectra pairs from the same peptide under different fragmentation modes, the performance of de novo sequencing is greatly improved. Currently, with large numbers of spectra sequenced every day, spectra libraries containing tens of thousands of annotated experimental MS/MS spectra have become available. These libraries provide information on spectra properties and thus have the potential to be used with de novo sequencing to improve its performance. In this study, an improved de novo sequencing method assisted by a spectra library is proposed. It uses spectra libraries as training datasets and introduces significance scores for the features used in our previous de novo sequencing method for HCD and ETD spectra pairs. Two pairs of HCD and ETD spectral datasets were used to test the performance of the proposed method and our previous method. The results show that the proposed method achieves better sequencing accuracy, with higher-ranked correct sequences and less computational time. This paper proposes an advanced de novo sequencing method for HCD and ETD spectra pairs that uses information from spectra libraries and significantly improves on previous similar methods.

  15. Rethinking Virtual Reference

    ERIC Educational Resources Information Center

    Tenopir, Carol

    2004-01-01

    Virtual reference services seem a natural extension of libraries' digital collections and the emphasis on access to the library anytime, anywhere. If patrons use the library from home, it makes sense to provide them with person-to-person online reference. The Library of Congress (LC), OCLC, and several large library systems have developed and…

  16. The ‘thousand-dollar genome’: an ethical exploration

    PubMed Central

    Dondorp, Wybo J; de Wert, Guido M W R

    2013-01-01

    Sequencing an individual's complete genome is expected to be possible for a relatively low sum, ‘one thousand dollars’, within a few years. Sequencing refers to determining the order of base pairs that make up the genome. The result is a library of three billion letter combinations. Cheap whole-genome sequencing is of greatest importance to medical scientific research. Comparing individual complete genomes will lead to a better understanding of the contribution genetic variation makes to health and disease. As knowledge increases, the ‘thousand-dollar genome’ will also become increasingly important to healthcare. The applications that come within reach raise a number of ethical questions. This monitoring report addresses the issue. PMID:23677179

  17. Immune-Related Transcriptome of Coptotermes formosanus Shiraki Workers: The Defense Mechanism

    PubMed Central

    Hussain, Abid; Li, Yi-Feng; Cheng, Yu; Liu, Yang; Chen, Chuan-Cheng; Wen, Shuo-Yang

    2013-01-01

    Formosan subterranean termites, Coptotermes formosanus Shiraki, live socially in microbe-rich habitats. To understand the molecular mechanism by which termites combat pathogenic microbes, a full-length normalized cDNA library and four Suppression Subtractive Hybridization (SSH) libraries were constructed from termite workers infected with entomopathogenic fungi (Metarhizium anisopliae and Beauveria bassiana), Gram-positive Bacillus thuringiensis and Gram-negative Escherichia coli, and the libraries were analyzed. From the high-quality normalized cDNA library, 439 immune-related sequences were identified. These sequences were categorized as pattern recognition receptors (47 sequences), signal modulators (52 sequences), signal transducers (137 sequences), effectors (39 sequences) and others (164 sequences). From the SSH libraries, 27, 17, 22 and 15 immune-related genes were identified from each SSH library treated with M. anisopliae, B. bassiana, B. thuringiensis and E. coli, respectively. When the normalized cDNA library was compared with the SSH libraries, 37 immune-related clusters were found in common; 56 clusters were identified in the SSH libraries, and 259 were identified in the normalized cDNA library. The immune-related gene expression pattern was further investigated using quantitative real-time PCR (qPCR). Important immune-related genes were characterized, and their potential functions were discussed based on the integrated analysis of the results. We suggest that normalized cDNA and SSH libraries enable the discovery of functional genes in the transcriptome. The results remarkably expand our knowledge about immune-inducible genes in C. formosanus Shiraki and enable the future development of novel control strategies for the management of Formosan subterranean termites. PMID:23874972

  18. Capturing the 'ome': the expanding molecular toolbox for RNA and DNA library construction.

    PubMed

    Boone, Morgane; De Koker, Andries; Callewaert, Nico

    2018-04-06

    All sequencing experiments and most functional genomics screens rely on the generation of libraries to comprehensively capture pools of targeted sequences. In the past decade especially, driven by the progress in the field of massively parallel sequencing, numerous studies have comprehensively assessed the impact of particular manipulations on library complexity and quality, and characterized the activities and specificities of several key enzymes used in library construction. Fortunately, careful protocol design and reagent choice can substantially mitigate many of these biases, and enable reliable representation of sequences in libraries. This review aims to guide the reader through the vast expanse of literature on the subject to promote informed library generation, independent of the application.

  19. Use of the melting curve assay as a means for high-throughput quantification of Illumina sequencing libraries.

    PubMed

    Shinozuka, Hiroshi; Forster, John W

    2016-01-01

    Background. Multiplexed sequencing is commonly performed on massively parallel short-read sequencing platforms such as Illumina, and the efficiency of library normalisation can affect the quality of the output dataset. Although several library normalisation approaches have been established, none are ideal for highly multiplexed sequencing due to issues of cost and/or processing time. Methods. An inexpensive and high-throughput library quantification method has been developed, based on an adaptation of the melting curve assay. Sequencing libraries were subjected to the assay using the Bio-Rad Laboratories CFX Connect™ Real-Time PCR Detection System. The library quantity was calculated through summation of reduction of relative fluorescence units between 86 and 95 °C. Results. PCR-enriched sequencing libraries are suitable for this quantification without pre-purification of DNA. Short DNA molecules, which ideally should be eliminated from the library for subsequent processing, were differentiated from the target DNA in a mixture on the basis of differences in melting temperature. Quantification results for long sequences targeted using the melting curve assay were correlated with those from existing methods (R² > 0.77), and that observed from MiSeq sequencing (R² = 0.82). Discussion. The results of multiplexed sequencing suggested that the normalisation performance of the described method is equivalent to that of another recently reported high-throughput bead-based method, BeNUS. However, costs for the melting curve assay are considerably lower and processing times shorter than those of other existing methods, suggesting greater suitability for highly multiplexed sequencing applications.
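
    The quantification step in this abstract (summing the reduction in relative fluorescence units between 86 and 95 °C) can be sketched roughly as below. This is only an illustration, not the authors' implementation: the function name `melt_quantity`, the input layout, and the decision to count only decreases between consecutive readings are assumptions, and the readings are invented.

```python
def melt_quantity(temps_c, rfu, low=86.0, high=95.0):
    """Sum the drops in RFU between consecutive readings taken inside [low, high] °C.

    Assumption: only decreases in fluorescence count towards the total.
    """
    # Keep only the readings whose temperature falls inside the window.
    window = [f for t, f in zip(temps_c, rfu) if low <= t <= high]
    total = 0.0
    for f_prev, f_next in zip(window, window[1:]):
        drop = f_prev - f_next
        if drop > 0:
            total += drop
    return total


# Invented melting-curve readings (temperature in °C, fluorescence in RFU).
temps = [84, 86, 88, 90, 92, 94, 96]
signal = [1000, 900, 700, 400, 200, 150, 140]
quantity = melt_quantity(temps, signal)
print(quantity)
```

    Restricting the sum to the high-temperature window is what lets short fragments, which melt at lower temperatures, drop out of the estimate.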

  20. Improving the Conservation of Mediterranean Chondrichthyans: The ELASMOMED DNA Barcode Reference Library

    PubMed Central

    Arculeo, Marco; Bonello, Juan J.; Bonnici, Leanne; Cannas, Rita; Carbonara, Pierluigi; Cau, Alessandro; Charilaou, Charis; El Ouamari, Najib; Fiorentino, Fabio; Follesa, Maria Cristina; Garofalo, Germana; Golani, Daniel; Guarniero, Ilaria; Hanner, Robert; Hemida, Farid; Kada, Omar; Lo Brutto, Sabrina; Mancusi, Cecilia; Morey, Gabriel; Schembri, Patrick J.; Serena, Fabrizio; Sion, Letizia; Stagioni, Marco; Tursi, Angelo; Vrgoc, Nedo; Steinke, Dirk; Tinti, Fausto

    2017-01-01

    Cartilaginous fish are particularly vulnerable to anthropogenic stressors and environmental change because of their K-selected reproductive strategy. Accurate data from scientific surveys and landings are essential to assess conservation status and to develop robust protection and management plans. Currently available data are often incomplete or incorrect as a result of inaccurate species identifications, due to a high level of morphological stasis, especially among closely related taxa. Moreover, several diagnostic characters clearly visible in adult specimens are less evident in juveniles. Here we present results generated by the ELASMOMED Consortium, a regional network aiming to sample and DNA-barcode the Mediterranean Chondrichthyans with the ultimate goal to provide a comprehensive DNA barcode reference library. This library will support and improve the molecular taxonomy of this group and the effectiveness of management and conservation measures. We successfully barcoded 882 individuals belonging to 42 species (17 sharks, 24 batoids and one chimaera), including four endemic and several threatened ones. Morphological misidentifications were found across most orders, further confirming the need for a comprehensive DNA barcoding library as a valuable tool for the reliable identification of specimens in support of taxonomists who are reviewing current identification keys. Despite low intraspecific variation among their barcode sequences and reduced sample size, five species showed preliminary evidence of phylogeographic structure. Overall, the ELASMOMED initiative further emphasizes the key role accurate DNA barcoding libraries play in establishing reliable diagnostic species-specific features in otherwise taxonomically problematic groups for biodiversity management and conservation actions. PMID:28107413

  1. Redefining reference in an academic health sciences library: planning for change.

    PubMed

    Gray, S A; Brower, S; Munger, H; Start, A; White, P

    2001-01-01

    Deciding that changes in the pattern of questions at the reference desk required focused consideration, the reference librarians at the Health Sciences Library of the University at Buffalo held a planning retreat. Technology-induced changes in the information-seeking behavior and reference needs of the library's clientele caused a reassessment of how these needs could best be met and what is the best use of librarians' time. The librarians considered current trends in reference in other academic libraries, the specific needs of the clientele of the Health Sciences Library, and the strengths and expertise of the library staff. The results of this structured discussion produced ideas for redefining reference to provide customized services for the clients and environment.

  2. A comprehensive DNA barcode database for Central European beetles with a focus on Germany: adding more than 3500 identified species to BOLD.

    PubMed

    Hendrich, Lars; Morinière, Jérôme; Haszprunar, Gerhard; Hebert, Paul D N; Hausmann, Axel; Köhler, Frank; Balke, Michael

    2015-07-01

    Beetles are the most diverse group of animals and are crucial for ecosystem functioning. In many countries, they are well established for environmental impact assessment, but even in the well-studied Central European fauna, species identification can be very difficult. A comprehensive and taxonomically well-curated DNA barcode library could remedy this deficit and could also link hundreds of years of traditional knowledge with next generation sequencing technology. However, such a beetle library is missing to date. This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well-identified species (53% of the German fauna) with representatives from 97 of 103 families (94%). This study is the first comprehensive regional test of the efficiency of DNA barcoding for beetles with a focus on Germany. Sequences ≥500 bp were recovered from 63% of the specimens analysed (15 948 of 25 294) with short sequences from another 997 specimens. Whereas most specimens (92.2%) could be unambiguously assigned to a single known species by sequence diversity at CO1, 1089 specimens (6.8%) were assigned to more than one Barcode Index Number (BIN), creating 395 BINs which need further study to ascertain if they represent cryptic species, mitochondrial introgression, or simply regional variation in widespread species. We found 409 specimens (2.6%) that shared a BIN assignment with another species, most involving a pair of closely allied species as 43 BINs were involved. Most of these taxa were separated by barcodes although sequence divergences were low. Only 155 specimens (0.97%) show identical or overlapping clusters. © 2014 John Wiley & Sons Ltd.

  3. A large synthetic peptide and phosphopeptide reference library for mass spectrometry-based proteomics.

    PubMed

    Marx, Harald; Lemeer, Simone; Schliep, Jan Erik; Matheron, Lucrece; Mohammed, Shabaz; Cox, Jürgen; Mann, Matthias; Heck, Albert J R; Kuster, Bernhard

    2013-06-01

    We present a peptide library and data resource of >100,000 synthetic, unmodified peptides and their phosphorylated counterparts with known sequences and phosphorylation sites. Analysis of the library by mass spectrometry yielded a data set that we used to evaluate the merits of different search engines (Mascot and Andromeda) and fragmentation methods (beam-type collision-induced dissociation (HCD) and electron transfer dissociation (ETD)) for peptide identification. We also compared the sensitivities and accuracies of phosphorylation-site localization tools (Mascot Delta Score, PTM score and phosphoRS), and we characterized the chromatographic behavior of peptides in the library. We found that HCD identified more peptides and phosphopeptides than did ETD, that phosphopeptides generally eluted later from reversed-phase columns and were easier to identify than unmodified peptides and that current computational tools for proteomics can still be substantially improved. These peptides and spectra will facilitate the development, evaluation and improvement of experimental and computational proteomic strategies, such as separation techniques and the prediction of retention times and fragmentation patterns.

  4. Construction of random sheared fosmid library from Chinese cabbage and its use for Brassica rapa genome sequencing project.

    PubMed

    Park, Tae-Ho; Park, Beom-Seok; Kim, Jin-A; Hong, Joon Ki; Jin, Mina; Seol, Young-Joo; Mun, Jeong-Hwan

    2011-01-01

    As a part of the Multinational Genome Sequencing Project of Brassica rapa, linkage groups R9 and R3 were sequenced using a bacterial artificial chromosome (BAC) by BAC strategy. The current physical contigs are expected to cover approximately 90% of the euchromatin of both chromosomes. As the project progresses, BAC selection for sequence extension becomes more limited because BAC libraries are restriction enzyme-specific. To support the project, a random sheared fosmid library was constructed. The library consists of 97536 clones with an average insert size of approximately 40 kb, corresponding to seven genome equivalents, assuming a Chinese cabbage genome size of 550 Mb. The library was screened with primers designed at the ends of sequences at nine scaffold gaps where BAC clones cannot be selected to extend the physical contigs. The selected positive clones were end-sequenced to check the overlap between the fosmid clones and the adjacent BAC clones. Nine fosmid clones were selected and fully sequenced. The sequences revealed two complete gap fillings and seven sequence extensions, which can be used for further selection of BAC clones, confirming that the fosmid library will facilitate the sequence completion of B. rapa. Copyright © 2011. Published by Elsevier Ltd.
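
    The "seven genome equivalents" figure follows directly from the clone count, the average insert size and the assumed genome size quoted in the abstract. A minimal check of that arithmetic is sketched below; the function name `genome_equivalents` is a hypothetical helper, not something from the paper.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_mb):
    """Total cloned DNA divided by genome size (fold coverage of the library)."""
    total_cloned_mb = n_clones * mean_insert_kb / 1000.0  # kb -> Mb
    return total_cloned_mb / genome_size_mb


# Values taken from the abstract: 97536 clones, ~40 kb inserts, 550 Mb genome.
coverage = genome_equivalents(97536, 40, 550)
print(f"{coverage:.2f} genome equivalents")  # prints "7.09 genome equivalents"
```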

  5. BrAD-seq: Breath Adapter Directional sequencing: a streamlined, ultra-simple and fast library preparation protocol for strand specific mRNA library construction.

    PubMed

    Townsley, Brad T; Covington, Michael F; Ichihashi, Yasunori; Zumstein, Kristina; Sinha, Neelima R

    2015-01-01

    Next Generation Sequencing (NGS) is driving rapid advancement in biological understanding and RNA-sequencing (RNA-seq) has become an indispensable tool for biology and medicine. There is a growing need for access to these technologies although preparation of NGS libraries remains a bottleneck to wider adoption. Here we report a novel method for the production of strand specific RNA-seq libraries utilizing the terminal breathing of double-stranded cDNA to capture and incorporate a sequencing adapter. Breath Adapter Directional sequencing (BrAD-seq) reduces sample handling and requires far fewer enzymatic steps than most available methods to produce high quality strand-specific RNA-seq libraries. The method we present is optimized for 3-prime Digital Gene Expression (DGE) libraries and can easily extend to full transcript coverage shotgun (SHO) type strand-specific libraries and is modularized to accommodate a diversity of RNA and DNA input materials. BrAD-seq offers a highly streamlined and inexpensive option for RNA-seq libraries.

  6. Robust Sub-nanomolar Library Preparation for High Throughput Next Generation Sequencing.

    PubMed

    Wu, Wells W; Phue, Je-Nie; Lee, Chun-Ting; Lin, Changyi; Xu, Lai; Wang, Rong; Zhang, Yaqin; Shen, Rong-Fong

    2018-05-04

    Current library preparation protocols for Illumina HiSeq and MiSeq DNA sequencers require ≥2 nM initial library for subsequent loading of denatured cDNA onto flow cells. Such amounts are not always attainable from samples having a relatively low DNA or RNA input; or those for which a limited number of PCR amplification cycles is preferred (less PCR bias and/or more even coverage). A well-tested sub-nanomolar library preparation protocol for Illumina sequencers has however not been reported. The aim of this study is to provide a much needed working protocol for sub-nanomolar libraries to achieve outcomes as informative as those obtained with the higher library input (≥ 2 nM) recommended by Illumina's protocols. Extensive studies were conducted to validate a robust sub-nanomolar (initial library of 100 pM) protocol using PhiX DNA (as a control), genomic DNA (Bordetella bronchiseptica and microbial mock community B for 16S rRNA gene sequencing), messenger RNA, microRNA, and other small noncoding RNA samples. The utility of our protocol was further explored for PhiX library concentrations as low as 25 pM, which generated only slightly fewer than 50% of the reads achieved under the standard Illumina protocol starting with > 2 nM. A sub-nanomolar library preparation protocol (100 pM) could generate next generation sequencing (NGS) results as robust as the standard Illumina protocol. Following the sub-nanomolar protocol, libraries with initial concentrations as low as 25 pM could also be sequenced to yield satisfactory and reproducible sequencing results.

  7. Allele Identification for Transcriptome-Based Population Genomics in the Invasive Plant Centaurea solstitialis

    PubMed Central

    Dlugosch, Katrina M.; Lai, Zhao; Bonin, Aurélie; Hierro, José; Rieseberg, Loren H.

    2013-01-01

    Transcriptome sequences are becoming more broadly available for multiple individuals of the same species, providing opportunities to derive population genomic information from these datasets. Using the 454 Life Science Genome Sequencer FLX and FLX-Titanium next-generation platforms, we generated 11−430 Mbp of sequence for normalized cDNA for 40 wild genotypes of the invasive plant Centaurea solstitialis, yellow starthistle, from across its worldwide distribution. We examined the impact of sequencing effort on transcriptome recovery and overlap among individuals. To do this, we developed two novel publicly available software pipelines: SnoWhite for read cleaning before assembly, and AllelePipe for clustering of loci and allele identification in assembled datasets with or without a reference genome. AllelePipe is designed specifically for cases in which read depth information is not appropriate or available to assist with disentangling closely related paralogs from allelic variation, as in transcriptome or previously assembled libraries. We find that modest applications of sequencing effort recover most of the novel sequences present in the transcriptome of this species, including single-copy loci and a representative distribution of functional groups. In contrast, the coverage of variable sites, observation of heterozygosity, and overlap among different libraries are all highly dependent on sequencing effort. Nevertheless, the information gained from overlapping regions was informative regarding coarse population structure and variation across our small number of population samples, providing the first genetic evidence in support of hypothesized invasion scenarios. PMID:23390612

  8. Comparison of large-insert, small-insert and pyrosequencing libraries for metagenomic analysis.

    PubMed

    Danhorn, Thomas; Young, Curtis R; DeLong, Edward F

    2012-11-01

    The development of DNA sequencing methods for characterizing microbial communities has evolved rapidly over the past decades. To evaluate more traditional, as well as newer methodologies for DNA library preparation and sequencing, we compared fosmid, short-insert shotgun and 454 pyrosequencing libraries prepared from the same metagenomic DNA samples. GC content was elevated in all fosmid libraries, compared with shotgun and 454 libraries. Taxonomic composition of the different libraries suggested that this was caused by a relative underrepresentation of dominant taxonomic groups with low GC content, notably Prochlorales and the SAR11 cluster, in fosmid libraries. While these abundant taxa had a large impact on library representation, we also observed a positive correlation between taxon GC content and fosmid library representation in other low-GC taxa, suggesting a general trend. Analysis of gene category representation in different libraries indicated that the functional composition of a library was largely a reflection of its taxonomic composition, and no additional systematic biases against particular functional categories were detected at the level of sequencing depth in our samples. Another important but less predictable factor influencing the apparent taxonomic and functional library composition was the read length afforded by the different sequencing technologies. Our comparisons and analyses provide a detailed perspective on the influence of library type on the recovery of microbial taxa in metagenomic libraries and underscore the different uses and utilities of more traditional, as well as contemporary 'next-generation' DNA library construction and sequencing technologies for exploring the genomics of the natural microbial world.

  9. Transcriptome Analysis of the Scleractinian Coral Stylophora pistillata

    PubMed Central

    Salmon-Divon, Mali; Katzenellenbogen, Mark; Tambutté, Sylvie; Bertucci, Anthony; Hoegh-Guldberg, Ove; Deleury, Emeline; Allemand, Denis; Levy, Oren

    2014-01-01

    The principal architects of coral reefs are the scleractinian corals; these species are divided in two major clades referred to as “robust” and “complex” corals. Although the molecular diversity of the “complex” clade has received considerable attention, with several expressed sequence tag (EST) libraries and a complete genome sequence having been constructed, the “robust” corals have received far less attention, despite the fact that robust corals have been prominent focal points for ecological and physiological studies. Filling this gap affords important opportunities to extend these studies and to improve our understanding of the differences between the two major clades. Here, we present an EST library from Stylophora pistillata (Esper 1797) and systematically analyze the assembled transcripts compared to putative homologs from the complete proteomes of six well-characterized metazoans: Nematostella vectensis, Hydra magnipapillata, Caenorhabditis elegans, Drosophila melanogaster, Strongylocentrotus purpuratus, Ciona intestinalis and Homo sapiens. Furthermore, comparative analyses of the Stylophora pistillata ESTs were performed against several Cnidaria from the Scleractinia, Actiniaria and Hydrozoa, as well as against other stony corals separately. Functional characterization of S. pistillata transcripts into KOG/COG categories and further description of Wnt and bone morphogenetic protein (BMP) signaling pathways showed that the assembled EST library provides sufficient data and coverage. These features of this new library suggest considerable opportunities for extending our understanding of the molecular and physiological behavior of “robust” corals. PMID:24551124

  10. Simultaneous digital quantification and fluorescence-based size characterization of massively parallel sequencing libraries.

    PubMed

    Laurie, Matthew T; Bertout, Jessica A; Taylor, Sean D; Burton, Joshua N; Shendure, Jay A; Bielas, Jason H

    2013-08-01

    Due to the high cost of failed runs and suboptimal data yields, quantification and determination of fragment size range are crucial steps in the library preparation process for massively parallel sequencing (or next-generation sequencing). Current library quality control methods commonly involve quantification using real-time quantitative PCR and size determination using gel or capillary electrophoresis. These methods are laborious and subject to a number of significant limitations that can make library calibration unreliable. Herein, we propose and test an alternative method for quality control of sequencing libraries using droplet digital PCR (ddPCR). By exploiting a correlation we have discovered between droplet fluorescence and amplicon size, we achieve the joint quantification and size determination of target DNA with a single ddPCR assay. We demonstrate the accuracy and precision of applying this method to the preparation of sequencing libraries.

  11. Preparation of metagenomic libraries from naturally occurring marine viruses.

    PubMed

    Solonenko, Sergei A; Sullivan, Matthew B

    2013-01-01

    Microbes are now well recognized as major drivers of the biogeochemical cycling that fuels the Earth, and their viruses (phages) are known to be abundant and important in microbial mortality, horizontal gene transfer, and modulating microbial metabolic output. Investigation of environmental phages has been frustrated by an inability to culture the vast majority of naturally occurring diversity coupled with the lack of robust, quantitative, culture-independent methods for studying this uncultured majority. However, for double-stranded DNA phages, a quantitative viral metagenomic sample-to-sequence workflow now exists. Here, we review these advances with special emphasis on the technical details of preparing DNA sequencing libraries for metagenomic sequencing from environmentally relevant low-input DNA samples. Library preparation steps broadly involve manipulating the sample DNA by fragmentation, end repair and adaptor ligation, size fractionation, and amplification. One critical area of future research and development is parallel advances for alternate nucleic acid types such as single-stranded DNA and RNA viruses that are also abundant in nature. Combinations of recent advances in fragmentation (e.g., acoustic shearing and tagmentation), ligation reactions (adaptor-to-template ratio reference table availability), size fractionation (non-gel-sizing), and amplification (linear amplification for deep sequencing and linker amplification protocols) enhance our ability to generate quantitatively representative metagenomic datasets from low-input DNA samples. Such datasets are already providing new insights into the role of viruses in marine systems and will continue to do so as new environments are explored and synergies and paradigms emerge from large-scale comparative analyses. © 2013 Elsevier Inc. All rights reserved.

  12. A genomic library-based amplification approach (GL-PCR) for the mapping of multiple IS6110 insertion sites and strain differentiation of Mycobacterium tuberculosis.

    PubMed

    Namouchi, Amine; Mardassi, Helmi

    2006-11-01

    Evidence suggests that insertion of the IS6110 element is not without consequence to the biology of Mycobacterium tuberculosis complex strains. Thus, mapping of multiple IS6110 insertion sites in the genome of biomedically relevant clinical isolates would result in a better understanding of the role of this mobile element, particularly with regard to transmission, adaptability and virulence. In the present paper, we describe a versatile strategy, referred to as GL-PCR, that amplifies IS6110-flanking sequences based on the construction of a genomic library. M. tuberculosis chromosomal DNA is fully digested with HincII and then ligated into a plasmid vector between T7 and T3 promoter sequences. The ligation reaction product is transformed into Escherichia coli and selective PCR amplification targeting both 5' and 3' IS6110-flanking sequences is performed on the plasmid library DNA. For this purpose, four separate PCR reactions are performed, each combining an outward primer specific for one IS6110 end with either T7 or T3 primer. Determination of the nucleotide sequence of the PCR products generated from a single ligation reaction allowed mapping of 21 out of the 24 IS6110 copies of two 12-banded M. tuberculosis strains, yielding an overall sensitivity of 87.5%. Furthermore, by simply comparing the migration pattern of GL-PCR-generated products, the strategy proved to be as valuable as IS6110 RFLP for molecular typing of M. tuberculosis complex strains. Importantly, GL-PCR was able to discriminate between strains differing by a single IS6110 band.
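
    The four PCR reactions described in this abstract are simply every pairing of one IS6110 outward primer with one vector primer, and the reported sensitivity is the fraction of IS6110 copies mapped. The sketch below enumerates the pairings and recomputes the percentage; the primer labels are placeholder names chosen for illustration, not the authors' primer designations.

```python
from itertools import product

# Placeholder labels: one outward-facing primer per IS6110 end, plus the
# two vector primers flanking the cloning site.
is6110_outward_primers = ["IS6110_5prime_end", "IS6110_3prime_end"]
vector_primers = ["T7", "T3"]

# Each reaction combines one IS6110 outward primer with one vector primer.
reactions = list(product(is6110_outward_primers, vector_primers))
for is_primer, vec_primer in reactions:
    print(f"{is_primer} + {vec_primer}")

# Sensitivity as reported: 21 of 24 IS6110 copies were mapped.
mapped, total = 21, 24
sensitivity_pct = 100 * mapped / total
print(f"{sensitivity_pct:.1f}%")  # prints "87.5%"
```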

  13. Assessment of antibody library diversity through next generation sequencing and technical error compensation

    PubMed Central

    Lisi, Simonetta; Chirichella, Michele; Arisi, Ivan; Goracci, Martina; Cremisi, Federico; Cattaneo, Antonino

    2017-01-01

    Antibody libraries are important resources to derive antibodies to be used for a wide range of applications, from structural and functional studies to intracellular protein interference studies to developing new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its complexity (also known as diversity), i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody against a given antigen, of sufficiently high affinity. Quantitative evaluation of antibody library complexity and quality has long been inadequately addressed, due to the high similarity and length of the sequences of the library. Complexity was usually inferred by the transformation efficiency and tested either by fingerprinting and/or sequencing of a few hundred random library elements. Inferring complexity from such a small sampling is, however, very rudimentary and gives limited information about the real diversity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to tackle the antibody library complexity quality assessment. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations. To obtain a more reliable antibody library complexity estimate, here we show a new, PCR-free, NGS approach to sequence antibody libraries on the Illumina platform, coupled to a new bioinformatic analysis and software (Diversity Estimator of Antibody Library, DEAL) that allows the complexity to be estimated reliably, taking sequencing error into consideration. PMID:28505201

  14. Assessment of antibody library diversity through next generation sequencing and technical error compensation.

    PubMed

    Fantini, Marco; Pandolfini, Luca; Lisi, Simonetta; Chirichella, Michele; Arisi, Ivan; Terrigno, Marco; Goracci, Martina; Cremisi, Federico; Cattaneo, Antonino

    2017-01-01

    Antibody libraries are important resources to derive antibodies to be used for a wide range of applications, from structural and functional studies to intracellular protein interference studies to developing new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its complexity (also known as diversity), i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody against a given antigen, of sufficiently high affinity. Quantitative evaluation of antibody library complexity and quality has long been inadequately addressed, due to the high similarity and length of the sequences of the library. Complexity was usually inferred by the transformation efficiency and tested either by fingerprinting and/or sequencing of a few hundred random library elements. Inferring complexity from such a small sampling is, however, very rudimentary and gives limited information about the real diversity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to tackle the antibody library complexity quality assessment. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations. To obtain a more reliable antibody library complexity estimate, here we show a new, PCR-free, NGS approach to sequence antibody libraries on the Illumina platform, coupled to a new bioinformatic analysis and software (Diversity Estimator of Antibody Library, DEAL) that allows the complexity to be estimated reliably, taking sequencing error into consideration.
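
    The point that a few hundred sequenced clones say little about true complexity can be illustrated with a standard occupancy calculation. The sketch below is not the DEAL method from the abstract; it assumes, purely for illustration, a library whose members are equally abundant and clones drawn with replacement, and `expected_distinct` is a hypothetical helper name.

```python
import math


def expected_distinct(library_size, n_sampled):
    """Expected number of distinct members seen in n_sampled uniform draws
    (with replacement) from a library of library_size distinct members.

    Computes N * (1 - (1 - 1/N)**n), written with log1p/expm1 so that it
    stays accurate when N is very large.
    """
    return -library_size * math.expm1(n_sampled * math.log1p(-1.0 / library_size))


# With 300 sequenced clones, a 10^6 library and a 10^9 library look the same:
# almost every clone is unique in both cases.
for size in (10**6, 10**9):
    print(size, round(expected_distinct(size, 300), 3))
```

    Under these assumptions both libraries yield very nearly 300 distinct sequences, so the small sample cannot tell a thousand-fold difference in complexity apart. Differences only become visible once the sampling depth approaches the library size, which is what deep sequencing provides.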

  15. Prospective identification of parasitic sequences in phage display screens

    PubMed Central

    Matochko, Wadim L.; Cory Li, S.; Tang, Sindy K.Y.; Derda, Ratmir

    2014-01-01

    Phage display empowered the development of proteins with new function and ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10^9 random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the ‘parasites’), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of Ph.D.-7 library, we identified a focused population of 770 ‘parasites’. In all, 197 sequences from this population have been identified in literature reports that used Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ∼10^6 diverse clones prevents the biased selection of parasitic clones. PMID:24217917

  16. Transferring the Characteristics of Naturally Occurring and Biased Antibody Repertoires to Human Antibody Libraries by Trapping CDRH3 Sequences

    PubMed Central

    Venet, Sophie; Ravn, Ulla; Buatois, Vanessa; Gueneau, Franck; Calloud, Sébastien; Kosco-Vilbois, Marie; Fischer, Nicolas

    2012-01-01

    Antibody repertoires are characterized by diversity as they vary not only amongst individuals and post antigen exposure but also differ significantly between vertebrate species. Such plasticity can be exploited to generate human antibody libraries featuring hallmarks of these diverse repertoires. In this study, the focus was to capture CDRH3 sequences, as this region generally accounts for most of the interaction energy with antigen. Sequences from human as well as non-human sources were successfully integrated into human antibody libraries. Next generation sequencing of these libraries proved that the CDRH3 lengths and amino acid composition corresponded to the species of origin. Specific CDRH3 sequences, biased towards the recognition of a model antigen either by immunizing mice or by selecting with phage display, were then integrated into another set of libraries. From these antigen biased libraries, highly potent antibodies were more frequently isolated, indicating that the characteristics of an immune repertoire are transferable via CDRH3 sequences into a human antibody library. Taken together, these data demonstrate that the properties of naturally or experimentally biased repertoires can be effectively harnessed for the generation of targeted human antibody libraries, substantially increasing the probability of isolating antibodies suitable for therapeutic and diagnostic applications. PMID:22937053

  17. A method for the further assembly of targeted unigenes in a transcriptome after assembly by Trinity

    PubMed Central

    Xiao, Xinlong; Ma, Jinbiao; Sun, Yufang; Yao, Yinan

    2015-01-01

    RNA-sequencing has been widely used to obtain high throughput transcriptome sequences in various species, but the assembly of a full set of complete transcripts is still a significant challenge. Judging by the number of expected transcripts and assembled unigenes in a transcriptome library, we believe that some unigenes could be reassembled. In this study, using the nitrate transporter (NRT) gene family and phosphate transporter (PHT) gene family in Salicornia europaea as examples, we introduced an approach to further assemble unigenes found in transcriptome libraries which had been previously generated by Trinity. To find the unigenes of a particular transcript that contained gaps, we respectively selected 16 NRT candidate unigene pairs and 12 PHT candidate unigene pairs for which the two unigenes had the same annotations, the same expression patterns among various RNA-seq samples, and different positions of the proteins coded as mapped to a reference protein. To fill a gap between the two unigenes, PCR was performed using primers that mapped to the two unigenes and the PCR products were sequenced, which demonstrated that 5 unigene pairs of NRT and 3 unigene pairs of PHT could be reassembled when the gaps were filled using the corresponding PCR product sequences. This fast and simple method will reduce the redundancy of targeted unigenes and allow acquisition of complete coding sequences (CDS). PMID:26528307

  18. Suppression Subtractive Hybridization Reveals Transcript Profiling of Chlorella under Heterotrophy to Photoautotrophy Transition

    PubMed Central

    Huang, Jianke; Wang, Weiliang; Yin, Weibo; Hu, Zanmin; Li, Yuanguang

    2012-01-01

    Background: Microalgae have been extensively investigated and exploited because of their competitive nutritive bioproducts and biofuel production ability. Chlorella are green algae that can grow well heterotrophically and photoautotrophically. Previous studies proved that shifting from heterotrophy to photoautotrophy in light-induced environments causes photooxidative damage as well as distinct physiologic features that lead to dynamic changes in Chlorella intracellular components, which have great potential in algal health food and biofuel production. However, the molecular mechanisms underlying the trophic transition remain unclear. Methodology/Principal Findings: In this study, suppression subtractive hybridization strategy was employed to screen and characterize genes that are differentially expressed in response to the light-induced shift from heterotrophy to photoautotrophy. Expressed sequence tags (ESTs) were obtained from 770 and 803 randomly selected clones among the forward and reverse libraries, respectively. Sequence analysis identified 544 unique genes in the two libraries. The functional annotation of the assembled unigenes demonstrated that 164 (63.1%) from the forward library and 62 (21.8%) from the reverse showed significant similarities with the sequences in the NCBI non-redundant database. The time-course expression patterns of 38 selected differentially expressed genes further confirmed their responsiveness to a diverse trophic status. The majority of the genes enriched in the subtracted libraries were associated with energy metabolism, amino acid metabolism, protein synthesis, carbohydrate metabolism, and stress defense. Conclusions/Significance: The data presented here offer the first insights into the molecular foundation underlying the diverse microalgal trophic niche. In addition, the results can be used as a reference for unraveling candidate genes associated with the transition of Chlorella from heterotrophy to photoautotrophy, which holds great potential for further improving its lipid and nutrient production. PMID:23209737

  19. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids

    PubMed Central

    Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

    2010-01-01

    Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279–284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614

  20. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

    PubMed Central

    2011-01-01

    Background: BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. Results: This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. Conclusions: Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed. PMID:21794110

  1. Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes.

    PubMed

    Feltus, Frank A; Saski, Christopher A; Mockaitis, Keithanne; Haiminen, Niina; Parida, Laxmi; Smith, Zachary; Ford, James; Staton, Margaret E; Ficklin, Stephen P; Blackmon, Barbara P; Cheng, Chun-Huai; Schnell, Raymond J; Kuhn, David N; Motamayor, Juan-Carlos

    2011-07-27

    BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on linkage group 5 of the Theobroma cacao cv. Matina 1-6 genome. Using various mixtures of read coverages from paired-end and linear 454 libraries, multiple assemblies of varied quality were generated. Quality was assessed by comparing the assembly of 454 reads with a subset of ten BACs individually sequenced and assembled using Sanger reads. A mixture of reads optimal for assembly was identified. We found, furthermore, that a quality assembly suitable for serving as a reference genome template could be obtained even with a reduced depth of sequencing coverage. Annotation of the resulting assembly revealed several genes potentially responsible for three T. cacao traits: black pod disease resistance, bean shape index, and pod weight. Our results, as with other pooled BAC sequencing reports, suggest that pooling portions of a minimum tiling path derived from a BAC-based physical map is an effective method to target sub-genomic regions for sequencing. While we focused on a single QTL region, other QTL regions of importance could be similarly sequenced allowing for biological discovery to take place before a high quality whole-genome assembly is completed.

  2. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

    PubMed Central

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from multiple Agave sisalana tissues to increase the efficiency of normalization and maximize the number of independent genes, using the SMART™ method and duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and a knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
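
    The non-redundancy figure quoted in this abstract can be reproduced as the ratio of unigenes to sequenced clones. The short sketch below only checks that arithmetic; the function name is illustrative and not taken from the paper, and the definition of non-redundancy as unigenes divided by clones is an assumption inferred from the numbers given.

```python
def non_redundancy_percent(unigenes: int, clones: int) -> float:
    """Share of sequenced clones that represent distinct unigenes, in percent.

    Assumed definition: unigenes / clones. The abstract does not spell out
    the formula; this ratio happens to match the reported 85.7%.
    """
    if clones <= 0:
        raise ValueError("clones must be positive")
    return 100.0 * unigenes / clones


if __name__ == "__main__":
    # Figures quoted in the abstract: 3320 unigenes from 3875 cDNA clones.
    print(f"{non_redundancy_percent(3320, 3875):.1f}%")
```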

  3. Evaluating a Chat Reference Service at the University of South Alabama's Baugh Biomedical Library

    ERIC Educational Resources Information Center

    Clanton, Clista C.; Staggs, Geneva B.; Williams, Thomas L.

    2006-01-01

    The University of South Alabama's Baugh Biomedical Library recently initiated a chat reference service targeted at distance education students in the biomedical sciences. After one year of service, the library conducted an evaluation of the chat reference to assess the success of this mode of reference service. Both traditional reference and…

  4. Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

    PubMed

    Aigrain, Louise; Gu, Yong; Quail, Michael A

    2016-06-13

    The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications; the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with the ones combining several steps into a single one exhibiting final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in the library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.

  5. Next-generation sequencing library construction on a surface.

    PubMed

    Feng, Kuan; Costa, Justin; Edwards, Jeremy S

    2018-05-30

    Next-generation sequencing (NGS) has revolutionized almost all fields of biology, agriculture and medicine, and is widely utilized to analyse genetic variation. Over the past decade, the NGS pipeline has been steadily improved, and the entire process is currently relatively straightforward. However, NGS instrumentation still requires upfront library preparation, which can be a laborious process, requiring significant hands-on time. Herein, we present a simple but robust approach to streamline library preparation by utilizing surface bound transposases to construct DNA libraries directly on a flowcell surface. The surface bound transposases directly fragment genomic DNA while simultaneously attaching the library molecules to the flowcell. We sequenced and analysed a Drosophila genome library generated by this surface tagmentation approach, and we showed that our surface bound library quality was comparable to the quality of the library from a commercial kit. In addition to the time and cost savings, our approach does not require PCR amplification of the library, which eliminates potential problems associated with PCR duplicates. We described the first study to construct libraries directly on a flowcell. We believe our technique could be incorporated into the existing Illumina sequencing pipeline to simplify the workflow, reduce costs, and improve data quality.

  6. Reference Materials and Services for a Small Hospital Library. 5th Revised Edition.

    ERIC Educational Resources Information Center

    Kesti, Julie, Comp.; Graham, Elaine, Comp.

    This manual suggests and describes recommended reference services and sources for a small hospital library. Focusing on reference services, the first section includes information on ready-reference services; bibliographic search services, including taking and processing a request for a bibliography, National Library of Medicine literature…

  7. Factors Influencing Competence and Performance of Reference Librarians in Academic Libraries in Nigeria.

    ERIC Educational Resources Information Center

    Omoniyi, Joseph O.

    2002-01-01

    Reports on research that focused on competence and performance of reference librarians in academic libraries in Nigeria. Includes recommendations for improving competence and performance, including the reference process; staffing and other resources; education for reference librarians; knowledge of modern technology; and emotional stability and…

  8. DNA barcodes for bio-surveillance: regulated and economically important arthropod plant pests.

    PubMed

    Ashfaq, Muhammad; Hebert, Paul D N

    2016-11-01

    Many of the arthropod species that are important pests of agriculture and forestry are impossible to discriminate morphologically throughout all of their life stages. Some cannot be differentiated at any life stage. Over the past decade, DNA barcoding has gained increasing adoption as a tool to both identify known species and to reveal cryptic taxa. Although there has not been a focused effort to develop a barcode library for them, reference sequences are now available for 77% of the 409 species of arthropods documented on major pest databases. Aside from developing the reference library needed to guide specimen identifications, past barcode studies have revealed that a significant fraction of arthropod pests are a complex of allied taxa. Because of their importance as pests and disease vectors impacting global agriculture and forestry, DNA barcode results on these arthropods have significant implications for quarantine detection, regulation, and management. The current review discusses these implications in light of the presence of cryptic species in plant pests exposed by DNA barcoding.

  9. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    PubMed

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  10. Next-generation sequencing library preparation method for identification of RNA viruses on the Ion Torrent Sequencing Platform.

    PubMed

    Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng

    2018-05-09

    Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. Multiple NGS library preparation methods have been published for strand-specific RNA-seq, but some methods are not suitable for identifying and characterizing RNA viruses. In this study, we report an NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform and able to identify multiple species of RNA viruses in clinical samples.

  11. Normalization of environmental metagenomic DNA enhances the discovery of under-represented microbial community members.

    PubMed

    Ramond, J-B; Makhalanyane, T P; Tuffin, M I; Cowan, D A

    2015-04-01

    Normalization is a procedure classically employed to detect rare sequences in cellular expression profiles (i.e. cDNA libraries). Here, we present a normalization protocol involving the direct treatment of extracted environmental metagenomic DNA with S1 nuclease, referred to as normalization of metagenomic DNA: NmDNA. We demonstrate that NmDNA, prior to post hoc PCR-based experiments (16S rRNA gene T-RFLP fingerprinting and clone library), increased the diversity of sequences retrieved from environmental microbial communities by detection of rarer sequences. This approach could be used to enhance the resolution of detection of ecologically relevant rare members in environmental microbial assemblages and therefore is promising in enabling a better understanding of ecosystem functioning. This study is the first testing 'normalization' on environmental metagenomic DNA (mDNA). The aim of this procedure was to improve the identification of rare phylotypes in environmental communities. Using hypoliths as model systems, we present evidence that this post-mDNA extraction molecular procedure substantially enhances the detection of less common phylotypes and could even lead to the discovery of novel microbial genotypes within a given environment. © 2014 The Society for Applied Microbiology.

  12. Raising the Bar or Training Library Technicians To Assume Reference Responsibilities.

    ERIC Educational Resources Information Center

    Brandys, Barbara; Daghita, Joan; Whitmore, Susan

    This paper reports on a program at the National Institutes of Health (NIH) Library that was instituted to train library technicians to work at the Information Desk as Reference Assistants; the objectives of the program were to train library technicians to become reference assistants, to free up librarians' time for new work assignments, and to…

  13. The Implications of Library Anxiety for Academic Reference Services: A Review of the Literature

    ERIC Educational Resources Information Center

    Carlile, Heather

    2007-01-01

    Academic reference librarians continually observe that many students are embarrassed about not knowing how to use the library and are reluctant to approach the reference desk. The theory of library anxiety offers an explanation, proposing that a fear of being in and using libraries serves as a psychological barrier, hindering many university…

  14. A Survey of the Usability of Digital Reference Services on Academic Health Science Library Web Sites

    ERIC Educational Resources Information Center

    Dee, Cheryl; Allen, Maryellen

    2006-01-01

    Reference interactions with patrons in a digital library environment using digital reference services (DRS) have become widespread. However, such services in many libraries appear to be underutilized. A study surveying the ease and convenience of such services for patrons in over 100 academic health science library Web sites suggests that…

  15. Horse Racing at the Library: How One Library System Increased the Usage of Some of Its Online Databases

    ERIC Educational Resources Information Center

    Kurhan, Scott H.; Griffing, Elizabeth A.

    2011-01-01

    Reference services in public libraries are changing dramatically. The Internet, online databases, and shrinking budgets are all making it necessary for non-traditional reference staff to become familiar with online reference tools. Recognizing the need for cross-training, Chesapeake Public Library (CPL) developed a program called the Database…

  16. Eliminating traditional reference services in an academic health sciences library: a case study

    PubMed Central

    Schulte, Stephanie J

    2011-01-01

    Question: How were traditional librarian reference desk services successfully eliminated at one health sciences library? Setting: The analysis was done at an academic health sciences library at a major research university. Method: A gap analysis was performed, evaluating changes in the first eleven months through analysis of reference transaction and instructional session data. Main Results: Substantial increases were seen in the overall number of specialized reference transactions and those conducted by librarians lasting more than thirty minutes. The number of reference transactions overall increased after implementing the new model. Several new small-scale instructional initiatives began, though perhaps not directly related to the new model. Conclusion: Traditional reference desk services were eliminated at one academic health sciences library without negative impact on reference and instructional statistics. Eliminating ties to the confines of the physical library due to staffing reference desk hours removed one significant barrier to a more proactive liaison program. PMID:22022221

  17. A Deep-Coverage Tomato BAC Library and Prospects Toward Development of an STC Framework for Genome Sequencing

    PubMed Central

    Budiman, Muhammad A.; Mao, Long; Wood, Todd C.; Wing, Rod A.

    2000-01-01

    Recently a new strategy using BAC end sequences as sequence-tagged connectors (STCs) was proposed for whole-genome sequencing projects. In this study, we present the construction and detailed characterization of a 15.0 haploid genome equivalent BAC library for the cultivated tomato, Lycopersicon esculentum cv. Heinz 1706. The library contains 129,024 clones with an average insert size of 117.5 kb and a chloroplast content of 1.11%. BAC end sequences from 1490 ends were generated and analyzed as a preliminary evaluation for using this library to develop an STC framework to sequence the tomato genome. A total of 1205 BAC end sequences (80.9%) were obtained, with an average length of 360 high-quality bases, and were searched against the GenBank database. Using a cutoff expectation value of <10^-6, and combining the results from BLASTN, BLASTX, and TBLASTX searches, 24.3% of the BAC end sequences were similar to known sequences, of which almost half (48.7%) share sequence similarities to retrotransposons and 7% to known genes. Some of the transposable element sequences were the first reported in tomato, such as sequences similar to maize transposon Activator (Ac) ORF and tobacco pararetrovirus-like sequences. Interestingly, there were no BAC end sequences similar to the highly repeated TGRI and TGRII elements. However, the majority (70.3%) of STCs did not share significant sequence similarities to any sequences in GenBank at either the DNA or predicted protein levels, indicating that a large portion of the tomato genome is still unknown. Our data demonstrate that this BAC library is suitable for developing an STC database to sequence the tomato genome. The advantages of developing an STC framework for whole-genome sequencing of tomato are discussed. [The BAC end sequences described in this paper have been deposited in the GenBank data library under accession nos. AQ367111–AQ368361.] PMID:10645957
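
    The library statistics in this abstract are linked by simple arithmetic: total cloned DNA is the clone count times the mean insert size, and dividing by the number of haploid genome equivalents gives the genome size the coverage figure implies. The sketch below is a back-of-the-envelope check only. The abstract does not state the genome size used by the authors, nor whether chloroplast clones were discounted before computing coverage, so both the helper names and the optional organellar correction are assumptions.

```python
def total_insert_mb(clones: int, mean_insert_kb: float) -> float:
    """Total cloned DNA in megabases (clone count x mean insert size)."""
    return clones * mean_insert_kb / 1000.0


def implied_genome_mb(clones: int, mean_insert_kb: float,
                      genome_equivalents: float,
                      organellar_fraction: float = 0.0) -> float:
    """Haploid genome size implied by a stated coverage.

    organellar_fraction is an assumed correction that removes non-nuclear
    clones (e.g. chloroplast) before dividing by the coverage.
    """
    nuclear_mb = total_insert_mb(clones, mean_insert_kb) * (1.0 - organellar_fraction)
    return nuclear_mb / genome_equivalents


def percent(part: int, whole: int) -> float:
    """Simple percentage helper."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Figures quoted in the abstract.
    print(f"cloned DNA: {total_insert_mb(129_024, 117.5):.2f} Mb")
    print(f"implied genome: {implied_genome_mb(129_024, 117.5, 15.0, 0.0111):.0f} Mb")
    print(f"BAC-end success rate: {percent(1205, 1490):.1f}%")
```

    With the quoted figures, the success rate reproduces the reported 80.9%, and the implied haploid genome size comes out at roughly 1,000 Mb under these assumptions.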

  18. De novo transcriptome assembly for a non-model species, the blood-sucking bug Triatoma brasiliensis, a vector of Chagas disease.

    PubMed

    Marchant, A; Mougel, F; Almeida, C; Jacquin-Joly, E; Costa, J; Harry, M

    2015-04-01

    High throughput sequencing (HTS) provides new research opportunities for work on non-model organisms, such as differential expression studies between populations exposed to different environmental conditions. However, such transcriptomic studies first require the production of a reference assembly. The choice of sampling procedure, sequencing strategy and assembly workflow is crucial. To develop a reliable reference transcriptome for Triatoma brasiliensis, the major Chagas disease vector in Northeastern Brazil, different de novo assembly protocols were generated using various datasets and software. Both 454 and Illumina sequencing technologies were applied on RNA extracted from antennae and mouthparts from single or pooled individuals. The 454 library yielded 278 Mb. Fifteen Illumina libraries were constructed and yielded nearly 360 million RNA-seq single reads and 46 million RNA-seq paired-end reads for nearly 45 Gb. For the 454 reads, we used three assemblers, Newbler, CAP3 and/or MIRA and for the Illumina reads, the Trinity assembler. Ten assembly workflows were compared using these programs separately or in combination. To compare the assemblies obtained, quantitative and qualitative criteria were used, including contig length, N50, contig number and the percentage of chimeric contigs. Completeness of the assemblies was estimated using the CEGMA pipeline. The best assembly (57,657 contigs, completeness of 80 %, <1 % chimeric contigs) was a hybrid assembly, leading us to recommend (1) using a single individual with large representation of biological tissues, (2) merging both long reads and short paired-end Illumina reads, and (3) using several assemblers in order to combine the specific advantages of each.
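
    N50 is one of the quantitative criteria listed in this abstract. As a reminder of what it measures, the sketch below uses the common definition: the contig length at which the cumulative length of contigs, taken from longest to shortest, first reaches half of the total assembly length. This is a generic illustration, not the authors' pipeline, and the contig lengths in the example are made up.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total reaches at least half of the
    summed length of all contigs.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Hypothetical contig lengths (total 54; 10 + 9 + 8 = 27 reaches half).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```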

  19. Probe-Directed Degradation (PDD) for Flexible Removal of Unwanted cDNA Sequences from RNA-Seq Libraries.

    PubMed

    Archer, Stuart K; Shirokikh, Nikolay E; Preiss, Thomas

    2015-04-01

    Most applications for RNA-seq require the depletion of abundant transcripts to gain greater coverage of the underlying transcriptome. The sequences to be targeted for depletion depend on application and species and in many cases may not be supported by commercial depletion kits. This unit describes a method for generating RNA-seq libraries that incorporates probe-directed degradation (PDD), which can deplete any unwanted sequence set, with the low-bias split-adapter method of library generation (although many other library generation methods are in principle compatible). The overall strategy is suitable for applications requiring customized sequence depletion or where faithful representation of fragment ends and lack of sequence bias is paramount. We provide guidelines to rapidly design specific probes against the target sequence, and a detailed protocol for library generation using the split-adapter method including several strategies for streamlining the technique and reducing adapter dimer content. Copyright © 2015 John Wiley & Sons, Inc.

  20. Automated design of degenerate codon libraries.

    PubMed

    Mena, Marco A; Daugherty, Patrick S

    2005-12-01

    Degenerate codon libraries are frequently used in protein engineering and evolution studies but are often limited to targeting a small number of positions to adequately limit the search space. To mitigate this, codon degeneracy can be limited using heuristics or previous knowledge of the targeted positions. To automate design of libraries given a set of amino acid sequences, an algorithm (LibDesign) was developed that generates a set of possible degenerate codon libraries, their resulting size, and their score relative to a user-defined scoring function. A gene library of a specified size can then be constructed that is representative of the given amino acid distribution or that includes specific sequences or combinations thereof. LibDesign provides a new tool for automated design of high-quality protein libraries that more effectively harness existing sequence-structure information derived from multiple sequence alignment or computational protein design data.
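
    To make the notion of a degenerate codon library concrete, the sketch below expands IUPAC degenerate codons into the codons they cover, lists the residues they encode under the standard genetic code, and computes the DNA-level library size as the product of per-position codon counts. It is not LibDesign and does not implement its scoring function; all function names here are illustrative.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def expand_codon(degenerate: str) -> list[str]:
    """List every concrete codon covered by a 3-letter degenerate codon."""
    if len(degenerate) != 3:
        raise ValueError("a codon has exactly three positions")
    choices = (IUPAC[base] for base in degenerate.upper())
    return ["".join(bases) for bases in product(*choices)]


def encoded_residues(degenerate: str) -> set[str]:
    """Amino acids (and '*' for stop) encoded by a degenerate codon."""
    return {CODON_TABLE[codon] for codon in expand_codon(degenerate)}


def library_size(codons: list[str]) -> int:
    """Number of distinct DNA sequences in a library of degenerate codons."""
    size = 1
    for codon in codons:
        size *= len(expand_codon(codon))
    return size


if __name__ == "__main__":
    print(len(expand_codon("NNK")))         # 32 codons
    print(sorted(encoded_residues("NNK")))  # 20 amino acids plus one stop
    print(library_size(["NNK"] * 3))        # 32768 DNA variants
```

    The example shows why restricting degeneracy matters: three fully randomized NNK positions already give 32,768 DNA variants, so the search space grows quickly with each added position.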

  1. Enterprise Reference Library

    NASA Technical Reports Server (NTRS)

    Bickham, Grandin; Saile, Lynn; Havelka, Jacque; Fitts, Mary

    2011-01-01

    Introduction: Johnson Space Center (JSC) offers two extensive libraries that contain journals, research literature and electronic resources. Searching capabilities are available to those individuals residing onsite or through a librarian's search. Many individuals have rich collections of references, but no mechanisms to share reference libraries across researchers, projects, or directorates exist. Likewise, information regarding which references are provided to which individuals is not available, resulting in duplicate requests, redundant labor costs and associated copying fees. In addition, this tends to limit collaboration between colleagues and promotes the establishment of individual, unshared silos of information. The Integrated Medical Model (IMM) team has utilized a centralized reference management tool during the development, test, and operational phases of this project. The Enterprise Reference Library project expands the capabilities developed for IMM to address the above issues and enhance collaboration across JSC. Method: After significant market analysis for a multi-user reference management tool, no available commercial tool was found to meet this need, so a software program was built around a commercial tool, Reference Manager 12 by The Thomson Corporation. A use case approach guided the requirements development phase. The premise of the design is that individuals use their own reference management software and export to SharePoint when their library is incorporated into the Enterprise Reference Library. This results in a searchable user-specific library application. An accompanying share folder will warehouse the electronic full-text articles, which allows the global user community to access full-text articles. Discussion: An enterprise reference library solution can provide a multidisciplinary collection of full text articles. This approach improves efficiency in obtaining and storing reference material while greatly reducing labor, purchasing and duplication costs. Most importantly, increasing collaboration across research groups provides unprecedented access to information relevant to NASA's mission. Conclusion: This project is an expansion and cost-effective leveraging of the existing JSC centralized library. Keyword and author search capabilities and an alert function for notifications about new articles, based on users' profiles, are examples of future enhancements.

  2. Filling reference gaps via assembling DNA barcodes using high-throughput sequencing-moving toward barcoding the world.

    PubMed

    Liu, Shanlin; Yang, Chentao; Zhou, Chengran; Zhou, Xin

    2017-12-01

    Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)-based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn't show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single-molecule sequencing platform PacBio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. © The Authors 2017. Published by Oxford University Press.

  3. Sequencing, Annotation and Analysis of the Syrian Hamster (Mesocricetus auratus) Transcriptome

    PubMed Central

    Tchitchek, Nicolas; Safronetz, David; Rasmussen, Angela L.; Martens, Craig; Virtaneva, Kimmo; Porcella, Stephen F.; Feldmann, Heinz

    2014-01-01

    Background: The Syrian hamster (golden hamster, Mesocricetus auratus) is gaining importance as a new experimental animal model for multiple pathogens, including emerging zoonotic diseases such as Ebola. Nevertheless there are currently no publicly available transcriptome reference sequences or genome for this species. Results: A cDNA library derived from mRNA and snRNA isolated and pooled from the brains, lungs, spleens, kidneys, livers, and hearts of three adult female Syrian hamsters was sequenced. Sequence reads were assembled into 62,482 contigs and 111,796 reads remained unassembled (singletons). This combined contig/singleton dataset, designated as the Syrian hamster transcriptome, represents a total of 60,117,204 nucleotides. Our Mesocricetus auratus Syrian hamster transcriptome mapped to 11,648 mouse transcripts representing 9,562 distinct genes, and mapped to a similar number of transcripts and genes in the rat. We identified 214 quasi-complete transcripts based on mouse annotations. Canonical pathways involved in a broad spectrum of fundamental biological processes were significantly represented in the library. The Syrian hamster transcriptome was aligned to the current release of the Chinese hamster ovary (CHO) cell transcriptome and genome to improve the genomic annotation of this species. Finally, our Syrian hamster transcriptome was aligned against 14 other rodents, primate and laurasiatheria species to gain insights about the genetic relatedness and placement of this species. Conclusions: This Syrian hamster transcriptome dataset significantly improves our knowledge of the Syrian hamster's transcriptome, especially towards its future use in infectious disease research. Moreover, this library is an important resource for the wider scientific community to help improve genome annotation of the Syrian hamster and other closely related species. Furthermore, these data provide the basis for development of expression microarrays that can be used in functional genomics studies. PMID:25398096

  4. A High-Quality Reference Genome for the Invasive Mosquitofish Gambusia affinis Using a Chicago Library.

    PubMed

    Hoffberg, Sandra L; Troendle, Nicholas J; Glenn, Travis C; Mahmud, Ousman; Louha, Swarnali; Chalopin, Domitille; Bennetzen, Jeffrey L; Mauricio, Rodney

    2018-04-27

    The western mosquitofish, Gambusia affinis, is a freshwater poecilid fish native to the southeastern United States but with a global distribution due to widespread human introduction. Gambusia affinis has been used as a model species for a broad range of evolutionary and ecological studies. We sequenced the genome of a male G. affinis to facilitate genetic studies in diverse fields including invasion biology and comparative genetics. We generated Illumina short read data from paired-end libraries and in vitro proximity-ligation libraries. We obtained 54.9× coverage, N50 contig length of 17.6 kb, and N50 scaffold length of 6.65 Mb. Compared to two other species in the Poeciliidae family, G. affinis has slightly fewer genes that have shorter total, exon, and intron length on average. Using a set of universal single-copy orthologs in fish genomes, we found 95.5% of these genes were complete in the G. affinis assembly. The number of transposable elements in the G. affinis assembly is similar to those of closely related species. The high-quality genome sequence and annotations we report will be valuable resources for scientists to map the genetic architecture of traits of interest in this species. Copyright © 2018, G3: Genes, Genomes, Genetics.
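
    Illustrative sketch: the abstract reports N50 contig and scaffold lengths. N50 is conventionally the length L such that sequences of length L or longer contain at least half of the total assembled bases. The Python below computes it for a list of lengths; the function name `n50` and the toy lengths are made up for illustration and are not taken from the G. affinis assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical lengths, total 32
    print(n50(toy_contigs))
```

    Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why the abstract also reports ortholog completeness.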

  5. Using populations of human and microbial genomes for organism detection in metagenomes

    DOE PAGES

    Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel; ...

    2015-04-29

    Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.

  6. Using populations of human and microbial genomes for organism detection in metagenomes.

    PubMed

    Ames, Sasha K; Gardner, Shea N; Marti, Jose Manuel; Slezak, Tom R; Gokhale, Maya B; Allen, Jonathan E

    2015-07-01

    Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected. © 2015 Ames et al.; Published by Cold Spring Harbor Laboratory Press.

  7. Using populations of human and microbial genomes for organism detection in metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel

    Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.

  8. Colony-PCR Is a Rapid Method for DNA Amplification of Hyphomycetes

    PubMed Central

    Walch, Georg; Knapp, Maria; Rainer, Georg; Peintner, Ursula

    2016-01-01

    Fungal pure cultures identified with both classical morphological methods and through barcoding sequences are a basic requirement for reliable reference sequences in public databases. Improved techniques for an accelerated DNA barcode reference library construction will result in considerably improved sequence databases covering a wider taxonomic range. Fast, cheap, and reliable methods for obtaining DNA sequences from fungal isolates are, therefore, a valuable tool for the scientific community. Direct colony PCR was already successfully established for yeasts, but has not been evaluated for a wide range of anamorphic soil fungi up to now, and a direct amplification protocol for hyphomycetes without tissue pre-treatment has not been published so far. Here, we present a colony PCR technique directly from fungal hyphae without previous DNA extraction or other prior manipulation. Seven hundred eighty-eight fungal strains from 48 genera were tested with a success rate of 86%. PCR success varied considerably: DNA of fungi belonging to the genera Cladosporium, Geomyces, Fusarium, and Mortierella could be amplified with high success. DNA of soil-borne yeasts was always successfully amplified. Absidia, Mucor, Trichoderma, and Penicillium isolates had noticeably lower PCR success. PMID:29376929

  9. VIP: Vortex Image Processing Package for High-contrast Direct Imaging

    NASA Astrophysics Data System (ADS)

    Gomez Gonzalez, Carlos Alberto; Wertz, Olivier; Absil, Olivier; Christiaens, Valentin; Defrère, Denis; Mawet, Dimitri; Milli, Julien; Absil, Pierre-Antoine; Van Droogenbroeck, Marc; Cantalloube, Faustine; Hinz, Philip M.; Skemer, Andrew J.; Karlsson, Mikael; Surdej, Jean

    2017-07-01

    We present the Vortex Image Processing (VIP) library, a Python package dedicated to astronomical high-contrast imaging. Our package relies on the extensive Python stack of scientific libraries and aims to provide a flexible framework for high-contrast data and image processing. In this paper, we describe the capabilities of VIP related to processing image sequences acquired using the angular differential imaging (ADI) observing technique. VIP implements functionalities for building high-contrast data processing pipelines, encompassing pre- and post-processing algorithms, potential source position and flux estimation, and sensitivity curve generation. Among the reference point-spread function subtraction techniques for ADI post-processing, VIP includes several flavors of principal component analysis (PCA) based algorithms, such as annular PCA and incremental PCA algorithms capable of processing big datacubes (of several gigabytes) on a computer with limited memory. Also, we present a novel ADI algorithm based on non-negative matrix factorization, which comes from the same family of low-rank matrix approximations as PCA and provides fairly similar results. We showcase the ADI capabilities of the VIP library using a deep sequence on HR 8799 taken with the LBTI/LMIRCam and its recently commissioned L-band vortex coronagraph. Using VIP, we investigated the presence of additional companions around HR 8799 and did not find any significant additional point source beyond the four known planets. VIP is available at http://github.com/vortex-exoplanet/VIP and is accompanied by Jupyter notebook tutorials illustrating the main functionalities of the library.

  10. Off the Shelf: Trends in the Purchase and Use of Electronic Reference Books

    ERIC Educational Resources Information Center

    Korah, Abe; Cassidy, Erin; Elmore, Eric; Jerabek, Ann

    2009-01-01

    What is the future direction of reference books? What types of policies are libraries implementing regarding the purchase of electronic reference books? Are libraries still buying hard copy reference items when an electronic equivalent is available? This paper discusses a national survey of libraries regarding the purchase and use of electronic…

  11. Analysis of the Questions Asked through Digital and Face-to-Face Reference Services

    ERIC Educational Resources Information Center

    Tsuji, Keita; Arai, Shunsuke; Suga, Reina; Ikeuchi, Atsushi; Yoshikane, Fuyuki

    2013-01-01

    In Japan, only a few public libraries provide e-mail reference services. To help public libraries start e-mail reference services, the authors investigated reference questions received by libraries via e-mail and traditional face-to-face services. The authors found that research questions are more frequently observed among e-mail questions and…

  12. User Preferences in Reference Services: Virtual Reference and Academic Libraries

    ERIC Educational Resources Information Center

    Cummings, Joel; Cummings, Lara; Frederiksen, Linda

    2007-01-01

    This study examines the use of chat in an academic library's user population and where virtual reference services might fit within the spectrum of public services offered by academic libraries. Using questionnaires, this research demonstrates that many within the academic community are open to the idea of chat-based reference or using chat for…

  13. Browsing the Virtual Stacks: Making Electronic Reference Tools More Visible to Users

    ERIC Educational Resources Information Center

    Peterson, Elizabeth

    2008-01-01

    Electronic reference resources are expanding traditional print reference collections far beyond the walls of the library building. While the library literature has seen a debate rage about the various merits and pitfalls of electronic reference sources, no one disputes they are here to stay. As more of the library content becomes available through…

  14. Combined Use of 16S Ribosomal DNA and 16S rRNA To Study the Bacterial Community of Polychlorinated Biphenyl-Polluted Soil

    PubMed Central

    Nogales, Balbina; Moore, Edward R. B.; Llobet-Brossa, Enrique; Rossello-Mora, Ramon; Amann, Rudolf; Timmis, Kenneth N.

    2001-01-01

    The bacterial diversity assessed from clone libraries prepared from rRNA (two libraries) and ribosomal DNA (rDNA) (one library) from polychlorinated biphenyl (PCB)-polluted soil has been analyzed. A good correspondence of the community composition found in the two types of library was observed. Nearly 29% of the cloned sequences in the rDNA library were identical to sequences in the rRNA libraries. More than 60% of the total cloned sequence types analyzed were grouped in phylogenetic groups (a clone group with sequence similarity higher than 97% [98% for Burkholderia and Pseudomonas-type clones]) represented in both types of libraries. Some of those phylogenetic groups, mostly represented by a single (or pair) of cloned sequence type(s), were observed in only one of the types of library. An important difference between the libraries was the lack of clones representative of the Actinobacteria in the rDNA library. The PCB-polluted soil exhibited a high bacterial diversity which included representatives of two novel lineages. The apparent abundance of bacteria affiliated to the beta-subclass of the Proteobacteria, and to the genus Burkholderia in particular, was confirmed by fluorescence in situ hybridization analysis. The possible influence on apparent diversity of low template concentrations was assessed by dilution of the RNA template prior to amplification by reverse transcription-PCR. Although differences in the composition of the two rRNA libraries obtained from high and low RNA concentrations were observed, the main components of the bacterial community were represented in both libraries, and therefore their detection was not compromised by the lower concentrations of template used in this study. PMID:11282645

  15. Expressed sequence tags from heat-shocked seagrass Zostera noltii (Hornemann) from its southern distribution range.

    PubMed

    Massa, Sónia I; Pearson, Gareth A; Aires, Tânia; Kube, Michael; Olsen, Jeanine L; Reinhardt, Richard; Serrão, Ester A; Arnaud-Haond, Sophie

    2011-09-01

    Predicted global climate change threatens the distributional ranges of species worldwide. We identified genes expressed in the intertidal seagrass Zostera noltii during recovery from a simulated low tide heat-shock exposure. Five Expressed Sequence Tag (EST) libraries were compared, corresponding to four recovery times following sub-lethal temperature stress, and a non-stressed control. We sequenced and analyzed 7009 sequence reads from 30 min, 2 h, 4 h and 24 h after the beginning of the heat-shock (AHS), and 1585 from the control library, for a total of 8594 sequence reads. Among 51 Tentative UniGenes (TUGs) exhibiting significantly different expression between libraries, 19 (37.3%) were identified as 'molecular chaperones' and were over-expressed following heat-shock, while 12 (23.5%) were 'photosynthesis TUGs' generally under-expressed in heat-shocked plants. A time course analysis of expression showed a rapid increase in expression of the molecular chaperone class, most of which were heat-shock proteins, which increased from 2 sequence reads in the control library to almost 230 in the 30 min AHS library, followed by a slow decrease during further recovery. In contrast, 'photosynthesis TUGs' were under-expressed 30 min AHS compared with the control library, and declined progressively with recovery time in the stress libraries, with a total of 29 sequence reads 24 h AHS, compared with 125 in the control. A total of 4734 TUGs were screened for EST-Single Sequence Repeats (EST-SSRs) and 86 microsatellites were identified. Copyright © 2011 Elsevier B.V. All rights reserved.

  16. The campaign to DNA barcode all fishes, FISH-BOL.

    PubMed

    Ward, R D; Hanner, R; Hebert, P D N

    2009-02-01

    FISH-BOL, the Fish Barcode of Life campaign, is an international research collaboration that is assembling a standardized reference DNA sequence library for all fishes. Analysis is targeting a 648 base pair region of the mitochondrial cytochrome c oxidase I (COI) gene. More than 5000 species have already been DNA barcoded, with an average of five specimens per species, typically vouchers with authoritative identifications. The barcode sequence from any fish, fillet, fin, egg or larva can be matched against these reference sequences using BOLD, the Barcode of Life Data System (http://www.barcodinglife.org). The benefits of barcoding fishes include facilitating species identification, highlighting cases of range expansion for known species, flagging previously overlooked species and enabling identifications where traditional methods cannot be applied. Results thus far indicate that barcodes separate c. 98 and 93% of already described marine and freshwater fish species, respectively. Several specimens with divergent barcode sequences have been confirmed by integrative taxonomic analysis as new species. Past concerns in relation to the use of fish barcoding for species discrimination are discussed. These include hybridization, recent radiations, regional differentiation in barcode sequences and nuclear copies of the barcode region. However, current results indicate these issues are of little concern for the great majority of specimens.
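
    Illustrative sketch (not BOLD's identification engine): the abstract describes matching a query barcode against reference sequences. One simple way to picture this is nearest-reference assignment by uncorrected p-distance on pre-aligned, equal-length sequences, as in the Python below. The function names, the toy 10-base sequences and the 2% distance cutoff are all assumptions made for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def best_match(query, references, max_distance=0.02):
    """Return (species, distance) for the closest reference.

    species is None when even the closest reference is further away than
    max_distance (a hypothetical cutoff, not a value from the abstract).
    """
    species, distance = min(
        ((name, p_distance(query, seq)) for name, seq in references.items()),
        key=lambda item: item[1],
    )
    if distance <= max_distance:
        return species, distance
    return None, distance


if __name__ == "__main__":
    # Toy reference library; real COI barcodes are about 648 bp long.
    library = {"species_A": "ACGTACGTAC", "species_B": "ACGTTTTTAC"}
    print(best_match("ACGTACGTAC", library))
    print(best_match("ACGAACGTAC", library))
```

    A query that falls outside the cutoff for every reference is the situation the abstract describes as a divergent barcode, which may flag an overlooked species or simply a gap in the library.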

  17. Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes.

    PubMed

    Oyola, Samuel O; Otto, Thomas D; Gu, Yong; Maslen, Gareth; Manske, Magnus; Campino, Susana; Turner, Daniel J; Macinnis, Bronwyn; Kwiatkowski, Dominic P; Swerdlow, Harold P; Quail, Michael A

    2012-01-03

    Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences. We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions, which involve a PCR additive (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC neutral templates. We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extreme of base composition. This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material.

  18. The Papillomavirus Episteme: a major update to the papillomavirus sequence database.

    PubMed

    Van Doorslaer, Koenraad; Li, Zhiwen; Xirasagar, Sandhya; Maes, Piet; Kaminsky, David; Liou, David; Sun, Qiang; Kaur, Ramandeep; Huyen, Yentram; McBride, Alison A

    2017-01-04

    The Papillomavirus Episteme (PaVE) is a database of curated papillomavirus genomic sequences, accompanied by web-based sequence analysis tools. This update describes the addition of major new features. The papillomavirus genomes within PaVE have been further annotated, and now include the major spliced mRNA transcripts. Viral genes and transcripts can be visualized on both linear and circular genome browsers. Evolutionary relationships among PaVE reference protein sequences can be analysed using multiple sequence alignments and phylogenetic trees. To assist in viral discovery, PaVE offers a typing tool, a simplified algorithm to determine whether a newly sequenced virus is novel. PaVE also now contains an image library containing gross clinical and histopathological images of papillomavirus infected lesions. Database URL: https://pave.niaid.nih.gov/. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  19. Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

    PubMed Central

    Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

    2018-01-01

    Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139

  20. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    PubMed

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
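
    Illustrative sketch: the abstract contrasts coordinate-only duplicate calling with calling duplicates as "identically mapped reads with identical UMIs". The Python below shows that difference on a handful of made-up reads. The `Read` record, its field names and the `deduplicate` function are hypothetical and do not reproduce the authors' software; UMI sequencing errors are ignored.

```python
from collections import namedtuple

# Minimal stand-in for an aligned read; field names are illustrative only.
Read = namedtuple("Read", "name chrom pos strand umi")


def deduplicate(reads, use_umi=True):
    """Keep the first read seen for each duplicate group.

    With use_umi=False, reads sharing mapping coordinates are collapsed.
    With use_umi=True, reads must also share the UMI to be collapsed, so
    independent molecules that map to the same place are retained.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if use_umi:
            key += (read.umi,)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


if __name__ == "__main__":
    reads = [
        Read("r1", "chr1", 100, "+", "AAAC"),
        Read("r2", "chr1", 100, "+", "AAAC"),  # same position, same UMI: PCR duplicate
        Read("r3", "chr1", 100, "+", "GGTT"),  # same position, new UMI: distinct molecule
        Read("r4", "chr1", 250, "-", "AAAC"),  # different position
    ]
    print([r.name for r in deduplicate(reads, use_umi=False)])
    print([r.name for r in deduplicate(reads, use_umi=True)])
```

    In this toy case coordinate-only calling discards r3 even though it came from a separate molecule, which is the kind of misidentification the abstract says can bias quantification.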

  1. Digital transcriptome profiling using selective hexamer priming for cDNA synthesis.

    PubMed

    Armour, Christopher D; Castle, John C; Chen, Ronghua; Babak, Tomas; Loerch, Patrick; Jackson, Stuart; Shah, Jyoti K; Dey, John; Rohl, Carol A; Johnson, Jason M; Raymond, Christopher K

    2009-09-01

    We developed a procedure for the preparation of whole transcriptome cDNA libraries depleted of ribosomal RNA from only 1 microg of total RNA. The method relies on a collection of short, computationally selected oligonucleotides, called 'not-so-random' (NSR) primers, to obtain full-length, strand-specific representation of nonribosomal RNA transcripts. In this study we validated the technique by profiling human whole brain and universal human reference RNA using ultra-high-throughput sequencing.
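
    Illustrative sketch: the abstract says the NSR primers are "computationally selected" but does not state the selection rule. The Python below assumes one plausible rule, namely enumerating all hexamers and discarding any that is perfectly complementary to a stretch of an unwanted (for example ribosomal) transcript. The function names and the toy transcript are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def select_primers(unwanted_transcripts, k=6):
    """Return all k-mers that cannot anneal perfectly to the unwanted transcripts.

    A primer anneals to a transcript when it is the reverse complement of a
    k-base window of that transcript, so those reverse complements are the
    k-mers that get excluded (assumed rule, see lead-in).
    """
    blocked = set()
    for transcript in unwanted_transcripts:
        dna = transcript.upper().replace("U", "T")
        for start in range(len(dna) - k + 1):
            blocked.add(reverse_complement(dna[start:start + k]))
    every_kmer = ("".join(bases) for bases in product("ACGT", repeat=k))
    return [kmer for kmer in every_kmer if kmer not in blocked]


if __name__ == "__main__":
    toy_rrna = ["ACGUACGUAA"]  # made-up 10-base fragment, not a real rRNA
    primers = select_primers(toy_rrna)
    print(len(primers), "of", 4 ** 6, "hexamers retained")
```

    With real rRNA sequences a much larger share of the 4096 hexamers would be excluded, and a practical design would likely need further filters that are outside the scope of this sketch.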

  2. Establishment of a matrix-assisted laser desorption ionization time-of-flight mass spectrometry database for rapid identification of infectious achlorophyllous green micro-algae of the genus Prototheca.

    PubMed

    Murugaiyan, J; Ahrholdt, J; Kowbel, V; Roesler, U

    2012-05-01

    The possibility of using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) for rapid identification of pathogenic and non-pathogenic species of the genus Prototheca has been recently demonstrated. A unique reference database of MALDI-TOF MS profiles for type and reference strains of the six generally accepted Prototheca species was established. The database quality was reinforced after the acquisition of 27 spectra for selected Prototheca strains, with three biological and technical replicates for each of 18 type and reference strains of Prototheca and four strains of Chlorella. This provides reproducible and unique spectra covering a wide m/z range (2000-20 000 Da) for each of the strains used in the present study. The reproducibility of the spectra was further confirmed by employing composite correlation index calculation and main spectra library (MSP) dendrogram creation, available with MALDI Biotyper software. The MSP dendrograms obtained were comparable with the 18S rDNA sequence-based dendrograms. These reference spectra were successfully added to the Bruker database, and the efficiency of identification was evaluated by cross-reference-based and unknown Prototheca identification. It is proposed that the addition of further strains would reinforce the reference spectra library for rapid identification of Prototheca strains to the genus and species/genotype level. © 2011 The Authors. Clinical Microbiology and Infection © 2011 European Society of Clinical Microbiology and Infectious Diseases.

  3. Linking adults and immatures of South African marine fishes.

    PubMed

    Steinke, Dirk; Connell, Allan D; Hebert, Paul D N

    2016-11-01

    The early life-history stages of fishes are poorly known, impeding acquisition of the identifications needed to monitor larval recruitment and year-class strength. A comprehensive database of COI sequences, linked to authoritatively identified voucher specimens, promises to change this situation, representing a significant advance for fisheries science. Barcode records were obtained from 2526 early larvae and pelagic eggs of fishes collected on the inshore shelf within 5 km of the KwaZulu-Natal coast, about 50 km south of Durban, South Africa. Barcodes were also obtained from 3215 adults, representing 946 South African fish species. Using the COI reference library on BOLD based on adults, 89% of the immature fishes could be identified to a species level; they represented 450 species. Most of the uncertain sequences could be assigned to a genus, family, or order; only 92 specimens (4%) were unassigned. Accumulation curves based on inference of phylogenetic diversity indicate near-completeness of the collecting effort. The entire set of adult and larval fishes included 1006 species, representing 43% of all fish species known from South African waters. However, this total included 189 species not previously recorded from this region. The fact that almost 90% of the immatures gained a species identification demonstrates the power and completeness of the DNA barcode reference library for fishes generated during the 10 years of FishBOL.

  4. Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout

    PubMed Central

    Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo

    2015-01-01

    Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877

  5. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

    USDA-ARS's Scientific Manuscript database

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  6. Newborn Screening: MedlinePlus Health Topic

    MedlinePlus

    ... deficiency (National Library of Medicine) Genetics Home Reference: glutaric acidemia type I (National Library of Medicine) Genetics Home Reference: glutaric acidemia type II (National Library of Medicine) Genetics ...

  7. Small RNA Library Preparation Method for Next-Generation Sequencing Using Chemical Modifications to Prevent Adapter Dimer Formation.

    PubMed

    Shore, Sabrina; Henderson, Jordana M; Lebedev, Alexandre; Salcedo, Michelle P; Zon, Gerald; McCaffrey, Anton P; Paul, Natasha; Hogrefe, Richard I

    2016-01-01

    For most sample types, the automation of RNA and DNA sample preparation workflows enables high throughput next-generation sequencing (NGS) library preparation. Greater adoption of small RNA (sRNA) sequencing has been hindered by high sample input requirements and inherent ligation side products formed during library preparation. These side products, known as adapter dimer, are very similar in size to the tagged library. Most sRNA library preparation strategies thus employ a gel purification step to isolate tagged library from adapter dimer contaminants. At very low sample inputs, adapter dimer side products dominate the reaction and limit the sensitivity of this technique. Here we address the need for improved specificity of sRNA library preparation workflows with a novel library preparation approach that uses modified adapters to suppress adapter dimer formation. This workflow allows for lower sample inputs and elimination of the gel purification step, which in turn allows for an automatable sRNA library preparation protocol.

  8. When species matches are unavailable are DNA barcodes correctly assigned to higher taxa? An assessment using sphingid moths

    PubMed Central

    2011-01-01

    Background When a specimen belongs to a species not yet represented in DNA barcode reference libraries there is disagreement over the effectiveness of using sequence comparisons to assign the query accurately to a higher taxon. Library completeness and the assignment criteria used have been proposed as critical factors affecting the accuracy of such assignments but have not been thoroughly investigated. We explored the accuracy of assignments to genus, tribe and subfamily in the Sphingidae, using the almost complete global DNA barcode reference library (1095 species) available for this family. Costa Rican sphingids (118 species), a well-documented, diverse subset of the family, with each of the tribes and subfamilies represented were used as queries. We simulated libraries with different levels of completeness (10-100% of the available species), and recorded assignments (positive or ambiguous) and their accuracy (true or false) under six criteria. Results A liberal tree-based criterion assigned 83% of queries accurately to genus, 74% to tribe and 90% to subfamily, compared to a strict tree-based criterion, which assigned 75% of queries accurately to genus, 66% to tribe and 84% to subfamily, with a library containing 100% of available species (but excluding the species of the query). The greater number of true positives delivered by more relaxed criteria was negatively balanced by the occurrence of more false positives. This effect was most sharply observed with libraries of the lowest completeness where, for example at the genus level, 32% of assignments were false positives with the liberal criterion versus < 1% when using the strict. We observed little difference (< 8% using the liberal criterion) however, in the overall accuracy of the assignments between the lowest and highest levels of library completeness at the tribe and subfamily level. 
Conclusions Our results suggest that when using a strict tree-based criterion for higher taxon assignment with DNA barcodes, the likelihood of assigning a query a genus name incorrectly is very low, if a genus name is provided it has a high likelihood of being accurate, and if no genus match is available the query can nevertheless be assigned to a subfamily with high accuracy regardless of library completeness. DNA barcoding often correctly assigned sphingid moths to higher taxa when species matches were unavailable, suggesting that barcode reference libraries can be useful for higher taxon assignments long before they achieve complete species coverage. PMID:21806794

  9. Virtual Reference Transcript Analysis: A Few Models.

    ERIC Educational Resources Information Center

    Smyth, Joanne

    2003-01-01

    Describes the introduction of virtual, or digital, reference service at the University of New Brunswick libraries. Highlights include analyzing transcripts from LIVE (Library Information in a Virtual Environment); reference question types; ACRL (Association of College and Research Libraries) information literacy competency standards; and the Big 6…

  10. Reference Desk Is Not Dead Yet: A Perspective from the National Medical Library of Cuba

    ERIC Educational Resources Information Center

    Arroyo, Sonia Santana

    2015-01-01

    There persists an intense debate on whether or not the traditional reference desk should be in academic libraries. Yet, despite many anti-desk studies, the place of the reference desk still remains. This paper aims to review the current significance of the reference desk for some libraries, as well as the importance of choosing the proper…

  11. A maize map standard with sequenced core markers, grass genome reference points and 932 expressed sequence tagged sites (ESTs) in a 1736-locus map.

    PubMed Central

    Davis, G L; McMullen, M D; Baysdorfer, C; Musket, T; Grant, D; Staebell, M; Xu, G; Polacco, M; Koster, L; Melia-Hancock, S; Houchins, K; Chao, S; Coe, E H

    1999-01-01

    We have constructed a 1736-locus maize genome map containing 1156 loci probed by cDNAs, 545 probed by random genomic clones, 16 by simple sequence repeats (SSRs), 14 by isozymes, and 5 by anonymous clones. Sequence information is available for 56% of the loci with 66% of the sequenced loci assigned functions. A total of 596 new ESTs were mapped from a B73 library of 5-wk-old shoots. The map contains 237 loci probed by barley, oat, wheat, rice, or tripsacum clones, which serve as grass genome reference points in comparisons between maize and other grass maps. Ninety core markers selected for low copy number, high polymorphism, and even spacing along the chromosome delineate the 100 bins on the map. The average bin size is 17 cM. Use of bin assignments enables comparison among different maize mapping populations and experiments including those involving cytogenetic stocks, mutants, or quantitative trait loci. Integration of nonmaize markers in the map extends the resources available for gene discovery beyond the boundaries of maize mapping information into the expanse of map, sequence, and phenotype information from other grass species. This map provides a foundation for numerous basic and applied investigations including studies of gene organization, gene and genome evolution, targeted cloning, and dissection of complex traits. PMID:10388831

  12. A DNA Barcode Library for North American Ephemeroptera: Progress and Prospects

    PubMed Central

    Webb, Jeffrey M.; Jacobus, Luke M.; Funk, David H.; Zhou, Xin; Kondratieff, Boris; Geraci, Christy J.; DeWalt, R. Edward; Baird, Donald J.; Richard, Barton; Phillips, Iain; Hebert, Paul D. N.

    2012-01-01

    DNA barcoding of aquatic macroinvertebrates holds much promise as a tool for taxonomic research and for providing the reliable identifications needed for water quality assessment programs. A prerequisite for identification using barcodes is a reliable reference library. We gathered 4165 sequences from the barcode region of the mitochondrial cytochrome c oxidase subunit I gene representing 264 nominal and 90 provisional species of mayflies (Insecta: Ephemeroptera) from Canada, Mexico, and the United States. No species shared barcode sequences and all can be identified with barcodes with the possible exception of some Caenis. Minimum interspecific distances ranged from 0.3–24.7% (mean: 12.5%), while the average intraspecific divergence was 1.97%. The latter value was inflated by the presence of very high divergences in some taxa. In fact, nearly 20% of the species included two or three haplotype clusters showing greater than 5.0% sequence divergence and some values are as high as 26.7%. Many of the species with high divergences are polyphyletic and likely represent species complexes. Indeed, many of these polyphyletic species have numerous synonyms and individuals in some barcode clusters show morphological attributes characteristic of the synonymized species. In light of our findings, it is imperative that type or topotype specimens be sequenced to correctly associate barcode clusters with morphological species concepts and to determine the status of currently synonymized species. PMID:22666447
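
    The two summary statistics quoted above (average intraspecific divergence and minimum interspecific distance) can be sketched as follows. This is only an illustration under stated assumptions: it uses uncorrected p-distance on pre-aligned, equal-length sequences, whereas the abstract does not say which distance model was applied, and the species labels and sequences are invented. The names `p_distance` and `summarize` are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def summarize(records):
    """records: list of (species_label, aligned_sequence) pairs.

    Returns (mean intraspecific distance, minimum interspecific distance);
    either value is None when no pair of that kind exists.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    mean_intra = sum(intra) / len(intra) if intra else None
    min_inter = min(inter) if inter else None
    return mean_intra, min_inter


# Toy 10-bp "barcodes" (invented for illustration only).
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACGAAT"),
    ("sp_B", "ACGAACGAAT"),
]
mean_intra, min_inter = summarize(records)
```

    A species whose intraspecific distances approach or exceed the minimum interspecific distance would be a candidate for the kind of species-complex follow-up the abstract recommends.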

  13. Reference and Bibliographic Instruction: A Survey of Philosophy Statements in LIBRAS Libraries.

    ERIC Educational Resources Information Center

    Keith, Ellen; Kohut, Dave

    1998-01-01

    A survey of statements of philosophy on reference and bibliographic instruction from libraries in small, Chicago area liberal arts colleges and universities revealed a wide variety of relationships between the two services. The study concluded that most of the libraries surveyed compartmentalized reference and bibliographic instruction. (PEN)

  14. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    PubMed

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  15. Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

    PubMed Central

    2011-01-01

    Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. 
In addition the library has a large number of transcription factors and will be interesting for discovery and validation of drought or abiotic stress related genes in common bean. PMID:22118559

  16. Molecular architecture of classical cytological landmarks: Centromeres and telomeres

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Meyne, J.

    1994-11-01

    Both the human telomere repeat and the pericentromeric repeat sequence (GGAAT)n were isolated based on evolutionary conservation. Their isolation was based on the premise that chromosomal features as structurally and functionally important as telomeres and centromeres should be highly conserved. Both sequences were isolated by high stringency screening of a human repetitive DNA library with rodent repetitive DNA. The pHuR library (plasmid Human Repeat) used for this project was enriched for repetitive DNA by using a modification of the standard DNA library preparation method. Usually DNA for a library is cut with restriction enzymes, packaged, infected, and the library is screened. A problem with this approach is that many tandem repeats don't have any (or many) common restriction sites. Therefore, many of the repeat sequences will not be represented in the library because they are not restricted to a viable length for the vector used. To prepare the pHuR library, human DNA was mechanically sheared to a small size. These relatively short DNA fragments were denatured and then renatured to Cot 50. Theoretically only repetitive DNA sequences should renature under Cot 50 conditions. The single-stranded regions were digested using S1 nuclease, leaving the double-stranded, renatured repeat sequences.

  17. The Total Library System (TLS). Volume 2. The Files.

    DTIC Science & Technology

    1983-01-01

    file name. USER private By user sequence number. PROFILE private By user sequence number. TITL2 public By short form of the Library of Congress number...SHELF public By short form of the Library of Congress number. CATCOST private By requisition number. CATCHG private By Library of Congress number...including edition year and copy number--as on label of book, etc. CONTIN public By short form of the Library of Congress number. LIST public One record

  18. Genome-Wide Profiling of RNA–Protein Interactions Using CLIP-Seq

    PubMed Central

    Stork, Cheryl; Zheng, Sika

    2017-01-01

    UV crosslinking immunoprecipitation (CLIP) is an increasingly popular technique to study protein–RNA interactions in tissues and cells. Whole cells or tissues are ultraviolet irradiated to generate a covalent bond between RNA and proteins that are in close contact. After partial RNase digestion, antibodies specific to an RNA binding protein (RBP) or a protein–epitope tag are then used to immunoprecipitate the protein–RNA complexes. After stringent washing and gel separation the RBP–RNA complex is excised. The RBP is protease digested to allow purification of the bound RNA. Reverse transcription of the RNA followed by high-throughput sequencing of the cDNA library is now often used to identify protein bound RNA on a genome-wide scale. UV irradiation can result in cDNA truncations and/or mutations at the crosslink sites, which complicates the alignment of the sequencing library to the reference genome and the identification of the crosslinking sites. Meanwhile, one or more amino acids of a crosslinked RBP can remain attached to its bound RNA due to incomplete digestion of the protein. As a result, reverse transcriptase may not read through the crosslink sites, and produce cDNA ending at the crosslinked nucleotide. This is harnessed by one variant of CLIP methods to identify crosslinking sites at a nucleotide resolution. This method, individual nucleotide resolution CLIP (iCLIP), circularizes cDNA to capture the truncated cDNA and also increases the efficiency of ligating sequencing adapters to the library. Here, we describe the detailed procedure of iCLIP. PMID:26965263

  19. Learning to Provide 3D Virtual Reference: A Library Science Assignment

    ERIC Educational Resources Information Center

    Johnson, Megan; Purpur, Geraldine; Abbott, Lisa T.

    2009-01-01

    In spring semester 2009, two of the authors taught LIB 5020--Information Sources & Services to graduate library science students at Appalachian State University. The course covers information seeking patterns and provides an overview of reference services. The course is also designed to examine and evaluate library reference materials and…

  20. Virtual Reference Services--Down-Under: A Cautionary Tale.

    ERIC Educational Resources Information Center

    Wagner, Gulten S.

    Digital reference services at university libraries in Australia and New Zealand are a recent phenomenon dating back to the late 1990s--following the developments in Web-based online library services. This paper examines the move towards the provision of e-mail reference services based on the study of 16 randomly chosen university libraries in…

  1. Library and Information Networks: Centralization and Decentralization.

    ERIC Educational Resources Information Center

    Segal, JoAnn S.

    1988-01-01

    Describes the development of centralized library networks and the current factors that make library sharing on a smaller scale feasible. The discussion covers the need to decide the level at which library cooperation should occur and the possibility of linking via the Open System Interface Reference Model. (37 references) (CLB)

  2. Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

    PubMed Central

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

    2013-01-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
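
    The idea of pooling and deconvolution described above can be sketched in a heavily simplified form. The scheme below is an assumption made for illustration and is not the pooling design or deconvolution algorithm used by the authors: each BAC is placed in a distinct combination of pools (its signature), and a read is assigned to the BAC whose signature equals the set of pools in which that read's sequence was observed. The names `make_signatures` and `deconvolve` are hypothetical.

```python
from itertools import combinations


def make_signatures(bacs, n_pools, pools_per_bac):
    """Assign each BAC a distinct set of pool indices (its signature)."""
    combos = combinations(range(n_pools), pools_per_bac)
    signatures = {}
    for bac in bacs:
        try:
            signatures[bac] = frozenset(next(combos))
        except StopIteration:
            raise ValueError("not enough pools for distinct signatures") from None
    return signatures


def deconvolve(observed_pools, signatures):
    """Return the BAC whose signature equals the observed pool set, else None."""
    by_signature = {sig: bac for bac, sig in signatures.items()}
    return by_signature.get(frozenset(observed_pools))


# Five BACs share four pools, two pools per BAC (toy sizes, invented).
signatures = make_signatures(["BAC1", "BAC2", "BAC3", "BAC4", "BAC5"],
                             n_pools=4, pools_per_bac=2)

hit = deconvolve({1, 2}, signatures)    # pool set matches one signature
miss = deconvolve({2, 3}, signatures)   # no BAC was given this pool set
```

    With real data the observed pool set is noisy (sequencing errors, repeats shared between clones), so exact signature matching as shown here would need to be replaced by an error-tolerant assignment.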

  3. Combinatorial pooling enables selective sequencing of the barley gene space.

    PubMed

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J

    2013-04-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  4. Filling reference gaps via assembling DNA barcodes using high-throughput sequencing—moving toward barcoding the world

    PubMed Central

    Zhou, Chengran

    2017-01-01

    Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)–based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn’t show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single-molecule sequencing platform PacBio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. PMID:29077841

  5. Utility of NIST Whole-Genome Reference Materials for the Technical Validation of a Multigene Next-Generation Sequencing Test.

    PubMed

    Shum, Bennett O V; Henner, Ilya; Belluoccio, Daniele; Hinchcliffe, Marcus J

    2017-07-01

    The sensitivity and specificity of next-generation sequencing laboratory developed tests (LDTs) are typically determined by an analyte-specific approach. Analyte-specific validations use disease-specific controls to assess an LDT's ability to detect known pathogenic variants. Alternatively, a methods-based approach can be used for LDT technical validations. Methods-focused validations do not use disease-specific controls but use benchmark reference DNA that contains known variants (benign, variants of unknown significance, and pathogenic) to assess variant calling accuracy of a next-generation sequencing workflow. Recently, four whole-genome reference materials (RMs) from the National Institute of Standards and Technology (NIST) were released to standardize methods-based validations of next-generation sequencing panels across laboratories. We provide a practical method for using NIST RMs to validate multigene panels. We analyzed the utility of RMs in validating a novel newborn screening test that targets 70 genes, called NEO1. Despite the NIST RM variant truth set originating from multiple sequencing platforms, replicates, and library types, we discovered a 5.2% false-negative variant detection rate in the RM truth set genes that were assessed in our validation. We developed a strategy using complementary non-RM controls to demonstrate 99.6% sensitivity of the NEO1 test in detecting variants. Our findings have implications for laboratories or proficiency testing organizations using whole-genome NIST RMs for testing. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
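
    The sensitivity and false-negative figures quoted in the abstract above follow the usual definitions (sensitivity = TP / (TP + FN); false-negative rate = 1 - sensitivity). A minimal sketch of that arithmetic is given below. The function names and the variant counts are hypothetical, chosen only for illustration; they are not the counts from the study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truth-set variants that the test actually detected."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("truth set is empty")
    return true_positives / total


def false_negative_rate(true_positives: int, false_negatives: int) -> float:
    """Fraction of truth-set variants that the test missed."""
    return 1.0 - sensitivity(true_positives, false_negatives)


if __name__ == "__main__":
    # Hypothetical counts (assumption): 1,000 truth-set variants, 52 missed.
    tp, fn = 948, 52
    print(f"sensitivity         = {sensitivity(tp, fn):.3f}")
    print(f"false-negative rate = {false_negative_rate(tp, fn):.3f}")
```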

  6. Draft genome and reference transcriptomic resources for the urticating pine defoliator Thaumetopoea pityocampa (Lepidoptera: Notodontidae).

    PubMed

    Gschloessl, B; Dorkeld, F; Berges, H; Beydon, G; Bouchez, O; Branco, M; Bretaudeau, A; Burban, C; Dubois, E; Gauthier, P; Lhuillier, E; Nichols, J; Nidelet, S; Rocha, S; Sauné, L; Streiff, R; Gautier, M; Kerdelhué, C

    2018-05-01

    The pine processionary moth Thaumetopoea pityocampa (Lepidoptera: Notodontidae) is the main pine defoliator in the Mediterranean region. Its urticating larvae cause severe human and animal health concerns in the invaded areas. This species shows a high phenotypic variability for various traits, such as phenology, fecundity and tolerance to extreme temperatures. This study presents the construction and analysis of extensive genomic and transcriptomic resources, which are an obligate prerequisite to understand their underlying genetic architecture. Using a well-studied population from Portugal with peculiar phenological characteristics, the karyotype was first determined and a first draft genome of 537 Mb total length was assembled into 68,292 scaffolds (N50 = 164 kb). From this genome assembly, 29,415 coding genes were predicted. To circumvent some limitations for fine-scale physical mapping of genomic regions of interest, a 3X coverage BAC library was also developed. In particular, 11 BACs from this library were individually sequenced to assess the assembly quality. Additionally, de novo transcriptomic resources were generated from various developmental stages sequenced with HiSeq and MiSeq Illumina technologies. The reads were de novo assembled into 62,376 and 63,175 transcripts, respectively. Then, a robust subset of the genome-predicted coding genes, the de novo transcriptome assemblies and previously published 454/Sanger data were clustered to obtain a high-quality and comprehensive reference transcriptome consisting of 29,701 bona fide unigenes. These sequences covered 99% of the cegma and 88% of the busco highly conserved eukaryotic genes and 84% of the busco arthropod gene set. Moreover, 90% of these transcripts could be localized on the draft genome. The described information is available via a genome annotation portal (http://bipaa.genouest.org/sp/thaumetopoea_pityocampa/). © 2018 John Wiley & Sons Ltd.
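
    The abstract above reports the assembly's contiguity as a scaffold N50. As a reminder of how that statistic is conventionally defined, here is a minimal sketch: N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name and the toy lengths are illustrative assumptions, not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold that brings the running sum to >= 50% of the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


if __name__ == "__main__":
    # Toy scaffold lengths (hypothetical).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```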

  7. The European sea bass Dicentrarchus labrax genome puzzle: comparative BAC-mapping and low coverage shotgun sequencing

    PubMed Central

    2010-01-01

    Background Food supply from the ocean is constrained by the shortage of domesticated and selected fish. Development of genomic models of economically important fishes should assist with the removal of this bottleneck. European sea bass Dicentrarchus labrax L. (Moronidae, Perciformes, Teleostei) is one of the most important fishes in European marine aquaculture; growing genomic resources put it on its way to serve as an economic model. Results End sequencing of a sea bass genomic BAC-library enabled the comparative mapping of the sea bass genome using the three-spined stickleback Gasterosteus aculeatus genome as a reference. BAC-end sequences (102,690) were aligned to the stickleback genome. The number of mappable BACs was improved using a two-fold coverage WGS dataset of sea bass resulting in a comparative BAC-map covering 87% of stickleback chromosomes with 588 BAC-contigs. The minimum size of 83 contigs covering 50% of the reference was 1.2 Mbp; the largest BAC-contig comprised 8.86 Mbp. More than 22,000 BAC-clones aligned with both ends to the reference genome. Intra-chromosomal rearrangements between sea bass and stickleback were identified. Size distributions of mapped BACs were used to calculate that the genome of sea bass may be only 1.3 fold larger than the 460 Mbp stickleback genome. Conclusions The BAC map is used for sequencing single BACs or BAC-pools covering defined genomic entities by second generation sequencing technologies. Together with the WGS dataset it initiates a sea bass genome sequencing project. This will allow the quantification of polymorphisms through resequencing, which is important for selecting highly performing domesticated fish. PMID:20105308

  8. Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

    PubMed

    Caruccio, Nicholas

    2011-01-01

    DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.

  9. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.

  10. Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

    PubMed

    Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

    2009-06-01

    The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.

  11. DNA Barcoding Identifies Argentine Fishes from Marine and Brackish Waters

    PubMed Central

    Mabragaña, Ezequiel; Díaz de Astarloa, Juan Martín; Hanner, Robert; Zhang, Junbin; González Castro, Mariano

    2011-01-01

    Background DNA barcoding has been advanced as a promising tool to aid species identification and discovery through the use of short, standardized gene targets. Despite extensive taxonomic studies, for a variety of reasons the identification of fishes can be problematic, even for experts. DNA barcoding is proving to be a useful tool in this context. However, its broad application is impeded by the need to construct a comprehensive reference sequence library for all fish species. Here, we make a regional contribution to this grand challenge by calibrating the species discrimination efficiency of barcoding among 125 Argentine fish species, representing nearly one third of the known fauna, and examine the utility of these data to address several key taxonomic uncertainties pertaining to species in this region. Methodology/Principal Findings Specimens were collected and morphologically identified during cruises conducted between 2005 and 2008. The standard BARCODE fragment of COI was amplified and bi-directionally sequenced from 577 specimens (mean of 5 specimens/species), and all specimens and sequence data were archived and interrogated using analytical tools available on the Barcode of Life Data System (BOLD; www.barcodinglife.org). Nearly all species exhibited discrete clusters of closely related haplogroups which permitted the discrimination of 95% of the species (i.e. 119/125) examined while cases of shared haplotypes were detected among just three species-pairs. Notably, barcoding aided the identification of a new species of skate, Dipturus argentinensis, permitted the recognition of Genypterus brasiliensis as a valid species and questions the generic assignment of Paralichthys isosceles. Conclusions/Significance This study constitutes a significant contribution to the global barcode reference sequence library for fishes and demonstrates the utility of barcoding for regional species identification. As an independent assessment of alpha taxonomy, barcodes provide robust support for most morphologically based taxon concepts and also highlight key areas of taxonomic uncertainty worthy of reappraisal. PMID:22174860

  12. Efficient engineering of chromosomal ribosome binding site libraries in mismatch repair proficient Escherichia coli.

    PubMed

    Oesterle, Sabine; Gerngross, Daniel; Schmitt, Steven; Roberts, Tania Michelle; Panke, Sven

    2017-09-26

    Multiplexed gene expression optimization via modulation of gene translation efficiency through ribosome binding site (RBS) engineering is a valuable approach for optimizing artificial properties in bacteria, ranging from genetic circuits to production pathways. Established algorithms design smart RBS-libraries based on a single partially-degenerate sequence that efficiently samples the entire space of translation initiation rates. However, the sequence space that is accessible when integrating the library by CRISPR/Cas9-based genome editing is severely restricted by DNA mismatch repair (MMR) systems. MMR efficiency depends on the type and length of the mismatch and thus effectively removes potential library members from the pool. Rather than working in MMR-deficient strains, which accumulate off-target mutations, or depending on temporary MMR inactivation, which requires additional steps, we eliminate this limitation by developing a pre-selection rule of genome-library-optimized-sequences (GLOS) that enables introducing large functional diversity into MMR-proficient strains with sequences that are no longer subject to MMR-processing. We implement several GLOS-libraries in Escherichia coli and show that GLOS-libraries indeed retain diversity during genome editing and that such libraries can be used in complex genome editing operations such as concomitant deletions. We argue that this approach allows for stable and efficient fine tuning of chromosomal functions with minimal effort.

  13. Expert Systems for Libraries at SCIL [Small Computers in Libraries]'88.

    ERIC Educational Resources Information Center

    Kochtanek, Thomas R.; And Others

    1988-01-01

    Six brief papers on expert systems for libraries cover (1) a knowledge-based approach to database design; (2) getting started in expert systems; (3) using public domain software to develop a business reference system; (4) a music cataloging inquiry system; (5) linguistic analysis of reference transactions; and (6) a model of a reference librarian.…

  14. Improving the Virtual Reference Experience: How Closely Do Academic Libraries Adhere to RUSA Guidelines?

    ERIC Educational Resources Information Center

    Platt, Jessica; Benson, Pete

    2010-01-01

    The purpose of this study was to measure the degree to which academic libraries or library staff members throughout the United States adhere to the Guidelines for Virtual Reference Services provided by the Reference & User Services Association (RUSA). The results of the study were analyzed to identify specific areas where improvement is needed…

  15. Resource Delivery and Teaching in Live Chat Reference: Comparing Two Libraries

    ERIC Educational Resources Information Center

    Dempsey, Paula R.

    2017-01-01

    This study investigates how reference staff at two libraries balance teaching with resource delivery in live chat reference. Analysis of 410 transcripts from one week shows that one library tends to deliver more resources from a wider range of database suggestions, to take more time in chat interactions, and to incorporate more teaching behavior…

  16. Changes in Library Technology and Reference Desk Statistics: Is There a Relationship?

    ERIC Educational Resources Information Center

    Thomsett-Scott, Beth; Reese, Patricia E.

    2006-01-01

    The incorporation of technology into library processes has tremendously impacted staff and users alike. The University of North Texas (UNT) Libraries is no exception. Sixteen years of reference statistics are analyzed to examine the relationships between the implementation of CD-ROMs and web-based resources and the number of reference questions.…

  17. Content Knowledge of the Internet Among Public Reference Librarians at the Cleveland Heights Public Library.

    ERIC Educational Resources Information Center

    Augustine, Matthew J.

    The Internet is becoming an increasingly important medium of electronic information dissemination, and thus an increasingly important library reference tool. This study examines the Internet skills of a sample group of 15 public reference librarians in the Adult Services Department at the Cleveland Heights Public Library. The Internet skills that…

  18. A DNA barcode library for Germany's mayflies, stoneflies and caddisflies (Ephemeroptera, Plecoptera and Trichoptera).

    PubMed

    Morinière, Jérôme; Hendrich, Lars; Balke, Michael; Beermann, Arne J; König, Tobias; Hess, Monika; Koch, Stefan; Müller, Reinhard; Leese, Florian; Hebert, Paul D N; Hausmann, Axel; Schubart, Christoph D; Haszprunar, Gerhard

    2017-11-01

    Mayflies, stoneflies and caddisflies (Ephemeroptera, Plecoptera and Trichoptera) are prominent representatives of aquatic macroinvertebrates, commonly used as indicator organisms for water quality and ecosystem assessments. However, unambiguous morphological identification of EPT species, especially their immature life stages, is a challenging, yet fundamental task. A comprehensive DNA barcode library based upon taxonomically well-curated specimens is needed to overcome the problematic identification. Once available, this library will support the implementation of fast, cost-efficient and reliable DNA-based identifications and assessments of ecological status. This study represents a major step towards a DNA barcode reference library as it covers two-thirds of Germany's EPT species including 2,613 individuals belonging to 363 identified species. As such, it provides coverage for 38 of 44 families (86%) and practically all major bioindicator species. DNA barcode compliant sequences (≥500 bp) were recovered from 98.74% of the analysed specimens. Whereas most species (325, i.e., 89.53%) were unambiguously assigned to a single Barcode Index Number (BIN) by its COI sequence, 38 species (18 Ephemeroptera, nine Plecoptera and 11 Trichoptera) were assigned to a total of 89 BINs. Most of these additional BINs formed nearest neighbour clusters, reflecting the discrimination of geographical subclades of a currently recognized species. BIN sharing was uncommon, involving only two species pairs of Ephemeroptera. Interestingly, both maximum pairwise and nearest neighbour distances were substantially higher for Ephemeroptera compared to Plecoptera and Trichoptera, possibly indicating older speciation events, stronger positive selection or faster rate of molecular evolution. © 2017 John Wiley & Sons Ltd.

  19. Bilingual Virtual Reference: It's Better than Searching the Open Web

    ERIC Educational Resources Information Center

    Lupien, Pascal

    2004-01-01

    Online library services have mostly been available only in English. To serve the diverse communities that use library services, a growing number of libraries have been investigating the ins and outs of offering virtual reference in languages other than English. Developing this kind of service in languages spoken by various library user groups…

  20. Benchmarking Reference Desk Service in Academic Health Science Libraries: A Preliminary Survey.

    ERIC Educational Resources Information Center

    Robbins, Kathryn; Daniels, Kathleen

    2001-01-01

    This preliminary study was designed to benchmark patron perceptions of reference desk services at academic health science libraries, using a standard questionnaire. Responses were compared to determine the library that provided the highest-quality service overall and along five service dimensions. All libraries were rated very favorably, but none…

  1. Electronic Resource Expenditure and the Decline in Reference Transaction Statistics in Academic Libraries

    ERIC Educational Resources Information Center

    Dubnjakovic, Ana

    2012-01-01

    The current study investigates factors influencing increase in reference transactions in a typical week in academic libraries across the United States of America. Employing multiple regression analysis and general linear modeling, variables of interest from the "Academic Library Survey (ALS) 2006" survey (sample size 3960 academic libraries) were…

  2. [Current applications of high-throughput DNA sequencing technology in antibody drug research].

    PubMed

    Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

    2012-03-01

    Since the 2005 publication of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we review current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.

  3. K-shuff: A Novel Algorithm for Characterizing Structural and Compositional Diversity in Gene Libraries.

    PubMed

    Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A; Rathbun, Stephen L; Whitman, William B

    2016-01-01

    K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing.

  4. Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species

    PubMed Central

    Wu, Chengcang; Proestou, Dina; Carter, Dorothy; Nicholson, Erica; Santos, Filippe; Zhao, Shaying; Zhang, Hong-Bin; Goldsmith, Marian R

    2009-01-01

    Background Manduca sexta, Heliothis virescens, and Heliconius erato represent three widely-used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but not available to date. Results We report the construction and characterization of six large-insert BAC libraries for the three species and sampling sequence analysis of the genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, and each has an average clone insert size ranging from 152–175 kb. We estimated that the genome coverage of each library ranged from 6–9 ×, with the two combined libraries of each species being equivalent to 13.0–16.3 × haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6–8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements with a majority belonging to the category of low complexity repeats, and are more abundant in retro-elements than DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. The BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes is widely varied, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas both genomes of H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences. Conclusion The high-quality and large-insert BAC libraries of the insects, together with the identified BACs containing genes of interest, provide valuable information, resources and tools for comprehensive understanding and studies of the insect genomes and for addressing many fundamental questions in Lepidoptera. The sample of the genomic sequences provides the first insight into the constitution and evolution of the insect genomes. PMID:19558662
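
    Genome-coverage estimates of the kind quoted in the abstract above are commonly obtained as (number of clones × mean insert size) / haploid genome size, and the chance that a given locus is represented in a library is often approximated with the Clarke–Carbon formula, P = 1 - (1 - I/G)^N. The sketch below illustrates both calculations. The abstract does not state which method the authors used, and the clone count, insert size and genome size here are hypothetical values, not figures from the study.

```python
def library_coverage(n_clones: int, mean_insert_bp: float, genome_bp: float) -> float:
    """Genome equivalents held in a clone library."""
    return n_clones * mean_insert_bp / genome_bp


def locus_probability(n_clones: int, mean_insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon estimate of the probability that a given locus is in the library."""
    return 1.0 - (1.0 - mean_insert_bp / genome_bp) ** n_clones


if __name__ == "__main__":
    # Hypothetical library (assumption): 20,000 clones, 150 kb inserts, 500 Mb genome.
    clones, insert, genome = 20_000, 150_000, 500_000_000
    print(f"coverage    = {library_coverage(clones, insert, genome):.1f}x")
    print(f"P(locus in) = {locus_probability(clones, insert, genome):.4f}")
```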

  5. CD-ROM as a Library Equivalent.

    ERIC Educational Resources Information Center

    Reid, Harold T.

    1992-01-01

    Provides a brief overview of CD-ROM products that emulate a library's print collection. Highlights include reference works, including "Microsoft Bookshelf" and "Toolworks Reference Library"; the "New Grolier Electronic Encyclopedia"; periodicals and news services, including "Magazine Rack" and "Front…

  6. Gene expression profiles responses to aphid feeding in chrysanthemum (Chrysanthemum morifolium).

    PubMed

    Xia, Xiaolong; Shao, Yafeng; Jiang, Jiafu; Ren, Liping; Chen, Fadi; Fang, Weimin; Guan, Zhiyong; Chen, Sumei

    2014-12-02

    Chrysanthemum is an important ornamental plant all over the world. It is easily attacked by the aphid Macrosiphoniella sanbourni. The molecular mechanisms of plant defense responses to aphids are only partially understood. Here, we investigate the gene expression changes in response to aphid feeding in chrysanthemum leaf by RNA-Seq technology. Three libraries were generated from pooled leaf tissues of Chrysanthemum morifolium 'nannongxunzhang' that were collected at different time points with (Y) or without (CK) aphid infestations and mock puncture treatment (Z), and sequenced using an Illumina HiSeq™ 2000 platform. A total of 7,363,292, 7,215,860 and 7,319,841 clean reads were obtained in library CK, Y and Z, respectively. The proportion of clean reads was >97.29% in each library. Approximately 76.35% of the clean reads were mapped to a reference gene database including all known chrysanthemum unigene sequences. 1,157, 527 and 340 differentially expressed genes (DEGs) were identified in the comparison of CK-VS-Y, CK-VS-Z and Z-VS-Y, respectively. These DEGs were involved in phytohormone signaling, cell wall biosynthesis, photosynthesis, the reactive oxygen species (ROS) pathway and transcription factor regulatory networks, among others. Changes in gene expression induced by aphid feeding are shown to be multifaceted. There are various forms of crosstalk among the pathways to which those genes belong, which would allow plants to fine-tune their defense responses.

  7. RapTOR: Automated Sequencing Library Preparation and Suppression for Rapid Pathogen Characterization (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Lane, Todd

    2018-05-18

    Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  8. RapTOR: Automated Sequencing Library Preparation and Suppression for Rapid Pathogen Characterization (7th Annual SFAF Meeting, 2012)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lane, Todd

    2012-06-01

    Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  9. BAC-End Sequence-Based SNP Mining in Allotetraploid Cotton (Gossypium) Utilizing Resequencing Data, Phylogenetic Inferences, and Perspectives for Genetic Mapping

    PubMed Central

    Hulse-Kemp, Amanda M.; Ashrafi, Hamid; Stoffel, Kevin; Zheng, Xiuting; Saski, Christopher A.; Scheffler, Brian E.; Fang, David D.; Chen, Z. Jeffrey; Van Deynze, Allen; Stelly, David M.

    2015-01-01

    A bacterial artificial chromosome library and BAC-end sequences for cultivated cotton (Gossypium hirsutum L.) have recently been developed. This report presents genome-wide single nucleotide polymorphism (SNP) mining utilizing resequencing data with BAC-end sequences as a reference by alignment of 12 G. hirsutum L. lines, one G. barbadense L. line, and one G. longicalyx Hutch and Lee line. A total of 132,262 intraspecific SNPs have been developed for G. hirsutum, whereas 223,138 and 470,631 interspecific SNPs have been developed for G. barbadense and G. longicalyx, respectively. Using a set of interspecific SNPs, 11 randomly selected and 77 SNPs that are putatively associated with the homeologous chromosome pair 12 and 26, we mapped 77 SNPs into two linkage groups representing these chromosomes, spanning a total of 236.2 cM in an interspecific F2 population (G. barbadense 3-79 × G. hirsutum TM-1). The mapping results validated the approach for reliably producing large numbers of both intraspecific and interspecific SNPs aligned to BAC-ends. This will allow for future construction of high-density integrated physical and genetic maps for cotton and other complex polyploid genomes. The methods developed will allow for future Gossypium resequencing data to be automatically genotyped for identified SNPs along the BAC-end sequence reference for anchoring sequence assemblies and comparative studies. PMID:25858960

  10. Open resource metagenomics: a model for sharing metagenomic libraries.

    PubMed

    Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C

    2011-11-30

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM(2)BL [1]). The CM(2)BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project.

  11. Open resource metagenomics: a model for sharing metagenomic libraries

    PubMed Central

    Neufeld, J.D.; Engel, K.; Cheng, J.; Moreno-Hagelsieb, G.; Rose, D.R.; Charles, T.C.

    2011-01-01

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project. PMID:22180823

  12. Model-based reconstruction of synthetic promoter library in Corynebacterium glutamicum.

    PubMed

    Zhang, Shuanghong; Liu, Dingyu; Mao, Zhitao; Mao, Yufeng; Ma, Hongwu; Chen, Tao; Zhao, Xueming; Wang, Zhiwen

    2018-05-01

    To develop an efficient synthetic promoter library for fine-tuned expression of target genes in Corynebacterium glutamicum. A synthetic promoter library for C. glutamicum was developed based on conserved sequences of the -10 and -35 regions. The synthetic promoter library covered a wide range of strengths, ranging from 1 to 193% of the tac promoter. 68 promoters were selected and sequenced for correlation analysis between promoter sequence and strength with a statistical model. A new promoter library was further reconstructed with improved promoter strength and coverage based on the results of correlation analysis. Tandem promoter P70 was finally constructed with increased strength by 121% over the tac promoter. The promoter library developed in this study showed a great potential for applications in metabolic engineering and synthetic biology for the optimization of metabolic networks. To the best of our knowledge, this is the first reconstruction of synthetic promoter library based on statistical analysis of C. glutamicum.

  13. In Silico Identification of Protein Disulfide Isomerase Gene Families in the De Novo Assembled Transcriptomes of Four Different Species of the Genus Conus.

    PubMed

    Figueroa-Montiel, Andrea; Ramos, Marco A; Mares, Rosa E; Dueñas, Salvador; Pimienta, Genaro; Ortiz, Ernesto; Possani, Lourival D; Licea-Navarro, Alexei F

    2016-01-01

    Small peptides isolated from the venom of the marine snails belonging to the genus Conus have been largely studied because of their therapeutic value. These peptides can be classified in two groups. The largest one is composed by peptides rich in disulfide bonds, and referred to as conotoxins. Despite the importance of conotoxins given their pharmacology value, little is known about the protein disulfide isomerase (PDI) enzymes that are required to catalyze their correct folding. To discover the PDIs that may participate in the folding and structural maturation of conotoxins, the transcriptomes of the venom duct of four different species of Conus from the peninsula of Baja California (Mexico) were assembled. Complementary DNA (cDNA) libraries were constructed for each species and sequenced using a Genome Analyzer Illumina platform. The raw RNA-seq data was converted into transcript sequences using Trinity, a de novo assembler that allows the grouping of reads into contigs without a reference genome. An N50 value of 605 was established as a reference for future assemblies of Conus transcriptomes using this software. Transdecoder was used to extract likely coding sequences from Trinity transcripts, and PDI-specific sequence motif "APWCGHCK" was used to capture potential PDIs. An in silico analysis was performed to characterize the group of PDI protein sequences encoded by the duct-transcriptome of each species. The computational approach entailed a structural homology characterization, based on the presence of functional Thioredoxin-like domains. Four different PDI families were characterized, which are constituted by a total of 41 different gene sequences. The sequences had an average of 65% identity with other PDIs. Using MODELLER 9.14, the homology-based three-dimensional structure prediction of a subset of the sequences reported, showed the expected thioredoxin fold which was confirmed by a "simulated annealing" method.
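    The two computational steps named in this abstract, summarizing an assembly by its N50 and capturing candidate PDIs by the "APWCGHCK" motif, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function names n50 and find_motif and the toy inputs are hypothetical, and an exact substring match is assumed for the motif search.

```python
def n50(lengths):
    """Return the N50: the contig length at which the running total of
    contigs, sorted longest first, reaches half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def find_motif(proteins, motif="APWCGHCK"):
    """Map each protein name to the 0-based position of the first exact
    occurrence of the motif; proteins without the motif are omitted."""
    return {name: seq.find(motif) for name, seq in proteins.items() if motif in seq}


# Toy inputs (hypothetical, not data from the study)
contig_lengths = [100, 200, 300, 400, 500]
candidates = {"seq_a": "MKAPWCGHCKLL", "seq_b": "MKKKLLV"}
print(n50(contig_lengths))
print(find_motif(candidates))
```

    A real screen would read the predicted coding sequences from the assembler output and might tolerate mismatches in the motif; neither is attempted here.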

  14. Opening the treasure chest: A DNA-barcoding primer set for most higher taxa of Central European birds and mammals from museum collections

    PubMed Central

    Schäffer, Sylvia; Zachos, Frank E.

    2017-01-01

    DNA-barcoding is a rapidly developing method for efficiently identifying samples to species level by means of short standard DNA sequences. However, reliable species assignment requires the availability of a comprehensive DNA barcode reference library, and hence numerous initiatives aim at generating such barcode databases for particular taxa or geographic regions. Historical museum collections represent a potentially invaluable source for the DNA-barcoding of many taxa. This is particularly true for birds and mammals, for which collecting fresh (voucher) material is often very difficult to (nearly) impossible due to the special animal welfare and conservation regulations that apply to vertebrates in general, and birds and mammals in particular. Moreover, even great efforts might not guarantee sufficiently complete sampling of fresh material in a short period of time. DNA extracted from historical samples is usually degraded, such that only short fragments can be amplified, rendering the recovery of the barcoding region as a single fragment impossible. Here, we present a new set of primers that allows the efficient amplification and sequencing of the entire barcoding region in most higher taxa of Central European birds and mammals in six overlapping fragments, thus greatly increasing the value of historical museum collections for generating DNA barcode reference libraries. Applying our new primer set in recently established NGS protocols promises to further increase the efficiency of barcoding old bird and mammal specimens. PMID:28358863

  15. Opening the treasure chest: A DNA-barcoding primer set for most higher taxa of Central European birds and mammals from museum collections.

    PubMed

    Schäffer, Sylvia; Zachos, Frank E; Koblmüller, Stephan

    2017-01-01

    DNA-barcoding is a rapidly developing method for efficiently identifying samples to species level by means of short standard DNA sequences. However, reliable species assignment requires the availability of a comprehensive DNA barcode reference library, and hence numerous initiatives aim at generating such barcode databases for particular taxa or geographic regions. Historical museum collections represent a potentially invaluable source for the DNA-barcoding of many taxa. This is particularly true for birds and mammals, for which collecting fresh (voucher) material is often very difficult to (nearly) impossible due to the special animal welfare and conservation regulations that apply to vertebrates in general, and birds and mammals in particular. Moreover, even great efforts might not guarantee sufficiently complete sampling of fresh material in a short period of time. DNA extracted from historical samples is usually degraded, such that only short fragments can be amplified, rendering the recovery of the barcoding region as a single fragment impossible. Here, we present a new set of primers that allows the efficient amplification and sequencing of the entire barcoding region in most higher taxa of Central European birds and mammals in six overlapping fragments, thus greatly increasing the value of historical museum collections for generating DNA barcode reference libraries. Applying our new primer set in recently established NGS protocols promises to further increase the efficiency of barcoding old bird and mammal specimens.

  16. A new single-nucleotide polymorphism database for rainbow trout generated through whole genome re-sequencing

    USDA-ARS's Scientific Manuscript database

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...

  17. Bacterial Artificial Chromosome Libraries for Mouse Sequencing and Functional Analysis

    PubMed Central

    Osoegawa, Kazutoyo; Tateno, Minako; Woon, Peng Yeong; Frengen, Eirik; Mammoser, Aaron G.; Catanese, Joseph J.; Hayashizaki, Yoshihide; de Jong, Pieter J.

    2000-01-01

    Bacterial artificial chromosome (BAC) and P1-derived artificial chromosome (PAC) libraries providing a combined 33-fold representation of the murine genome have been constructed using two different restriction enzymes for genomic digestion. A large-insert PAC library was prepared from the 129S6/SvEvTac strain in a bacterial/mammalian shuttle vector to facilitate functional gene studies. For genome mapping and sequencing, we prepared BAC libraries from the 129S6/SvEvTac and the C57BL/6J strains. The average insert sizes for the three libraries range between 130 kb and 200 kb. Based on the numbers of clones and the observed average insert sizes, we estimate each library to have slightly in excess of 10-fold genome representation. The average number of clones found after hybridization screening with 28 probes was in the range of 9–14 clones per marker. To explore the fidelity of the genomic representation in the three libraries, we analyzed three contigs, each established after screening with a single unique marker. New markers were established from the end sequences and screened against all the contig members to determine if any of the BACs and PACs are chimeric or rearranged. Only one chimeric clone and six potential deletions have been observed after extensive analysis of 113 PAC and BAC clones. Seventy-one of the 113 clones were conclusively nonchimeric because both end markers or sequences were mapped to the other confirmed contig members. We could not exclude chimerism for the remaining 41 clones because one or both of the insert termini did not contain unique sequence to design markers. The low rate of chimerism, ∼1%, and the low level of detected rearrangements support the anticipated usefulness of the BAC libraries for genome research. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AQ797173–AQ797398.] PMID:10645956
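    The coverage estimate described in this abstract (clone count times average insert size, divided by genome size) and the reported chimerism rate can be reproduced with simple arithmetic. The sketch below is a hedged illustration: the clone count, insert size, and genome size passed to fold_coverage are hypothetical placeholders, since the abstract does not give them; only the 1-in-113 chimeric count comes from the text.

```python
def fold_coverage(n_clones, mean_insert_bp, genome_size_bp):
    """Estimated genome representation of a clone library:
    total cloned bases divided by the genome size."""
    return n_clones * mean_insert_bp / genome_size_bp


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Hypothetical library: 200,000 clones, 150 kb mean insert, 3 Gb genome
print(fold_coverage(200_000, 150_000, 3_000_000_000))

# From the abstract: one chimeric clone among 113 analyzed
print(percent(1, 113))
```

    For a single-copy probe, the expected number of positive clones is roughly equal to the fold coverage, which is why the 9 to 14 clones per marker found by hybridization is in line with the estimate of slightly more than 10-fold representation.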

  18. Trinucleotide cassettes increase diversity of T7 phage-displayed peptide library.

    PubMed

    Krumpe, Lauren R H; Schumacher, Kathryn M; McMahon, James B; Makowski, Lee; Mori, Toshiyuki

    2007-10-05

    Amino acid sequence diversity is introduced into a phage-displayed peptide library by randomizing library oligonucleotide DNA. We recently evaluated the diversity of peptide libraries displayed on T7 lytic phage and M13 filamentous phage and showed that T7 phage can display a more diverse amino acid sequence repertoire due to differing processes of viral morphogenesis. In this study, we evaluated and compared the diversity of a 12-mer T7 phage-displayed peptide library randomized using codon-corrected trinucleotide cassettes with a T7 and an M13 12-mer phage-displayed peptide library constructed using the degenerate codon randomization method. We herein demonstrate that the combination of trinucleotide cassette amino acid codon randomization and T7 phage display construction methods resulted in a significant enhancement to the functional diversity of a 12-mer peptide library. This novel library exhibited superior amino acid uniformity and order-of-magnitude increases in amino acid sequence diversity as compared to degenerate codon randomized peptide libraries. Comparative analyses of the biophysical characteristics of the 12-mer peptide libraries revealed the trinucleotide cassette-randomized library to be a unique resource. The combination of T7 phage display and trinucleotide cassette randomization resulted in a novel resource for the potential isolation of binding peptides for new and previously studied molecular targets.
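    To make the uniformity argument concrete, the sketch below counts how often each amino acid is encoded under one common degenerate scheme. It assumes the scheme is NNK (N = any base, K = G or T); the abstract does not name the scheme, so this is an assumption for illustration. The names CODON_TABLE and nnk_distribution are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nnk_distribution():
    """Count codons per amino acid ('*' = stop) over all 32 NNK codons."""
    return Counter(
        CODON_TABLE[first + second + third]
        for first, second, third in product(BASES, BASES, "GT")
    )


counts = nnk_distribution()
for aa, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
    print(aa, n, f"{n / 32:.3f}")
```

    Under this assumption leucine, serine, and arginine are each encoded by three of the 32 codons, twelve amino acids by only one, and one codon is a stop. A library built from one trinucleotide cassette per amino acid would instead give each of the 20 residues an equal share and no stop codons, which is the kind of uniformity gain the abstract reports.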

  19. A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome.

    PubMed

    Allali, Imane; Arnold, Jason W; Roach, Jeffrey; Cadenas, Maria Belen; Butz, Natasha; Hassan, Hosni M; Koci, Matthew; Ballou, Anne; Mendoza, Mary; Ali, Rizwana; Azcarate-Peril, M Andrea

    2017-09-13

    Advancements in Next Generation Sequencing (NGS) technologies regarding throughput, read length and accuracy had a major impact on microbiome research by significantly improving 16S rRNA amplicon sequencing. As rapid improvements in sequencing platforms and new data analysis pipelines are introduced, it is essential to evaluate their capabilities in specific applications. The aim of this study was to assess whether the same project-specific biological conclusions regarding microbiome composition could be reached using different sequencing platforms and bioinformatics pipelines. Chicken cecum microbiome was analyzed by 16S rRNA amplicon sequencing using Illumina MiSeq, Ion Torrent PGM, and Roche 454 GS FLX Titanium platforms, with standard and modified protocols for library preparation. We labeled the bioinformatics pipelines included in our analysis QIIME1 and QIIME2 (de novo OTU picking [not to be confused with QIIME version 2 commonly referred to as QIIME2]), QIIME3 and QIIME4 (open reference OTU picking), UPARSE1 and UPARSE2 (each pair differs only in the use of chimera depletion methods), and DADA2 (for Illumina data only). GS FLX+ yielded the longest reads and highest quality scores, while MiSeq generated the largest number of reads after quality filtering. Declines in quality scores were observed starting at bases 150-199 for GS FLX+ and bases 90-99 for MiSeq. Scores were stable for PGM-generated data. Overall microbiome compositional profiles were comparable between platforms; however, average relative abundance of specific taxa varied depending on sequencing platform, library preparation method, and bioinformatics analysis. Specifically, QIIME with de novo OTU picking yielded the highest number of unique species and alpha diversity was reduced with UPARSE and DADA2 compared to QIIME. 
The three platforms compared in this study were capable of discriminating samples by treatment, despite differences in diversity and abundance, leading to similar biological conclusions. Our results demonstrate that while there were differences in depth of coverage and phylogenetic diversity, all workflows revealed comparable treatment effects on microbial diversity. To increase reproducibility and reliability and to retain consistency between similar studies, it is important to consider the impact on data quality and relative abundance of taxa when selecting NGS platforms and analysis tools for microbiome studies.
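    The two quantities compared across platforms here, relative abundance of taxa and alpha diversity, can be computed from a table of per-taxon read counts. The sketch below assumes the Shannon index as the alpha-diversity measure; the abstract does not say which index was used, and the function names and toy counts are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert per-taxon read counts to fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    return -sum(p * math.log(p) for p in relative_abundance(counts).values() if p > 0)


# Toy sample (hypothetical read counts, not data from the study)
sample = {"Lactobacillus": 50, "Bacteroides": 30, "Clostridium": 20}
print(relative_abundance(sample))
print(round(shannon_index(sample), 3))
```

    Because both values depend on which taxa a pipeline calls and how many reads survive filtering, the same sample can score differently across platforms and pipelines, as the abstract notes.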

  20. The Role of Virtual Reference in Library Web Site Design: A Qualitative Source for Usage Data

    ERIC Educational Resources Information Center

    Powers, Amanda Clay; Shedd, Julie; Hill, Clay

    2011-01-01

    Gathering qualitative information about usage behavior of library Web sites is a time-consuming process requiring the active participation of patron communities. Libraries that collect virtual reference transcripts, however, hold valuable data regarding how the library Web site is used that could benefit Web designers. An analysis of virtual…

  1. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

    PubMed Central

    Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier

    2008-01-01

    Background: "Open" transcriptome analysis methods allow gene expression to be studied without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152

  2. Multiplexed microsatellite recovery using massively parallel sequencing

    USGS Publications Warehouse

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
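    Screening short reads for di- or trinucleotide microsatellites, as described in this abstract, can be approximated with a regular expression. This is a minimal sketch and not the authors' method: the function name find_microsatellites and the minimum of five tandem repeats are assumptions, because the abstract does not state its detection criteria.

```python
import re


def find_microsatellites(read, min_repeats=5):
    """Return (start, unit, repeat_count) for each run of a 2- or 3-base
    unit repeated in tandem at least `min_repeats` times."""
    # Group 1 captures the unit; \1{n,} demands n further tandem copies.
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(read):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # homopolymer run such as AAAA, not a di/tri repeat
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy reads (hypothetical sequences)
print(find_microsatellites("GGACACACACACTT"))
print(find_microsatellites("TTCAGCAGCAGCAGCAGTT"))
```

    In a multiplexed run, reads would first be assigned to their library by barcode and then screened this way; flanking sequence on both sides of a hit is still needed before primers can be designed.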

  3. Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

    PubMed Central

    Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

    2010-01-01

    Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877

  4. In vitro selection using a dual RNA library that allows primerless selection

    PubMed Central

    Jarosch, Florian; Buchner, Klaus; Klussmann, Sven

    2006-01-01

    High affinity target-binding aptamers are identified from random oligonucleotide libraries by an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX). Since the SELEX process includes a PCR amplification step the randomized region of the oligonucleotide libraries need to be flanked by two fixed primer binding sequences. These primer binding sites are often difficult to truncate because they may be necessary to maintain the structure of the aptamer or may even be part of the target binding motif. We designed a novel type of RNA library that carries fixed sequences which constrain the oligonucleotides into a partly double-stranded structure, thereby minimizing the risk that the primer binding sequences become part of the target-binding motif. Moreover, the specific design of the library including the use of tandem RNA Polymerase promoters allows the selection of oligonucleotides without any primer binding sequences. The library was used to select aptamers to the mirror-image peptide of ghrelin. Ghrelin is a potent stimulator of growth-hormone release and food intake. After selection, the identified aptamer sequences were directly synthesized in their mirror-image configuration. The final 44 nt-Spiegelmer, named NOX-B11-3, blocks ghrelin action in a cell culture assay displaying an IC50 of 4.5 nM at 37°C. PMID:16855281

  5. A novel process of viral vector barcoding and library preparation enables high-diversity library generation and recombination-free paired-end sequencing

    PubMed Central

    Davidsson, Marcus; Diaz-Fernandez, Paula; Schwich, Oliver D.; Torroba, Marcos; Wang, Gang; Björklund, Tomas

    2016-01-01

    Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time consuming effort that only allows for hypothesis driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing together with highly efficient parallel oligonucleotide synthesis and cloning techniques have, however, opened up for entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode labelled, plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up for a multitude of in vivo assessments from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms and we here validate the newly presented message passing clustering process named Starcode. PMID:27874090

  6. Miniaturization Technologies for Efficient Single-Cell Library Preparation for Next-Generation Sequencing.

    PubMed

    Mora-Castilla, Sergio; To, Cuong; Vaezeslami, Soheila; Morey, Robert; Srinivasan, Srimeenakshi; Dumdie, Jennifer N; Cook-Andersen, Heidi; Jenkins, Joby; Laurent, Louise C

    2016-08-01

    As the cost of next-generation sequencing has decreased, library preparation costs have become a more significant proportion of the total cost, especially for high-throughput applications such as single-cell RNA profiling. Here, we have applied novel technologies to scale down reaction volumes for library preparation. Our system consisted of in vitro differentiated human embryonic stem cells representing two stages of pancreatic differentiation, for which we prepared multiple biological and technical replicates. We used the Fluidigm (San Francisco, CA) C1 single-cell Autoprep System for single-cell complementary DNA (cDNA) generation and an enzyme-based tagmentation system (Nextera XT; Illumina, San Diego, CA) with a nanoliter liquid handler (mosquito HTS; TTP Labtech, Royston, UK) for library preparation, reducing the reaction volume down to 2 µL and using as little as 20 pg of input cDNA. The resulting sequencing data were bioinformatically analyzed and correlated among the different library reaction volumes. Our results showed that decreasing the reaction volume did not interfere with the quality or the reproducibility of the sequencing data, and the transcriptional data from the scaled-down libraries allowed us to distinguish between single cells. Thus, we have developed a process to enable efficient and cost-effective high-throughput single-cell transcriptome sequencing. © 2016 Society for Laboratory Automation and Screening.

  7. The Test of Reference.

    ERIC Educational Resources Information Center

    Childers, Thomas

    1980-01-01

    Reports the results of an unobtrusive study, from a user's viewpoint, of reference services available in the Suffolk Cooperative Library System. The study raises questions of policy centering around user expectations of library reference services. (RAA)

  8. Next-generation transcriptome sequencing, SNP discovery and validation in four market classes of peanut, Arachis hypogaea L.

    PubMed

    Chopra, Ratan; Burow, Gloria; Farmer, Andrew; Mudge, Joann; Simpson, Charles E; Wilkins, Thea A; Baring, Michael R; Puppala, Naveen; Chamberlin, Kelly D; Burow, Mark D

    2015-06-01

    Single-nucleotide polymorphisms, which can be identified in the thousands or millions from comparisons of transcriptome or genome sequences, are ideally suited for making high-resolution genetic maps, investigating population evolutionary history, and discovering marker-trait linkages. Despite significant results from their use in human genetics, progress in identification and use in plants, and particularly polyploid plants, has lagged. As part of a long-term project to identify and use SNPs suitable for these purposes in cultivated peanut, which is tetraploid, we generated transcriptome sequences of four peanut cultivars, namely OLin, New Mexico Valencia C, Tamrun OL07 and Jupiter, which represent the four major market classes of peanut grown in the world, and which are important economically to the US southwest peanut growing region. CopyDNA libraries of each genotype were used to generate 2 × 54 paired-end reads using an Illumina GAIIx sequencer. Raw reads were mapped to a custom reference consisting of Tifrunner 454 sequences plus peanut ESTs in GenBank, comprising 43,108 contigs; 263,840 SNP and indel variants were identified among four genotypes compared to the reference. A subset of 6 variants was assayed across 24 genotypes representing four market types using KASP chemistry to assess the criteria for SNP selection. Results demonstrated that transcriptome sequencing can identify SNPs usable as selectable DNA-based markers in complex polyploid species such as peanut. Criteria for effective use of SNPs as markers are discussed in this context.

  9. SPlinted Ligation Adapter Tagging (SPLAT), a novel library preparation method for whole genome bisulphite sequencing

    PubMed Central

    Manlig, Erika; Wahlberg, Per

    2017-01-01

    Sodium bisulphite treatment of DNA combined with next generation sequencing (NGS) is a powerful combination for the interrogation of genome-wide DNA methylation profiles. Library preparation for whole genome bisulphite sequencing (WGBS) is challenging due to side effects of the bisulphite treatment, which leads to extensive DNA damage. Recently, a new generation of methods for bisulphite sequencing library preparation have been devised. They are based on initial bisulphite treatment of the DNA, followed by adaptor tagging of single stranded DNA fragments, and enable WGBS using low quantities of input DNA. In this study, we present a novel approach for quick and cost effective WGBS library preparation that is based on splinted adaptor tagging (SPLAT) of bisulphite-converted single-stranded DNA. Moreover, we validate SPLAT against three commercially available WGBS library preparation techniques, two of which are based on bisulphite treatment prior to adaptor tagging and one is a conventional WGBS method. PMID:27899585

  10. Trends in reference usage statistics in an academic health sciences library.

    PubMed

    De Groote, Sandra L; Hitchcock, Kristin; McGowan, Richard

    2007-01-01

    This study examines reference questions asked through traditional means at an academic health sciences library and places these data within the context of larger trends in reference services. Detailed data on the types of reference questions asked were collected during two one-month periods in 2003 and 2004. General statistics documenting broad categories of questions were compiled over a fifteen-year period. Administrative data show a steady increase in questions from 1990 to 1997/98 (23,848 to 48,037), followed by a decline through 2004/05 to 10,031. The distribution of reference questions asked over the years has changed, including a reduction in mediated searches (2,157 in 1990/91 to 18 in 2004/05), an increase in instruction (1,284 in 1993/94 to 1,897 in 2004/05) and an increase in digital reference interactions (0 in 1999/2000 to 581 in 2004/05). The most commonly asked questions at the current reference desk are about journal holdings (19%), book holdings (12%), and directional issues (12%). This study provides a unique snapshot of reference services in the contemporary library, where both online and offline services are commonplace. Changes in questions have impacted the way the library provides services, but traditional reference remains the core of information services in this health sciences library.

  11. Analysis of BAC end sequences in oak, a keystone forest tree species, providing insight into the composition of its genome

    PubMed Central

    2011-01-01

    Background: One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results: The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions: This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
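
    The link between library depth and the chance of recovering a given locus, as quoted in the record above, can be illustrated with the standard Clarke-Carbon estimate. The sketch below is not taken from the paper: it assumes clones are sampled randomly and uniformly from the genome, the function names are illustrative, and the 1 Gb genome size in the usage line is a hypothetical placeholder rather than a reported figure.

```python
import math


def probability_of_recovery(genome_equivalents: float) -> float:
    """Poisson form of the Clarke-Carbon estimate: P = 1 - exp(-c).

    c is the library depth in haploid genome equivalents.
    """
    return 1.0 - math.exp(-genome_equivalents)


def clones_needed(probability: float, insert_kb: float, genome_kb: float) -> int:
    """Clarke-Carbon: N = ln(1 - P) / ln(1 - f), where f = insert / genome."""
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# Depth reported in the abstract: 12 haploid genome equivalents.
p = probability_of_recovery(12)
print(f"P(locus present) = {p:.6f}")

# Hypothetical example: 135 kb inserts and an assumed 1 Gb haploid genome.
n = clones_needed(0.99, insert_kb=135, genome_kb=1_000_000)
print(f"clones needed for 99% = {n}")
```

    Under these assumptions, a depth of 12 genome equivalents gives a probability well above the 99% stated in the abstract.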

  12. Library preparation and data analysis packages for rapid genome sequencing.

    PubMed

    Pomraning, Kyle R; Smith, Kristina M; Bredeweg, Erin L; Connolly, Lanelle R; Phatale, Pallavi A; Freitag, Michael

    2012-01-01

    High-throughput sequencing (HTS) has quickly become a valuable tool for comparative genetics and genomics and is now regularly carried out in laboratories that are not connected to large sequencing centers. Here we describe an updated version of our protocol for constructing single- and paired-end Illumina sequencing libraries, beginning with purified genomic DNA. The present protocol can also be used for "multiplexing," i.e. the analysis of several samples in a single flowcell lane by generating "barcoded" or "indexed" Illumina sequencing libraries in a way that is independent from Illumina-supported methods. To analyze sequencing results, we suggest several independent approaches but end users should be aware that this is a quickly evolving field and that currently many alignment (or "mapping") and counting algorithms are being developed and tested.
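
    The "multiplexing" idea in the record above, where several barcoded samples share one lane, implies a demultiplexing step before analysis. The following is a minimal sketch of that step, not the authors' protocol or software: it assumes equal-length barcodes at the 5' end of each read and exact matching, and the barcode sequences, sample names, and function name are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign each read to a sample by exact match of its leading barcode.

    reads:    iterable of read sequences (strings)
    barcodes: dict mapping barcode sequence -> sample name
    Returns a dict mapping sample name -> list of reads with the barcode
    trimmed off; reads with no matching barcode are kept untrimmed under
    the key "unassigned".
    """
    lengths = {len(b) for b in barcodes}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes equal-length barcodes")
    k = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:k])
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[k:])
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = ["ACGTTTGACC", "TGCAGGATCC", "NNNNACGTAC"]
result = demultiplex(reads, barcodes)
print(result)
```

    Exact matching is the simplest choice; production tools usually tolerate one or more mismatches, which requires barcodes designed with sufficient pairwise distance.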

  13. Construction of a plant-transformation-competent BIBAC library and genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.)

    PubMed Central

    2013-01-01

    Background: Cotton, one of the world’s leading crops, is important to the world’s textile and energy industries, and is a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction of a plant-transformation-competent binary bacterial artificial chromosome (BIBAC) library and comparative genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.) with one of its diploid putative progenitor species, G. raimondii Ulbr. Results: We constructed the cotton BIBAC library in a vector competent for high-molecular-weight DNA transformation in different plant species through either Agrobacterium or particle bombardment. The library contains 76,800 clones with an average insert size of 135 kb, providing an approximate 99% probability of obtaining at least one positive clone from the library using a single-copy probe. The quality and utility of the library were verified by identifying BIBACs containing genes important for fiber development, fiber cellulose biosynthesis, seed fatty acid metabolism, cotton-nematode interaction, and bacterial blight resistance. In order to gain an insight into the Upland cotton genome and its relationship with G. raimondii, we sequenced nearly 10,000 BIBAC ends (BESs) randomly selected from the library, generating approximately one BES for every 250 kb along the Upland cotton genome. The retroelement Gypsy/DIRS1 family predominates in the Upland cotton genome, accounting for over 77% of all transposable elements. From the BESs, we identified 1,269 simple sequence repeats (SSRs), of which 1,006 were new, thus providing additional markers for cotton genome research. Surprisingly, comparative sequence analysis showed that Upland cotton is much more diverged from G. raimondii at the genomic sequence level than expected. There seems to be no significant difference between the relationships of the Upland cotton D- and A-subgenomes with the G. raimondii genome, even though G. raimondii contains a D genome (D5). Conclusions: The library represents the first BIBAC library in cotton and related species, thus providing tools useful for integrative physical mapping, large-scale genome sequencing and large-scale functional analysis of the Upland cotton genome. Comparative sequence analysis provides insights into the Upland cotton genome, and a possible mechanism underlying the divergence and evolution of polyploid Upland cotton from its diploid putative progenitor species, G. raimondii. PMID:23537070

  14. K-shuff: A Novel Algorithm for Characterizing Structural and Compositional Diversity in Gene Libraries

    PubMed Central

    Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A.; Rathbun, Stephen L.; Whitman, William B.

    2016-01-01

    K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley’s K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing. PMID:27911946

  15. Reference Books in Special Media. Reference Circular No. 82-4.

    ERIC Educational Resources Information Center

    Library of Congress, Washington, DC. National Library Service for the Blind and Physically Handicapped.

    Based on information contained in producers' catalogs and on responses to a survey conducted by the Reference Section of the Library of Congress National Library Service (NLS) for the Blind and Physically Handicapped, this publication lists reference materials produced in braille or in large type, and sound recordings of reference works available…

  16. The Peer Reference Counseling Program at Odum Library. Training Manual.

    ERIC Educational Resources Information Center

    Lawrence, Tamiko Danielle; Thomas, Susan; Winston, Mark

    The Peer Reference Counseling Program at Valdosta State University (Georgia) is a program designed to provide students with a unique opportunity to work at the Reference Desk at Odum Library. This program employs minority students and trains them to work at the Reference Desk answering basic reference questions and utilizing as well as…

  17. High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency.

    PubMed

    Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J; Burnett, John C; Zhou, Jiehua

    2016-09-22

    The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct "biased sequences" and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the "biased sequences" was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy.

  18. Illuminating choices for library prep: a comparison of library preparation methods for whole genome sequencing of Cryptococcus neoformans using Illumina HiSeq.

    PubMed

    Rhodes, Johanna; Beale, Mathew A; Fisher, Matthew C

    2014-01-01

    The industry of next-generation sequencing is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates from the pathogenic fungus Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits: the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs) to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use.

  19. Biopanning of polypeptides binding to bovine ephemeral fever virus G1 protein from phage display peptide library.

    PubMed

    Hou, Peili; Zhao, Guimin; He, Chengqiang; Wang, Hongmei; He, Hongbin

    2018-01-04

    The bovine ephemeral fever virus (BEFV) glycoprotein neutralization site 1 (also referred to as G1 protein) is a critical protein responsible for virus infectivity and for eliciting immune protection; however, peptides that bind BEFV G1 protein remain unclear. Thus, the aim of the present study was to screen for specific polypeptides that bind BEFV G1 protein with high affinity and inhibit BEFV replication. The purified BEFV G1 was coated and then reacted with the M13-based Ph.D.-7 phage random display library. The target-binding peptides were sequenced by automated sequencing after four rounds of enrichment biopanning. The amino acid sequences of the polypeptides displayed on positive clones were deduced, and the affinity of positive polypeptides for BEFV G1 was assayed by ELISA. Then the roles of specific G1-binding peptides in the context of BEFV infection were analyzed. The results showed that 27 specific peptide ligands displaying 11 different amino acid sequences were obtained, and the T18 and T25 clones had a higher affinity for G1 protein than the other clones. Tests of the antiviral roles of the two phage clones (T25 and T18) showed that both phage polypeptides inhibited BEFV replication compared to the control group. Moreover, synthetic peptides based on T18 (HSIRYDF) and T25 (YSLRSDY), used alone or in combination, effectively inhibited the formation of cytopathic plaques and significantly inhibited BEFV RNA replication in a dose-dependent manner. Two antiviral peptide ligands binding to bovine ephemeral fever virus G1 protein were thus identified from a phage display peptide library, which may provide a potential research tool for diagnostic reagents and novel antiviral agents.

  20. Multiple Myeloma Genomics: A Systematic Review.

    PubMed

    Weaver, Casey J; Tariman, Joseph D

    2017-08-01

    This integrative review describes the genomic variants that have been found to be associated with poor prognosis in patients diagnosed with multiple myeloma (MM). Second, it identifies MM genetic and genomic changes using next-generation sequencing, specifically whole-genome sequencing or exome sequencing. A search for peer-reviewed articles through PubMed, EBSCOhost, and DePaul WorldCat Libraries Worldwide yielded 33 articles that were included in the final analysis. The most commonly reported genetic changes were KRAS, NRAS, TP53, FAM46C, BRAF, DIS3, ATM, and CCND1. These genetic changes play a role in the pathogenesis of MM, prognostication, and therapeutic targets for novel therapies. MM genetics and genomics are expanding rapidly; oncology nurse clinicians must have basic competencies in genetics and genomics to help patients understand the complexities of genetic and genomic alterations and be able to refer patients to appropriate genomic professionals if needed. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. 1998 Idaho Public Library Statistics and Library Directory. A Compilation of Input and Output Measures and Other Statistics in Reference to Idaho's Public Libraries, Covering the Period October 1, 1997 to September 30, 1998.

    ERIC Educational Resources Information Center

    Nelson, Frank, Comp.

    This report is a compilation of input and output measures and other statistics in reference to Idaho's public libraries, covering the period from October 1997 through September 1998. The introductory sections include notes on the statistics, definitions of performance measures, Idaho public library rankings for fiscal year 1996, and a state map…

  2. Optimized Next-Generation Sequencing Genotype-Haplotype Calling for Genome Variability Analysis

    PubMed Central

    Navarro, Javier; Nevado, Bruno; Hernández, Porfidio; Vera, Gonzalo; Ramos-Onsins, Sebastián E

    2017-01-01

    The accurate estimation of nucleotide variability using next-generation sequencing data is challenged by the high number of sequencing errors produced by new sequencing technologies, especially for nonmodel species, where reference sequences may not be available and the read depth may be low due to limited budgets. The most popular single-nucleotide polymorphism (SNP) callers are designed to obtain a high SNP recovery and low false discovery rate but are not designed to account appropriately for the frequency of the variants. Instead, algorithms designed to account for the frequency of SNPs give precise results for estimating the levels and the patterns of variability. These algorithms are focused on the unbiased estimation of the variability and not on the high recovery of SNPs. Here, we implemented a fast and optimized parallel algorithm that includes the method developed by Roesti et al. and Lynch, which estimates the genotype of each individual at each site, allowing for calls of both bases of the genotype, a single one, or none. This algorithm does not consider the reference and therefore is independent of biases related to the reference nucleotide specified. The pipeline starts from a BAM file converted to pileup or mpileup format and the software outputs a FASTA file. The new program not only reduces running times but also, given its improved use of resources, can be run on both smaller computers and large parallel computers, expanding its benefits to a wider range of researchers. The output file can be analyzed using software for population genetics analysis, such as the R library PopGenome, the software VariScan, and the program mstatspop for analysis considering positions with missing data. PMID:28894353
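
    The idea of calling both bases, one base, or none at each site without consulting the reference can be sketched with a simple count-based rule. This is a stand-in for illustration, not the Roesti et al. or Lynch method named in the abstract. The function name and the thresholds `min_depth_one`, `min_depth_both`, and `min_minor_fraction` are hypothetical, and reading "a single one" as a partial call at low depth is an assumption.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def call_site(bases, min_depth_one=2, min_depth_both=6, min_minor_fraction=0.25):
    """Call a diploid genotype for one individual at one site.

    bases: the read bases observed at the site (string or iterable of chars).
    Returns a 2-tuple of bases; "N" marks an allele that is not called.
    The reference base is never consulted.
    """
    counts = Counter(b.upper() for b in bases if b.upper() in VALID_BASES)
    depth = sum(counts.values())

    if depth < min_depth_one:
        return ("N", "N")                 # too few reads: call nothing

    ranked = counts.most_common(2)
    major = ranked[0][0]

    if depth < min_depth_both:
        return (major, "N")               # enough reads for one allele only

    if len(ranked) == 2 and ranked[1][1] / depth >= min_minor_fraction:
        return tuple(sorted((major, ranked[1][0])))   # heterozygous call

    return (major, major)                 # homozygous call


print(call_site("AAAAAAAA"))
print(call_site("AAAGGGAG"))
print(call_site("AAA"))
print(call_site("A"))
```

    A likelihood-based caller would replace the fixed thresholds with a model of sequencing error; the fixed cut-offs here only make the three possible outcomes easy to see.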

  3. Developmental validation of a Nextera XT mitogenome Illumina MiSeq sequencing method for high-quality samples.

    PubMed

    Peck, Michelle A; Sturk-Andreaggi, Kimberly; Thomas, Jacqueline T; Oliver, Robert S; Barritt-Ross, Suzanne; Marshall, Charla

    2018-05-01

    Generating mitochondrial genome (mitogenome) data from reference samples in a rapid and efficient manner is critical to harnessing the greater power of discrimination of the entire mitochondrial DNA (mtDNA) marker. The method of long-range target enrichment, Nextera XT library preparation, and Illumina sequencing on the MiSeq is a well-established technique for generating mitogenome data from high-quality samples. To this end, a validation was conducted for this mitogenome method processing up to 24 samples simultaneously along with analysis in the CLC Genomics Workbench and utilizing the AQME (AFDIL-QIAGEN mtDNA Expert) tool to generate forensic profiles. This validation followed the Federal Bureau of Investigation's Quality Assurance Standards (QAS) for forensic DNA testing laboratories and the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines. The evaluation of control DNA, non-probative samples, blank controls, mixtures, and nonhuman samples demonstrated the validity of this method. Specifically, the sensitivity was established at ≥25 pg of nuclear DNA input for accurate mitogenome profile generation. Unreproducible low-level variants were observed in samples with low amplicon yields. Further, variant quality was shown to be a useful metric for identifying sequencing error and crosstalk. Success of this method was demonstrated with a variety of reference sample substrates and extract types. These studies further demonstrate the advantages of using NGS techniques by highlighting the quantitative nature of heteroplasmy detection. The results presented herein from more than 175 samples processed in ten sequencing runs, show this mitogenome sequencing method and analysis strategy to be valid for the generation of reference data. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D.

    2018-01-22

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  5. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Patel, Kamlesh D.

    2012-06-01

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  6. Shift Colors

    Science.gov Websites

  7. A microfluidic device for preparing next generation DNA sequencing libraries and for automating other laboratory protocols that require one or more column chromatography steps.

    PubMed

    Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C; Quake, Stephen R; Burkholder, William F

    2013-01-01

    Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation.

  8. A Microfluidic Device for Preparing Next Generation DNA Sequencing Libraries and for Automating Other Laboratory Protocols That Require One or More Column Chromatography Steps

    PubMed Central

    Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C.; Quake, Stephen R.; Burkholder, William F.

    2013-01-01

    Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation. PMID:23894273

  9. Cost-effective ways of delivering enquiry services: a rapid review.

    PubMed

    Sutton, Anthea; Grant, Maria J

    2011-12-01

    In the recent times of recession and budget cuts, it is more important than ever for library and information services to deliver cost-effective services. This rapid review aims to examine the evidence for the most cost-effective ways of delivering enquiry services. A literature search was conducted on LISA (Library and Information Sciences Abstracts) and MEDLINE. Searches were limited to 2007 onwards. Eight studies met the inclusion criteria. The studies covered hospital and academic libraries in the USA and Canada. Services analysed were 'point-of-care' librarian consultations, staffing models for reference desks and virtual/digital reference services. Transferable lessons, relevant to health library and information services generally, can be drawn from this rapid review. These suggest that 'point-of-care' librarians for primary care practitioners are a cost-effective way of answering questions. Reference desks can be cost-effectively staffed by student employees or general reference staff, although librarian referral must be provided for more complex and subject-specific enquiries. However, it is not possible to draw any conclusions on virtual/digital reference services because of the limited literature available. Further case analysis studies measuring specific services, particularly enquiry services within a health library and information context, are required. © 2011 The authors. Health Information and Libraries Journal © 2011 Health Libraries Group.

  10. Occurrence of diverse alkane hydroxylase alkB genes in indigenous oil-degrading bacteria of Baltic Sea surface water.

    PubMed

    Viggor, Signe; Jõesaar, Merike; Vedler, Eve; Kiiker, Riinu; Pärnpuu, Liis; Heinaru, Ain

    2015-12-30

    Specific oil-degrading bacterial communities formed in microcosms of Baltic Sea surface water supplemented with diesel fuel, crude oil, heptane and hexadecane. The 475 sequences from the constructed alkane hydroxylase alkB gene clone libraries were grouped into 30 OPFs. The two largest groups were most similar to alkB gene sequences of Pedobacter sp. (245 of 475) and Limnobacter sp. (112 of 475). Of 56 alkane-degrading bacterial strains, 41 belonged to Pseudomonas spp. and 8 to Rhodococcus spp. having redundant alkB genes. Together, 68 alkB gene sequences were identified. These genes grouped into 20 OPFs, half of them being specific only to the isolated strains. Altogether, 543 diverse alkB genes were characterized in the brackish Baltic Sea water, some of them representing novel lineages with very low sequence identities to the corresponding genes of the reference strains. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Consequences of Normalizing Transcriptomic and Genomic Libraries of Plant Genomes Using a Duplex-Specific Nuclease and Tetramethylammonium Chloride

    PubMed Central

    Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088

  12. Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

    PubMed

    Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.

  13. Gel-seq: A Method for Simultaneous Sequencing Library Preparation of DNA and RNA Using Hydrogel Matrices.

    PubMed

    Hoople, Gordon D; Richards, Andrew; Wu, Yan; Pisano, Albert P; Zhang, Kun

    2018-03-26

    The ability to amplify and sequence either DNA or RNA from small starting samples has only been achieved in the last five years. Unfortunately, the standard protocols for generating genomic or transcriptomic libraries are incompatible and researchers must choose whether to sequence DNA or RNA for a particular sample. Gel-seq solves this problem by enabling researchers to simultaneously prepare libraries for both DNA and RNA starting with 100 - 1000 cells using a simple hydrogel device. This paper presents a detailed approach for the fabrication of the device as well as the biological protocol to generate paired libraries. We designed Gel-seq so that it could be easily implemented by other researchers; many genetics labs already have the necessary equipment to reproduce the Gel-seq device fabrication. Our protocol employs commonly-used kits for both whole-transcript amplification (WTA) and library preparation, which are also likely to be familiar to researchers already versed in generating genomic and transcriptomic libraries. Our approach allows researchers to bring to bear the power of both DNA and RNA sequencing on a single sample without splitting and with negligible added cost.

  14. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
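
    As a rough illustration of why read counts should be equalized before richness is compared, the sketch below computes a Chao1 estimate and subsamples a library to a fixed depth. It is not taken from the record above: the bias-corrected Chao1 form, the function names `chao1` and `rarefy`, and the fixed seed are assumptions made for the example.

```python
import random
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts.

    Assumed form: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2
    are the numbers of taxa seen exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement and return new counts."""
    # Expand the count vector into one entry per read, labelled by taxon index.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    if depth > len(reads):
        raise ValueError("depth exceeds library size")
    tally = Counter(random.Random(seed).sample(reads, depth))
    return [tally.get(taxon, 0) for taxon in range(len(counts))]


if __name__ == "__main__":
    big_library = [120, 60, 30, 8, 4, 2, 2, 1, 1, 1]   # hypothetical counts
    small_library = [12, 6, 3, 1, 1]                    # hypothetical counts
    depth = min(sum(big_library), sum(small_library))
    # Compare the two libraries only after both are reduced to the same depth.
    print(chao1(rarefy(big_library, depth)), chao1(rarefy(small_library, depth)))
```

    Comparing the estimates at a common depth, rather than on the raw libraries, is the point of the exercise; a single random subsample is itself noisy, so repeated draws would normally be averaged.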

  15. Construction, Characterization, and Preliminary BAC-End Sequence Analysis of a Bacterial Artificial Chromosome Library of the Tea Plant (Camellia sinensis)

    PubMed Central

    Lin, Jinke; Kudrna, Dave; Wing, Rod A.

    2011-01-01

    We describe the construction and characterization of a publicly available BAC library for the tea plant, Camellia sinensis. Using modified methods, the library was constructed with the aim of developing public molecular resources to advance tea plant genomics research. The library consists of a total of 401,280 clones with an average insert size of 135 kb, providing an approximate coverage of 13.5 haploid genome equivalents. No empty vector clones were observed in a random sampling of 576 BAC clones. Further analysis of 182 BAC-end sequences from randomly selected clones revealed a GC content of 40.35% and low chloroplast and mitochondrial contamination. Repetitive sequence analyses indicated that LTR retrotransposons were the most predominant sequence class (86.93%–87.24%), followed by DNA retrotransposons (11.16%–11.69%). Additionally, we found 25 simple sequence repeats (SSRs) that could potentially be used as genetic markers. PMID:21234344
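
    The coverage figure quoted above follows from simple arithmetic: clones times mean insert size, divided by haploid genome size. The sketch below works backwards from the reported numbers. The genome size is not stated in the record, so the value computed here is only what the reported figures imply; the function names are hypothetical, and the probability formula is the standard Clarke-Carbon approximation rather than anything the authors report.

```python
def genome_equivalents(n_clones, mean_insert_bp, genome_size_bp):
    """Library coverage in haploid genome equivalents."""
    return n_clones * mean_insert_bp / genome_size_bp


def implied_genome_size(n_clones, mean_insert_bp, coverage):
    """Genome size (bp) implied by a reported clone count, insert size and coverage."""
    return n_clones * mean_insert_bp / coverage


def probability_covered(n_clones, mean_insert_bp, genome_size_bp):
    """Clarke-Carbon estimate of the chance that a given locus is in the library."""
    fraction = mean_insert_bp / genome_size_bp
    return 1 - (1 - fraction) ** n_clones


if __name__ == "__main__":
    clones, insert_bp, coverage = 401_280, 135_000, 13.5   # figures from the abstract
    size = implied_genome_size(clones, insert_bp, coverage)
    print(f"implied haploid genome size: {size / 1e9:.2f} Gb")   # about 4.01 Gb
    print(f"chance a locus is represented: {probability_covered(clones, insert_bp, size):.6f}")
```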

  16. Millstone: software for multiplex microbial genome analysis and engineering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goodman, Daniel B.; Kuznetsov, Gleb; Lajoie, Marc J.

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. Here, we describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  17. Millstone: software for multiplex microbial genome analysis and engineering.

    PubMed

    Goodman, Daniel B; Kuznetsov, Gleb; Lajoie, Marc J; Ahern, Brian W; Napolitano, Michael G; Chen, Kevin Y; Chen, Changping; Church, George M

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. We describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  18. Millstone: software for multiplex microbial genome analysis and engineering

    DOE PAGES

    Goodman, Daniel B.; Kuznetsov, Gleb; Lajoie, Marc J.; ...

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. Here, we describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  19. Computer Aided Reference Services in the Academic Library: Experiences in Organizing and Operating an Online Reference Service.

    ERIC Educational Resources Information Center

    Hoover, Ryan E.

    1979-01-01

    Summarizes the development of the Computer-Aided Reference Services (CARS) division of the University of Utah Libraries' reference department. Development, organizational structure, site selection, equipment, management, staffing and training considerations, promotion and marketing, budget and pricing, record keeping, statistics, and evaluation…

  20. The Librarian as Information Consultant: Transforming Reference for the Information Age

    ERIC Educational Resources Information Center

    Murphy, Sarah Anne

    2011-01-01

    Library users' evolving information needs and their choice of search methods have changed reference work profoundly. Today's reference librarian must work in a whole new way--not only service-focused and businesslike, but even entrepreneurial. Murphy innovatively rethinks the philosophy behind current library reference services in this…

  1. Implementation and Use of the Reference Analytics Module of LibAnswers

    ERIC Educational Resources Information Center

    Flatley, Robert; Jensen, Robert Bruce

    2012-01-01

    Academic libraries have traditionally collected reference statistics using hash marks on paper. Although efficient and simple, this method is not an effective way to capture the complexity of reference transactions. Several electronic tools are now available to assist libraries with collecting often elusive reference data--among them homegrown…

  2. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments.

    PubMed

    Daily, Jeff

    2016-02-10

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. A faster intra-sequence local pairwise alignment implementation is described and benchmarked, including new global and semi-global variants. Using a 375 residue query sequence, a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 24-core processor system, the highest reported for an implementation based on Farrar's 'striped' approach. Rognes's SWIPE optimal database search application is still generally the fastest available, at 1.2 to at best 2.4 times faster than Parasail for sequences shorter than 500 amino acids. However, Parasail was faster for longer sequences. For global alignments, Parasail's prefix scan implementation is generally the fastest, faster even than Farrar's 'striped' approach; however, the opal library is faster for single-threaded applications. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE4.1, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. Applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.
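
    For readers unfamiliar with the quantities in this abstract, the sketch below shows a plain, scalar Smith-Waterman score computation and how a GCUPS figure is derived from it. This is not Parasail's SIMD code or its API: the linear gap penalty, the scoring values and the function names are assumptions chosen to keep the example short.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b under a linear gap penalty."""
    prev = [0] * (len(b) + 1)          # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores are floored at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_len, database_residues, seconds):
    """Billions of table cells updated per second for one query against a database."""
    return query_len * database_residues / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
    # Hypothetical timing: a 375-residue query against 1e9 residues in 2.5 s.
    print(gcups(375, 10**9, 2.5))
```

    Every cell of the table costs one update, which is why throughput is reported in cell updates per second; striped and prefix-scan methods compute the same scores while filling many cells per instruction.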

  3. Virtual Reference Services.

    ERIC Educational Resources Information Center

    Brewer, Sally

    2003-01-01

    As the need to access information increases, school librarians must create virtual libraries. Linked to reliable reference resources, the virtual library extends the physical collection and library hours and lets students learn to use Web-based resources in a protected learning environment. The growing number of virtual schools increases the need…

  4. How to Survive in Industry. Cost Justifying Library Services

    ERIC Educational Resources Information Center

    Kramer, Joseph

    1971-01-01

    Two services provided by the Boeing Co. Aerospace Group Library--literature searches and reference/publication identification activities--were evaluated by written and oral surveys of the library's users. The survey technique and cost savings reported by the two studies are discussed. (2 references) (Author/NH)

  5. Gene-based SSR markers for common bean (Phaseolus vulgaris L.) derived from root and leaf tissue ESTs: an integration of the BMc series.

    PubMed

    Blair, Matthew W; Hurtado, Natalia; Chavarro, Carolina M; Muñoz-Torres, Monica C; Giraldo, Martha C; Pedraza, Fabio; Tomkins, Jeff; Wing, Rod

    2011-03-22

    Sequencing of cDNA libraries for the development of expressed sequence tags (ESTs) as well as for the discovery of simple sequence repeats (SSRs) has been a common method of developing microsatellites or SSR-based markers. In this research, our objective was to further sequence and develop common bean microsatellites from leaf and root cDNA libraries derived from the Andean gene pool accession G19833 and the Mesoamerican gene pool accession DOR364, mapping parents of a commonly used reference map. The root libraries were made from high and low phosphorus treated plants. A total of 3,123 EST sequences from leaf and root cDNA libraries were screened and used for direct simple sequence repeat discovery. From these EST sequences we found 184 microsatellites; the majority containing tri-nucleotide motifs, many of which were GC rich (ACC, AGC and AGG in particular). Di-nucleotide motif microsatellites were about half as common as the tri-nucleotide motif microsatellites but most of these were AGn microsatellites with a moderate number of ATn microsatellites in root ESTs followed by few ACn and no GCn microsatellites. Out of the 184 new SSR loci, 120 new microsatellite markers were developed in the BMc (Bean Microsatellites from cDNAs) series and these were evaluated for their capacity to distinguish bean diversity in a germplasm panel of 18 genotypes. We developed a database with images of the microsatellites and their polymorphism information content (PIC), which averaged 0.310 for polymorphic markers. The present study produced information about microsatellite frequency in root and leaf tissues of two important genotypes for common bean genomics: namely G19833, the Andean genotype selected for whole genome shotgun sequencing from race Peru, and DOR364 a race Mesoamerica subgroup 2 genotype that is a small-red seeded, released variety in Central America. 
    Both race Peru and Mesoamerica subgroup 2 (small red beans) have been understudied in comparison to race Nueva Granada and Mesoamerica subgroup 1 (black beans), both with regard to gene expression and as sources of markers. However, we found few differences in SSR type and frequency between the G19833 leaf and DOR364 root tissue-derived ESTs. Overall, our work adds to the analysis of microsatellite frequency evaluation for common bean and provides a new set of 120 BMc markers which, combined with the 248 previously developed BMc markers, brings the total in this series to 368 markers. Once we include BMd markers, which are derived from GenBank sequences, the current total of gene-based markers from our laboratory surpasses 500 markers. These markers are basic for studies of the transcriptome of common bean and can form anchor points for genetic mapping studies in the future.
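
    The polymorphism information content (PIC) reported above is computed from allele frequencies at each marker. The abstract does not say which formula was used; the sketch below assumes the commonly cited Botstein et al. definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, and the function name and example frequencies are hypothetical.

```python
def pic(allele_freqs):
    """Polymorphism information content for one marker (Botstein-style formula)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    n = len(allele_freqs)
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    cross = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical marker with two alleles at equal frequency in a panel.
    print(pic([0.5, 0.5]))   # 0.375
```

    Under this formula a two-allele marker cannot exceed 0.375, which gives a sense of scale for an average of 0.310 across polymorphic markers.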

  6. The Mobile Reference Service: a case study of an onsite reference service program at the School of Public Health.

    PubMed

    Tao, Donghua; McCarthy, Patrick G; Krieger, Mary M; Webb, Annie B

    2009-01-01

    The School of Public Health at Saint Louis University is located at a greater distance from the library than other programs on the main medical center campus. Physical distance diminishes the ease of access to direct reference services for public health users. To bridge the gap, the library developed the Mobile Reference Service to deliver on-site information assistance with regular office hours each week. Between September 2006 and April 2007, a total of 57 in-depth reference transactions took place over 25 weeks, averaging 2 transactions per week in a 2-hour period. Overall reference transactions from public health users went up 28%, while liaison contacts with public health users doubled compared to the same period the year before. The Mobile Reference Service program has improved library support for research and scholarship, cultivated and strengthened liaison relationships, and enhanced marketing and delivery of library resources and services to the Saint Louis University School of Public Health.

  7. The Mobile Reference Service: a case study of an onsite reference service program at the school of public health*

    PubMed Central

    Tao, Donghua; McCarthy, Patrick G.; Krieger, Mary M.; Webb, Annie B.

    2009-01-01

    The School of Public Health at Saint Louis University is located at a greater distance from the library than other programs on the main medical center campus. Physical distance diminishes the ease of access to direct reference services for public health users. To bridge the gap, the library developed the Mobile Reference Service to deliver onsite information assistance with regular office hours each week. Between September 2006 and April 2007, a total of 57 in-depth reference transactions took place over 25 weeks, averaging 2 transactions per week in a 2-hour period. Overall reference transactions from public health users went up 28%, while liaison contacts with public health users doubled compared to the same period the year before. The Mobile Reference Service program has improved library support for research and scholarship, cultivated and strengthened liaison relationships, and enhanced marketing and delivery of library resources and services to the Saint Louis University School of Public Health. PMID:19159004

  8. [cDNA library construction from panicle meristem of finger millet].

    PubMed

    Radchuk, V; Pirko, Ia V; Isaenkov, S V; Emets, A I; Blium, Ia B

    2014-01-01

    The protocol for producing full-length cDNA with the SuperScript Full-Length cDNA Library Construction Kit II (Invitrogen) was tested, and a high-quality cDNA library was created from meristematic tissue of finger millet panicle (Eleusine coracana (L.) Gaertn). The titer of the resulting cDNA library averaged 3.01 x 10^5 CFU/ml. The average cDNA insert length was about 1070 base pairs, and the efficiency of cDNA fragment insertion was 99.5%. Selected cDNA clones from the library were sequenced, and the sequences were identified by BLAST search. The library analysis and selective sequencing indicate good functionality and the full-length character of the inserted cDNA clones. The cDNA library from meristematic tissue of finger millet panicle is a valuable source for isolating and identifying key genes that regulate metabolism and meristematic development, and for mining new molecular markers for high-quality genetic investigations and molecular breeding.

  9. Partial characterization of normal and Haemophilus influenzae-infected mucosal complementary DNA libraries in chinchilla middle ear mucosa.

    PubMed

    Kerschner, Joseph E; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J Christopher; Ehrlich, Garth D

    2010-04-01

    We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription-polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis.

  10. Partial Characterization of Normal and Haemophilus influenzae–Infected Mucosal Complementary DNA Libraries in Chinchilla Middle Ear Mucosa

    PubMed Central

    Kerschner, Joseph E.; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J. Christopher; Ehrlich, Garth D.

    2010-01-01

    Objectives We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Methods Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription–polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Results Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Conclusions Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis. PMID:20433028

  11. Content Analysis of Virtual Reference Data: Reshaping Library Website Design.

    PubMed

    Fan, Suhua Caroline; Welch, Jennifer M

    2016-01-01

    An academic health sciences library wanted to redesign its website to provide better access to health information in the community. Virtual reference data were used to provide information about user searching behavior. This study analyzed three years (2012-2014) of virtual reference data, including e-mail questions, text messaging, and live chat transcripts, to evaluate the library website for redesigning, especially in areas such as the home page, patrons' terminology, and issues prompting patrons to ask for help. A coding system based on information links in the current library website was created to analyze the data.

  12. Unlocking Short Read Sequencing for Metagenomics

    DOE PAGES

    Rodrigue, Sébastien; Materna, Arne C.; Timberlake, Sonia C.; ...

    2010-07-28

    We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read.
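
    The idea of joining overlapping paired-end reads into one longer composite read can be sketched as follows. This is not the SHERA algorithm, which the record does not describe in detail; the longest-overlap-first search, the mismatch threshold and the function names are assumptions, and base qualities are ignored entirely.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_mismatch_frac=0.1):
    """Join a read pair whose ends overlap; return None if no overlap is found."""
    mate = reverse_complement(read2)   # bring the second read onto the same strand
    # Try the longest candidate overlap first and shrink until one is acceptable.
    for olen in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail, head = read1[-olen:], mate[:olen]
        mismatches = sum(x != y for x, y in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            return read1 + mate[olen:]
    return None


if __name__ == "__main__":
    fragment = "GATTACAGGCTCATGCAAGTCCGTATGGACCTTGAGCTAACGTCA"   # hypothetical 45 bp insert
    r1 = fragment[:30]
    r2 = reverse_complement(fragment[-30:])
    print(merge_pair(r1, r2, max_mismatch_frac=0.0) == fragment)
```

    A production tool would also weigh base qualities when the two reads disagree inside the overlap and would guard against spurious overlaps in repetitive sequence.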

  13. Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

    PubMed Central

    2010-01-01

    Background Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84 were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
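
    SSR detection of the kind mentioned above amounts to scanning sequences for short motifs repeated in tandem. The record does not name the online tool or its thresholds, so the sketch below is only a stand-in: the restriction to di- and tri-nucleotide motifs, the minimum of five repeats and the function name `find_ssrs` are all assumptions.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for di- and tri-nucleotide tandem repeats."""
    # Group 2 captures the motif; the backreference requires further tandem copies.
    pattern = re.compile(r"(([ACGT]{2,3}?)\2{%d,})" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(2)
        if len(set(motif)) == 1:
            continue                   # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return hits


if __name__ == "__main__":
    est = "ttagagagagagccaccaccaccaccaccgt"   # hypothetical EST fragment
    for start, motif, count in find_ssrs(est):
        print(start, motif, count)
```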

  14. Bacterial community composition in different sediments from the Eastern Mediterranean Sea: a comparison of four 16S ribosomal DNA clone libraries.

    PubMed

    Polymenakou, Paraskevi N; Bertilsson, Stefan; Tselepides, Anastasios; Stephanou, Euripides G

    2005-10-01

    The regional variability of sediment bacterial community composition and diversity was studied by comparative analysis of four large 16S ribosomal DNA (rDNA) clone libraries from sediments in different regions of the Eastern Mediterranean Sea (Thermaikos Gulf, Cretan Sea, and South Ionian Sea). Amplified rDNA restriction analysis of 664 clones from the libraries indicated that the rDNA richness and evenness were high: for example, there was a near-1:1 relationship between the number of screened clones and the number of unique restriction patterns when up to 190 clones were screened for each library. Phylogenetic analysis of 207 bacterial 16S rDNA sequences from the sediment libraries demonstrated that Gamma-, Delta-, and Alphaproteobacteria, Holophaga/Acidobacteria, Planctomycetales, Actinobacteria, Bacteroidetes, and Verrucomicrobia were represented in all four libraries. A few clones also grouped with the Betaproteobacteria, Nitrospirae, Spirochaetales, Chlamydiae, Firmicutes, and candidate division OP11. The abundance of sequences affiliated with Gammaproteobacteria was higher in libraries from shallow sediments in the Thermaikos Gulf (30 m) and the Cretan Sea (100 m) compared to the deeper South Ionian station (2790 m). Most sequences in the four sediment libraries clustered with uncultured 16S rDNA phylotypes from marine habitats, and many of the closest matches were clones from hydrocarbon seeps, benzene-mineralizing consortia, sulfate reducers, sulfur oxidizers, and ammonia oxidizers. LIBSHUFF statistics of 16S rDNA gene sequences from the four libraries revealed major differences, indicating either a very high richness in the sediment bacterial communities or considerable variability in bacterial community composition among regions, or both.

  15. Meeting Reference Responsibilities through Library Web Sites.

    ERIC Educational Resources Information Center

    Adams, Michael

    2001-01-01

    Discusses library Web sites and explains some of the benefits when libraries make their sites into reference portals, linking them to other useful Web sites. Topics include print versus Web information sources; limitations of search engines; what Web sites to include, including criteria for inclusions; and organizing the sites. (LRW)

  16. Libraries & Study Facilities. A Selected Bibliography.

    ERIC Educational Resources Information Center

    Wisconsin Univ., Madison. ERIC Clearinghouse on Educational Facilities.

    This bibliography contains a selected reference list of publications of interest to architects, library planners, and librarians contemplating the planning, programing, and/or design of library facilities. Each reference is followed by a listing of ERIC descriptors that describe its contents. The items are listed in six sections: (1) library…

  17. Virtual Reference Service in Academic Libraries in West Africa

    ERIC Educational Resources Information Center

    Sekyere, Kwabena

    2011-01-01

    As technology continues to advance, libraries in Europe and America continue to improve upon their virtual reference services by employing new Web technologies and applying them to existing services. West African academic libraries have begun providing resources electronically to their users but still typically lag behind in the services they…

  18. Young Adult Reference Services in the Public Library.

    ERIC Educational Resources Information Center

    Boylan, Patricia

    1984-01-01

    Methods suggested for use by public libraries to stay on top of school assignments include a large, loose-leaf type binder entitled "School Assignments" to be kept at reference desk; assignment-related book lists; school assignment forms; and teacher notification forms to alert them if the library cannot fulfill their information…

  19. Moving Reference to the Web.

    ERIC Educational Resources Information Center

    McGlamery, Susan; Coffman, Steve

    2000-01-01

    Explores the possibility of using Web contact center software to offer reference assistance to remote users. Discusses a project by the Metropolitan Cooperative Library System/Santiago Library System consortium to test contact center software and to develop a virtual reference network. (Author/LRW)

  20. SMRT sequencing of the Vitis vinifera cv. ‘Flame seedless’ genome using a SMRTbell-free library preparation from Swift Biosciences

    USDA-ARS's Scientific Manuscript database

    Single Molecule Real-Time (SMRT) sequencing provides advantages to the sequencing of complex genomes. The long reads generated are superior for resolving complex genomic regions and provide highly contiguous de novo assemblies. Current SMRTbell libraries generate average read lengths of 10-15kb. How...

  1. Next generation sequencing technology: a powerful tool for the genome characterization of sugarcane mosaic virus from Sorghum almum

    USDA-ARS's Scientific Manuscript database

    Next generation sequencing (NGS) technology was used to analyze the occurrence of viruses in Sorghum almum plants in Florida exhibiting mosaic symptoms. Total RNA was extracted from symptomatic leaves and used as a template for cDNA library preparation. The resulting library was sequenced on an Illu...

  2. Highly multiplexed targeted DNA sequencing from single nuclei.

    PubMed

    Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

    2016-02-01

    Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.

  3. Moving beyond Assumptions: The Use of Virtual Reference Data in an Academic Library

    ERIC Educational Resources Information Center

    Nolen, David S.; Powers, Amanda Clay; Zhang, Li; Xu, Yue; Cannady, Rachel E.; Li, Judy

    2012-01-01

    The Mississippi State University Libraries' Virtual Reference Service collected statistics about virtual reference usage. Analysis of the data collected by an entry survey from chat and e-mail transactions provided librarians with concrete information about what patron groups were the highest and lowest users of virtual reference services. These…

  4. Recommended Reference Books for Small and Medium-Sized Libraries and Media Centers, 1999.

    ERIC Educational Resources Information Center

    Wynar, Bohdan S., Ed.

    Designed to assist smaller libraries in selecting suitable reference materials for their collections, this annual review source identifies and describes 538 of the most useful and affordable reference sources available. The reviews cover reference titles published in 1998. Detailed annotations describe the nature, scope, and usability of each…

  5. Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function.

    PubMed

    Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J

    2007-06-01

    As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.

  6. Sequence-independent construction of ordered combinatorial libraries with predefined crossover points.

    PubMed

    Jézéquel, Laetitia; Loeper, Jacqueline; Pompon, Denis

    2008-11-01

    Combinatorial libraries coding for mosaic enzymes with predefined crossover points constitute useful tools to address and model structure-function relationships and for functional optimization of enzymes based on multivariate statistics. The presented method, called sequence-independent generation of a chimera-ordered library (SIGNAL), allows easy shuffling of any predefined amino acid segment between two or more proteins. This method is particularly well adapted to the exchange of protein structural modules. The procedure could also be well suited to generate ordered combinatorial libraries independent of sequence similarities in a robotized manner. Sequence segments to be recombined are first extracted by PCR from a single-stranded template coding for an enzyme of interest using a biotin-avidin-based method. This technique allows the reduction of parental template contamination in the final library. Specific PCR primers allow amplification of two complementary mosaic DNA fragments, overlapping in the region to be exchanged. Fragments are finally reassembled using a fusion PCR. The process is illustrated via the construction of a set of mosaic CYP2B enzymes using this highly modular approach.

  7. A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)

    PubMed Central

    Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto

    2017-01-01

    Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both the concerted evolution and library hypotheses. For this purpose, we analyzed sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation in the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed us to define 150 haplotypes, which were linked in a common minimum spanning tree where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants of this satDNA showed large differences in abundance between species, from highly abundant to merely relictual variants. PMID:28855916

  8. Comparison of methanogen diversity of yak (Bos grunniens) and cattle (Bos taurus) from the Qinghai-Tibetan plateau, China.

    PubMed

    Huang, Xiao Dan; Tan, Hui Yin; Long, Ruijun; Liang, Juan Boo; Wright, André-Denis G

    2012-10-19

    Methane emissions by methanogens from livestock ruminants have contributed significantly to the agricultural greenhouse gas effect. It is worthwhile to compare methanogens from an "energy-saving" animal (yak) and a normal animal (cattle) in order to investigate the link between methanogen structure and low methane production. Diversity of methanogens from the yak and cattle rumen was investigated by analysis of 16S rRNA gene sequences from rumen digesta samples from four yaks (209 clones) and four cattle (205 clones) from the Qinghai-Tibetan Plateau area (QTP). Overall, a total of 414 clones (i.e. sequences) were examined and assigned to 95 operational taxonomic units (OTUs) using MOTHUR, based upon a 98% species-level identity criterion. Forty-six OTUs were unique to the yak clone library and 34 OTUs were unique to the cattle clone library, while 15 OTUs were found in both libraries. Of the 95 OTUs, 93 putative new species were identified. Sequences belonging to the Thermoplasmatales-affiliated Lineage C (TALC) were found to dominate in both libraries, accounting for 80.9% and 62.9% of the sequences from the yak and cattle clone libraries, respectively. Sequences belonging to the Methanobacteriales represented the second largest clade in both libraries. However, Methanobrevibacter wolinii (QTPC 110) was only found in the cattle library. The number of clones from the order Methanomicrobiales was greater in the cattle than in the yak clone library. Although the Shannon index value indicated similar diversity between the two libraries, the Libshuff analysis indicated that the methanogen community structure of the yak was significantly different from that of cattle. This study revealed for the first time the molecular diversity of the methanogen community in yaks and cattle in the Qinghai-Tibetan Plateau area in China. From the analysis, we conclude that yaks have a unique rumen microbial ecosystem that is significantly different from that of cattle; this may also help to explain why yaks produce less methane than cattle.
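
    The OTU bookkeeping and the Shannon index mentioned in the abstract above can be illustrated with a short Python sketch. The 46/34/15 split of OTUs comes from the abstract; the per-OTU clone counts are hypothetical placeholders (the abstract does not report them), and `shannon_index` is an illustrative helper name, not a MOTHUR function.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with at least one clone."""
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one clone")
    return sum(-(c / total) * math.log(c / total) for c in counts if c > 0)


# OTU split reported in the abstract.
yak_only, cattle_only, shared = 46, 34, 15
total_otus = yak_only + cattle_only + shared   # 95 OTUs in total
yak_otus = yak_only + shared                   # OTUs seen in the yak library
cattle_otus = cattle_only + shared             # OTUs seen in the cattle library

# Hypothetical clone counts per OTU, for illustration only.
even_library = [10, 10, 10, 10]     # four equally abundant OTUs
skewed_library = [37, 1, 1, 1]      # one dominant OTU

print(total_otus, yak_otus, cattle_otus)
print(round(shannon_index(even_library), 3))
print(round(shannon_index(skewed_library), 3))
```

    With equal abundances the index reaches its maximum, ln(number of OTUs); a library dominated by one OTU scores lower even when it contains the same number of OTUs.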

  9. Comparison of methanogen diversity of yak (Bos grunniens) and cattle (Bos taurus) from the Qinghai-Tibetan plateau, China

    PubMed Central

    2012-01-01

    Background Methane emissions by methanogens from livestock ruminants have contributed significantly to the agricultural greenhouse gas effect. It is worthwhile to compare methanogens from an “energy-saving” animal (yak) and a normal animal (cattle) in order to investigate the link between methanogen structure and low methane production. Results Diversity of methanogens from the yak and cattle rumen was investigated by analysis of 16S rRNA gene sequences from rumen digesta samples from four yaks (209 clones) and four cattle (205 clones) from the Qinghai-Tibetan Plateau area (QTP). Overall, a total of 414 clones (i.e. sequences) were examined and assigned to 95 operational taxonomic units (OTUs) using MOTHUR, based upon a 98% species-level identity criterion. Forty-six OTUs were unique to the yak clone library and 34 OTUs were unique to the cattle clone library, while 15 OTUs were found in both libraries. Of the 95 OTUs, 93 putative new species were identified. Sequences belonging to the Thermoplasmatales-affiliated Lineage C (TALC) were found to dominate in both libraries, accounting for 80.9% and 62.9% of the sequences from the yak and cattle clone libraries, respectively. Sequences belonging to the Methanobacteriales represented the second largest clade in both libraries. However, Methanobrevibacter wolinii (QTPC 110) was only found in the cattle library. The number of clones from the order Methanomicrobiales was greater in the cattle than in the yak clone library. Although the Shannon index value indicated similar diversity between the two libraries, the Libshuff analysis indicated that the methanogen community structure of the yak was significantly different from that of cattle. Conclusion This study revealed for the first time the molecular diversity of the methanogen community in yaks and cattle in the Qinghai-Tibetan Plateau area in China. From the analysis, we conclude that yaks have a unique rumen microbial ecosystem that is significantly different from that of cattle; this may also help to explain why yaks produce less methane than cattle. PMID:23078429

  10. Expressed sequence tag analysis of adult human lens for the NEIBank Project: over 2000 non-redundant transcripts, novel genes and splice variants.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine

    2002-06-15

    To explore the expression profile of the human lens and to provide a resource for microarray studies, expressed sequence tag (EST) analysis has been performed on cDNA libraries from adult lenses. A cDNA library was constructed from two adult (40 year old) human lenses. Over two thousand clones were sequenced from the unamplified, un-normalized library. The library was then normalized and a further 2200 sequences were obtained. All the data were analyzed using GRIST (GRouping and Identification of Sequence Tags), a procedure for gene identification and clustering. The lens library (by) contains a low percentage of non-mRNA contaminants and a high fraction (over 75%) of apparently full length cDNA clones. Approximately 2000 reads from the unamplified library yields 810 clusters, potentially representing individual genes expressed in the lens. After normalization, the content of crystallins and other abundant cDNAs is markedly reduced and a similar number of reads from this library (fs) yields 1455 unique groups of which only two thirds correspond to named genes in GenBank. Among the most abundant cDNAs is one for a novel gene related to glutamine synthetase, which was designated "lengsin" (LGS). Analyses of ESTs also reveal examples of alternative transcripts, including a major alternative splice form for the lens specific membrane protein MP19. Variant forms for other transcripts, including those encoding the apoptosis inhibitor Livin and the armadillo repeat protein ARVCF, are also described. The lens cDNA libraries are a resource for gene discovery, full length cDNAs for functional studies and microarrays. The discovery of an abundant, novel transcript, lengsin, and a major novel splice form of MP19 reflect the utility of unamplified libraries constructed from dissected tissue. Many novel transcripts and splice forms are represented, some of which may be candidates for genetic diseases.

  11. Selection dynamic of Escherichia coli host in M13 combinatorial peptide phage display libraries.

    PubMed

    Zanconato, Stefano; Minervini, Giovanni; Poli, Irene; De Lucrezia, Davide

    2011-01-01

    Phage display relies on an iterative cycle of selection and amplification of random combinatorial libraries to enrich the initial population of those peptides that satisfy a priori chosen criteria. The effectiveness of any phage display protocol depends directly on library amino acid sequence diversity and the strength of the selection procedure. In this study we monitored the dynamics of the selective pressure exerted by the host organism on a random peptide library in the absence of any additional selection pressure. The results indicate that sequence censorship exerted by Escherichia coli dramatically reduces library diversity and can significantly impair phage display effectiveness.

  12. BAC-End Sequence-Based SNP Mining in Allotetraploid Cotton (Gossypium) Utilizing Resequencing Data, Phylogenetic Inferences, and Perspectives for Genetic Mapping.

    PubMed

    Hulse-Kemp, Amanda M; Ashrafi, Hamid; Stoffel, Kevin; Zheng, Xiuting; Saski, Christopher A; Scheffler, Brian E; Fang, David D; Chen, Z Jeffrey; Van Deynze, Allen; Stelly, David M

    2015-04-09

    A bacterial artificial chromosome library and BAC-end sequences for cultivated cotton (Gossypium hirsutum L.) have recently been developed. This report presents genome-wide single nucleotide polymorphism (SNP) mining utilizing resequencing data with BAC-end sequences as a reference by alignment of 12 G. hirsutum L. lines, one G. barbadense L. line, and one G. longicalyx Hutch and Lee line. A total of 132,262 intraspecific SNPs have been developed for G. hirsutum, whereas 223,138 and 470,631 interspecific SNPs have been developed for G. barbadense and G. longicalyx, respectively. Using a set of interspecific SNPs, 11 randomly selected and 77 SNPs that are putatively associated with the homeologous chromosome pair 12 and 26, we mapped 77 SNPs into two linkage groups representing these chromosomes, spanning a total of 236.2 cM in an interspecific F2 population (G. barbadense 3-79 × G. hirsutum TM-1). The mapping results validated the approach for reliably producing large numbers of both intraspecific and interspecific SNPs aligned to BAC-ends. This will allow for future construction of high-density integrated physical and genetic maps for cotton and other complex polyploid genomes. The methods developed will allow for future Gossypium resequencing data to be automatically genotyped for identified SNPs along the BAC-end sequence reference for anchoring sequence assemblies and comparative studies. Copyright © 2015 Hulse-Kemp et al.

  13. Archaeal and bacterial diversity in two hot springs from geothermal regions in Bulgaria as demostrated by 16S rRNA and GH-57 genes.

    PubMed

    Stefanova, Katerina; Tomova, Iva; Tomova, Anna; Radchenkova, Nadja; Atanassov, Ivan; Kambourova, Margarita

    2015-12-01

    Archaeal and bacterial diversity in two Bulgarian hot springs, geographically separated with different tectonic origin and different temperature of water was investigated exploring two genes, 16S rRNA and GH-57. Archaeal diversity was significantly higher in the hotter spring Levunovo (LV) (82°C); on the contrary, bacterial diversity was higher in the spring Vetren Dol (VD) (68°C). The analyzed clones from LV library were referred to twenty eight different sequence types belonging to five archaeal groups from Crenarchaeota and Euryarchaeota. A domination of two groups was observed, Candidate Thaumarchaeota and Methanosarcinales. The majority of the clones from VD were referred to HWCG (Hot Water Crenarchaeotic Group). The formation of a group of thermophiles in the order Methanosarcinales was suggested. Phylogenetic analysis revealed high numbers of novel sequences, more than one third of archaeal and half of the bacterial phylotypes displayed similarity lower than 97% with known ones. The retrieved GH-57 gene sequences showed a complex phylogenic distribution. The main part of the retrieved homologous GH-57 sequences affiliated with bacterial phyla Bacteroidetes, Deltaproteobacteria, Candidate Saccharibacteria and affiliation of almost half of the analyzed sequences is not fully resolved. GH-57 gene analysis allows an increased resolution of the biodiversity assessment and in depth analysis of specific taxonomic groups. [Int Microbiol 18(4):217-223 (2015)]. Copyright© by the Spanish Society for Microbiology and Institute for Catalan Studies.

  14. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    PubMed Central

    2011-01-01

    Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii-the diploid source of the wheat D genome, and with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. 
To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061

  15. Comparative performance of the BGISEQ-500 vs Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing

    PubMed Central

    Mak, Sarah Siu Tze; Gopalakrishnan, Shyam; Carøe, Christian; Geng, Chunyu; Liu, Shanlin; Sinding, Mikkel-Holger S; Kuderna, Lukas F K; Zhang, Wenwei; Fu, Shujin; Vieira, Filipe G; Germonpré, Mietje; Bocherens, Hervé; Fedorov, Sergey; Petersen, Bent; Sicheritz-Pontén, Thomas; Marques-Bonet, Tomas; Zhang, Guojie; Jiang, Hui; Gilbert, M Thomas P

    2017-01-01

    Abstract Ancient DNA research has been revolutionized following development of next-generation sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output is comparable with the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform on DNA extracted from 8 historic and ancient dog and wolf samples. The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (P = 0.371) and average sequence length (P = 0.718) of endogenous nuclear DNA, sequence GC content (P = 0.311), double-stranded DNA damage rate (v. 0.309), and sequence clonality (P = 0.093). Small significant differences were found in single-strand DNA damage rate (δS; slightly lower for the BGISEQ-500, P = 0.011) and the background rate of difference from the reference genome (θ; slightly higher for BGISEQ-500, P = 0.012). This may result from the differences in amplification cycles used to polymerase chain reaction–amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (P = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from 3 of the samples with overall very low levels of endogenous DNA.
Although we acknowledge that our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA. PMID:28854615

  16. Molecular cloning of actin genes in Trichomonas vaginalis and phylogeny inferred from actin sequences.

    PubMed

    Bricheux, G; Brugerolle, G

    1997-08-01

    The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.

  17. Object-oriented sequence analysis: SCL--a C++ class library.

    PubMed

    Vahrson, W; Hermann, K; Kleffe, J; Wittig, B

    1996-04-01

    SCL (Sequence Class Library) is a class library written in the C++ programming language. Designed using object-oriented programming principles, SCL consists of classes of objects performing tasks typically needed for analyzing DNA or protein sequences. Among them are very flexible sequence classes, classes accessing databases in various formats, classes managing collections of sequences, as well as classes performing higher-level tasks like calculating a pairwise sequence alignment. SCL also includes classes that provide general programming support, like a dynamically growing array, sets, matrices, strings, classes performing file input/output, and utilities for error handling. By providing these components, SCL fosters an explorative programming style: experimenting with algorithms and alternative implementations is encouraged rather than punished. A description of SCL's overall structure as well as an overview of its classes is given. Important aspects of the work with SCL are discussed in the context of a sample program.

  18. SearchSmallRNA: a graphical interface tool for the assemblage of viral genomes using small RNA libraries data.

    PubMed

    de Andrade, Roberto R S; Vaslin, Maite F S

    2014-03-07

    Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. To bypass the need for command-line and basic bioinformatics knowledge, we developed mapping software with a graphical interface for the assembly of viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. The program compares each sequenced read present in a library with a chosen reference genome. Reads showing Hamming distances smaller than or equal to an allowed number of mismatches are selected as positives and used for the assembly of a long nucleotide genome sequence. To validate the software, separate analyses using NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. SearchSmallRNA was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it will be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/.
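
    The read-selection step described in the abstract above (keep a read when its Hamming distance to the reference is within an allowed number of mismatches) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not SearchSmallRNA's actual code (which is written in Java): the function names `hamming` and `map_read`, the brute-force scan of every reference window, and the toy sequences are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, reference: str, max_mismatches: int):
    """Return (position, distance) for every reference window within the threshold."""
    hits = []
    n = len(read)
    for pos in range(len(reference) - n + 1):
        d = hamming(read, reference[pos:pos + n])
        if d <= max_mismatches:
            hits.append((pos, d))
    return hits


# Toy data, for illustration only.
reference = "ACGTACGTTAGC"
reads = ["ACGT", "TAGC", "GGGG"]

for read in reads:
    print(read, map_read(read, reference, max_mismatches=1))
```

    Reads with an empty hit list would be discarded; the rest are the "positives" that a tool of this kind could lay along the reference to build a consensus sequence. A real mapper would normally use an index rather than scanning every window.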

  19. SearchSmallRNA: a graphical interface tool for the assemblage of viral genomes using small RNA libraries data

    PubMed Central

    2014-01-01

    Background Next-generation parallel sequencing (NGS) allows the identification of viral pathogens by sequencing the small RNAs of infected hosts. Thus, viral genomes may be assembled from host immune response products without prior virus enrichment, amplification or purification. However, mapping of the vast information obtained presents a bioinformatics challenge. Methods To bypass the need for command-line and basic bioinformatics knowledge, we developed mapping software with a graphical interface for the assembly of viral genomes from small RNA datasets obtained by NGS. SearchSmallRNA was developed in Java version 7 using NetBeans IDE 7.1. The program also allows analysis of the viral small interfering RNA (vsRNA) profile, providing an overview of the size distribution and other features of the vsRNAs produced in infected cells. Results The program compares each sequenced read present in a library with a chosen reference genome. Reads showing Hamming distances smaller than or equal to an allowed number of mismatches are selected as positives and used for the assembly of a long nucleotide genome sequence. To validate the software, separate analyses using NGS datasets obtained from HIV and two plant viruses were used to reconstruct whole viral genomes. Conclusions SearchSmallRNA was able to reconstruct viral genomes from NGS small RNA datasets with a high degree of reliability, so it will be a valuable tool for virus sequencing and discovery. It is accessible and free to all research communities and has the advantage of an easy-to-use graphical interface. Availability and implementation SearchSmallRNA was written in Java and is freely available at http://www.microbiologia.ufrj.br/ssrna/. PMID:24607237

  20. A Robust and Versatile Method of Combinatorial Chemical Synthesis of Gene Libraries via Hierarchical Assembly of Partially Randomized Modules

    PubMed Central

    Popova, Blagovesta; Schubert, Steffen; Bulla, Ingo; Buchwald, Daniela; Kramer, Wilfried

    2015-01-01

    A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis. PMID:26355961

  1. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    PubMed Central

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928

  2. Compositional Bias in Naïve and Chemically-modified Phage-Displayed Libraries uncovered by Paired-end Deep Sequencing.

    PubMed

    He, Bifang; Tjhung, Katrina F; Bennett, Nicholas J; Chou, Ying; Rau, Andrea; Huang, Jian; Derda, Ratmir

    2018-01-19

    Understanding the composition of a genetically-encoded (GE) library is instrumental to the success of ligand discovery. In this manuscript, we investigate the bias in GE-libraries of linear, macrocyclic and chemically post-translationally modified (cPTM) tetrapeptides displayed on the M13KE platform, which are produced via trinucleotide cassette synthesis (19 codons) and NNK-randomized codon. Differential enrichment of synthetic DNA {S}, ligated vector {L} (extension and ligation of synthetic DNA into the vector), naïve libraries {N} (transformation of the ligated vector into the bacteria followed by expression of the library for 4.5 hours to yield a "naïve" library), and libraries chemically modified by aldehyde ligation and cysteine macrocyclization {M} characterized by paired-end deep sequencing, detected a significant drop in diversity in {L} → {N}, but only a minor compositional difference in {S} → {L} and {N} → {M}. Libraries expressed at the N-terminus of phage protein pIII censored positively charged amino acids Arg and Lys; libraries expressed between pIII domains N1 and N2 overcame Arg/Lys-censorship but introduced new bias towards Gly and Ser. Interrogation of biases arising from cPTM by aldehyde ligation and cysteine macrocyclization unveiled censorship of sequences with Ser/Phe. Analogous analysis can be used to explore library diversity in new display platforms and optimize cPTM of these libraries.
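
    The NNK randomization mentioned in this abstract uses standard degenerate-base notation (N = any base, K = G or T). As a rough illustration of the compositional bias that the codon scheme alone introduces before any biological censorship, the Python sketch below enumerates the NNK codons and tallies the residues they encode under the standard genetic code. The function names are hypothetical, and nothing here reproduces the authors' sequencing analysis.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}


def nnk_codons():
    """All codons matching NNK: any base, any base, then G or T."""
    return [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]


def residue_frequencies(codons):
    """Fraction of the given codons that encode each residue (or stop)."""
    counts = Counter(CODON_TABLE[c] for c in codons)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


if __name__ == "__main__":
    freqs = residue_frequencies(nnk_codons())
    for aa, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
        print(f"{aa}\t{f:.3f}")
```

    Because residues such as Leu, Ser and Arg are each encoded by three NNK codons while Met and Trp have only one, an unselected NNK library is already non-uniform at the amino acid level; a trinucleotide cassette with one codon per residue avoids that particular skew.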

  3. A Robust and Versatile Method of Combinatorial Chemical Synthesis of Gene Libraries via Hierarchical Assembly of Partially Randomized Modules.

    PubMed

    Popova, Blagovesta; Schubert, Steffen; Bulla, Ingo; Buchwald, Daniela; Kramer, Wilfried

    2015-01-01

    A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis.

  4. Large-Scale Concatenation cDNA Sequencing

    PubMed Central

    Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

    1997-01-01

    A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174

  5. Supervisory Rotation: Impact on an Academic Library Reference Staff.

    ERIC Educational Resources Information Center

    Perdue, Bob; Piotrowski, Chris

    1986-01-01

    Presents results of managerial rotation program (librarians take turns on two-year cycle serving as department head) in medium-sized academic library reference department. An assessment of both the advantages and disadvantages of such a plan examines impact of this innovative and collegial management technique on staff, library, and patrons. (7…

  6. Digital Library and Digital Reference Service: Integration and Mutual Complementarity

    ERIC Educational Resources Information Center

    Liu, Jia

    2008-01-01

    Both the digital library and the digital reference service were invented and have been developed under the networked environment. Among their intersections, the fundamental thing is their symbiotic interest--serving the user in a more efficient way. The article starts by discussing the digital library and its service and the digital reference…

  7. User Education in the Academic Library: Media Methods for Reference Work. An Annotated Bibliography. Updated.

    ERIC Educational Resources Information Center

    Wharton, Sika

    This annotated bibliography focuses on academic library usage of audiovisual (AV) methods of instruction, particularly for the enhancement of the reference teaching function. The bibliography's objectives are as follows: to identify current trends with regard to AV methods in library orientation and bibliographic instruction; to isolate instances…

  8. Untangling taxonomy: a DNA barcode reference library for Canadian spiders.

    PubMed

    Blagoev, Gergin A; deWaard, Jeremy R; Ratnasingham, Sujeevan; deWaard, Stephanie L; Lu, Liuqiong; Robertson, James; Telfer, Angela C; Hebert, Paul D N

    2016-01-01

    Approximately 1460 species of spiders have been reported from Canada, 3% of the global fauna. This study provides a DNA barcode reference library for 1018 of these species based upon the analysis of more than 30,000 specimens. The sequence results show a clear barcode gap in most cases with a mean intraspecific divergence of 0.78% vs. a minimum nearest-neighbour (NN) distance averaging 7.85%. The sequences were assigned to 1359 Barcode index numbers (BINs) with 1344 of these BINs composed of specimens belonging to a single currently recognized species. There was a perfect correspondence between BIN membership and a known species in 795 cases, while another 197 species were assigned to two or more BINs (556 in total). A few other species (26) were involved in BIN merges or in a combination of merges and splits. There was only a weak relationship between the number of specimens analysed for a species and its BIN count. However, three species were clear outliers with their specimens being placed in 11-22 BINs. Although all BIN splits need further study to clarify the taxonomic status of the entities involved, DNA barcodes discriminated 98% of the 1018 species. The present survey conservatively revealed 16 species new to science, 52 species new to Canada and major range extensions for 426 species. However, if most BIN splits detected in this study reflect cryptic taxa, the true species count for Canadian spiders could be 30-50% higher than currently recognized. © 2015 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
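
    The "barcode gap" reported in this abstract compares divergence within a species with the distance to the nearest other species. The Python sketch below shows one way such a comparison could be computed from aligned sequences. It is a simplification under stated assumptions: it uses the uncorrected p-distance and the maximum (not mean) intraspecific distance, the function names and toy data are invented, and it does not reproduce the BIN algorithm or the study's actual distance model.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns {species: (max_intraspecific, nearest_neighbour)}, where
    nearest_neighbour is the smallest distance to any other species
    (None if there is no other species in the data)."""
    result = {}
    for sp in {s for s, _ in records}:
        own = [seq for s, seq in records if s == sp]
        other = [seq for s, seq in records if s != sp]
        intra = max((p_distance(a, b) for a, b in combinations(own, 2)),
                    default=0.0)
        nearest = min((p_distance(a, b) for a in own for b in other),
                      default=None)
        result[sp] = (intra, nearest)
    return result


if __name__ == "__main__":
    # Toy aligned barcodes, invented for this example.
    records = [
        ("A", "AAAAAAAAAA"), ("A", "AAAAAAAAAT"),
        ("B", "CCCCCAAAAA"), ("B", "CCCCCAAAAT"),
    ]
    for species, (intra, nearest) in sorted(barcode_gap(records).items()):
        print(species, intra, nearest)
```

    A species shows a gap in this sense when its nearest-neighbour distance exceeds its largest intraspecific distance.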

  9. Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data.

    PubMed

    Olova, Nelly; Krueger, Felix; Andrews, Simon; Oxley, David; Berrens, Rebecca V; Branco, Miguel R; Reik, Wolf

    2018-03-15

    Whole-genome bisulfite sequencing (WGBS) is becoming an increasingly accessible technique, used widely for both fundamental and disease-oriented research. Library preparation methods benefit from a variety of available kits, polymerases and bisulfite conversion protocols. Although some steps in the procedure, such as PCR amplification, are known to introduce biases, a systematic evaluation of biases in WGBS strategies is missing. We perform a comparative analysis of several commonly used pre- and post-bisulfite WGBS library preparation protocols for their performance and quality of sequencing outputs. Our results show that bisulfite conversion per se is the main trigger of pronounced sequencing biases, and PCR amplification builds on these underlying artefacts. The majority of standard library preparation methods yield a significantly biased sequence output and overestimate global methylation. Importantly, both absolute and relative methylation levels at specific genomic regions vary substantially between methods, with clear implications for DNA methylation studies. We show that amplification-free library preparation is the least biased approach for WGBS. In protocols with amplification, the choice of bisulfite conversion protocol or polymerase can significantly minimize artefacts. To aid with the quality assessment of existing WGBS datasets, we have integrated a bias diagnostic tool in the Bismark package and offer several approaches for consideration during the preparation and analysis of WGBS datasets.

  10. CD-ROM Stress.

    ERIC Educational Resources Information Center

    Bunge, Charles A.

    1991-01-01

    Discussion of stress in library reference departments focuses on stress caused by CD-ROM reference tools. Topics discussed include work overload; nonreference duties; patron attitudes and behavior; staff attitudes; the need for proper staff training; and the need for library administrators to be sensitive to reference staff needs. (LRW)

  11. Changing Roles for Reference Librarians.

    ERIC Educational Resources Information Center

    Kelly, Julia; Robbins, Kathryn

    1996-01-01

    Discusses the future outlook for reference librarians, with topics including: "Technology as the Source of Change"; "Impact of the Internet"; "Defining the Virtual Library"; "Rethinking Reference"; "Out of the Library and into the Streets"; "Asking Users About Their Needs"; "Standardization and Artificial Intelligence"; "The Financial Future"; and…

  12. Online Reference Service--How to Begin: A Selected Bibliography.

    ERIC Educational Resources Information Center

    Shroder, Emelie J., Ed.

    1982-01-01

    Materials in this bibliography were selected and recommended by members of the Use of Machine-Assisted Reference in Public Libraries Committee, Reference and Adult Services Division, American Library Association. Topics include: financial aspects, equipment and communications considerations, comparing databases and database systems, advertising…

  13. Sequence evaluation of four specific cDNA libraries for developmental genomics of sunflower.

    PubMed

    Tamborindeguy, C; Ben, C; Liboz, T; Gentzbittel, L

    2004-04-01

    Four different cDNA libraries were constructed from sunflower protoplasts growing under embryogenic and non-embryogenic conditions: one standard library from each condition and two subtractive libraries in opposite sense. A total of 22,876 cDNA clones were obtained and 4800 ESTs were sequenced, giving rise to 2479 high quality ESTs representing an unigene set of 1502 sequences. This set was compared with ESTs represented in public databases using the programs BLASTN and BLASTX, and its members were classified according to putative function using the catalog in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Some 33% of sequences failed to align with existing plant ESTs and therefore represent putative novel genes. The libraries show a low level of redundancy and, on average, 50% of the present ESTs have not been previously reported for sunflower. Several potentially interesting genes were identified, based on their homology with genes involved in animal zygotic division or plant embryogenesis. We also identified two ESTs that show significantly different levels of expression under embryogenic and non-embryogenic conditions. The libraries described here represent an original and valuable resource for the discovery of yet unknown genes putatively involved in dicot embryogenesis and improving our knowledge of the mechanisms involved in polarity acquisition by plant embryos.

  14. Reference Collection Development Policy. Penfield Library, State University of New York College at Oswego.

    ERIC Educational Resources Information Center

    Judd, Blanche; And Others

    Intended for use by the head of reference and subject-area bibliographers at Penfield Library, this document provides guidelines for the selection and evaluation of titles for the 25,000 volume reference collection at this primarily undergraduate institution. The reference collection also supports the general information needs of both the college…

  15. The University of Texas at Arlington's Virtual Reference Service: An Evaluation by the Reference Staff

    ERIC Educational Resources Information Center

    Casebier, Katherine D.

    2006-01-01

    The University of Texas at Arlington's Library began using an online chat reference in 2002. The service, called Collaborative Digital Reference Service, later became "Ask a Librarian." Slightly over one year later, the library joined the University of Texas System's "Ask a Librarian" service. Both services are powered by…

  16. We Jumped on the Live Reference Band Wagon, and We Love the Ride!

    ERIC Educational Resources Information Center

    Schaake, Glenda; Sathan, Eleanor

    2003-01-01

    Memorial Hall Library in Andover, Massachusetts wanted to offer live reference service online but with limited resources, they couldn't do it alone. The 24/7 Reference cooperative program administered by the California State Library required Memorial Hall's librarians to monitor only 10 hours a week in return for live reference coverage. Memorial…

  17. A streamlined method for analysing genome-wide DNA methylation patterns from low amounts of FFPE DNA.

    PubMed

    Ludgate, Jackie L; Wright, James; Stockwell, Peter A; Morison, Ian M; Eccles, Michael R; Chatterjee, Aniruddha

    2017-08-31

    Formalin fixed paraffin embedded (FFPE) tumor samples are a major source of DNA from patients in cancer research. However, FFPE is a challenging material to work with due to macromolecular fragmentation and nucleic acid crosslinking. FFPE tissue poses particular challenges for methylation analysis and for preparing sequencing-based libraries relying on bisulfite conversion. Successful bisulfite conversion is a key requirement for sequencing-based methylation analysis. Here we describe a complete and streamlined workflow for preparing next generation sequencing libraries for methylation analysis from FFPE tissues. This includes counting cells from FFPE blocks and extracting DNA from FFPE slides, testing bisulfite conversion efficiency with a polymerase chain reaction (PCR) based test, preparing reduced representation bisulfite sequencing libraries and massively parallel sequencing. The main features and advantages of this protocol are: (i) an optimized method for extracting good quality DNA from FFPE tissues; (ii) an efficient bisulfite conversion and next generation sequencing library preparation protocol that uses 50 ng DNA from FFPE tissue; and (iii) incorporation of a PCR-based test to assess bisulfite conversion efficiency prior to sequencing. We provide a complete workflow and an integrated protocol for performing DNA methylation analysis at the genome-scale and we believe this will facilitate clinical epigenetic research that involves the use of FFPE tissue.

  18. Anchoring a Defined Sequence to the 5' Ends of mRNAs: The Bolt to Clone Rare Full Length mRNAs and Generate cDNA Libraries from a Few Cells.

    PubMed

    Baptiste, J; Milne Edwards, D; Delort, J; Mallet, J

    1993-01-01

    Among numerous applications, the polymerase chain reaction (PCR) (1,2) provides a convenient means to clone 5' ends of rare mRNAs and to generate cDNA libraries from tissue available in amounts too low to be processed by conventional methods. Basically, the amplification of cDNAs by the PCR requires the availability of the sequences of two stretches of the molecule to be amplified. A sequence can easily be imposed at the 5' end of the first-strand cDNAs (corresponding to the 3' end of the mRNAs) by priming the reverse transcription with a specific primer (for cloning the 5' end of rare messenger) or with an oligonucleotide tailored with a poly (dT) stretch (for cDNA library construction), taking advantage of the poly (A) sequence that is located at the 3' end of mRNAs. Several strategies have been devised to tag the 3' end of the ss-cDNAs (corresponding to the 5' end of the mRNAs). We (3) and others have described strategies based on the addition of a homopolymeric dG (4,5) or dA (6,7) tail using terminal deoxyribonucleotide transferase (TdT) ("anchor-PCR" [4]). However, this strategy has important limitations. The TdT reaction is difficult to control and has a low efficiency (unpublished observations). But most importantly, the return primers containing a homopolymeric (dC or dT) tail generate nonspecific amplifications, a phenomenon that prevents the isolation of low abundance mRNA species and/or interferes with the relative abundance of primary clones in the library. To circumvent these drawbacks, we have used two approaches. First, we devised a strategy based on a cRNA enrichment procedure, which has been useful to eliminate nonspecific-PCR products and to allow detection and cloning of cDNAs of low abundance (3). More recently, to avoid the nonspecific amplification resulting from the annealing of the homopolymeric tail oligonucleotide, we have developed a novel anchoring strategy that is based on the ligation of an oligonucleotide to the 3' end of ss-cDNAs. This strategy is referred to as SLIC for single-strand ligation to ss-cDNA (8).

  19. Isolation of mini- and microsatellite loci from chromosome 19 library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prosnyak, M.I.; Belajeva, O.V.; Polukarova, L.G.

    Mini- and microsatellite sequences are abundant in the human genome and are very useful as genetic markers. We report the isolation of a panel of clones containing marker sequences from chromosome 19. We screened 10,000 clones from the chromosome 19 cosmid library for the presence of di-(CA)n, tri-(TCC)n, (CAC)n microsatellites and M13-like minisatellite sequences. For this we have used synthetic oligonucleotides and polynucleotides, including micro- (CA, TCC, CAC) and minisatellite (M13 core) sequences. Preliminary results indicated that the chromosome 19 cosmid library contained both human and hamster clones. In order to identify human sequences from this library we have developed the technique of colony and blot hybridization with Alu-PCR, L1-PCR and B1-PCR probes. Dozens of clones have been selected, some of which were analyzed by conventional Southern blot analysis and non-radioactive in situ hybridization of chromosomes. Highly informative markers derived from these clones will be used for physical and genetic mapping of chromosome 19.

  20. Massively parallel sequencing of forensic STRs and SNPs using the Illumina® ForenSeq™ DNA Signature Prep Kit on the MiSeq FGx™ Forensic Genomics System.

    PubMed

    Guo, Fei; Yu, Jiao; Zhang, Lu; Li, Jun

    2017-11-01

    The ForenSeq™ DNA Signature Prep Kit (ForenSeq Kit) is designed to detect more than 200 forensically relevant markers in a single reaction on the MiSeq FGx™ Forensic Genomics System (MiSeq FGx System), including Amelogenin, 27 autosomal short tandem repeats (A-STRs), 7 X chromosomal STRs (X-STRs), 24 Y chromosomal STRs (Y-STRs) and 94 identity-informative single nucleotide polymorphisms (iSNPs) with the option to contain 22 phenotypic-informative SNPs (pSNPs) and 56 ancestry-informative SNPs (aSNPs). In this study, we evaluated the MiSeq FGx System on three major parts: methodological optimization (DNA extraction, sample quantification, library normalization, diluted libraries concentration, and sample-to-cell arrangement), massively parallel sequencing (MPS) performance (depth of coverage, sequence coverage ratio, and allele coverage ratio), and ForenSeq Kit characteristics (repeatability and concordance, sensitivity, mixture, stability and case-type samples). Results showed that quantitative polymerase chain reaction (qPCR)-based sample quantification and library normalization and the appropriate number of pooled libraries and concentration of diluted libraries provided a greater level of MPS performance and repeatability. Repeatable and concordant genotypes were obtained by the ForenSeq Kit. Full profiles were obtained from ≥100pg input DNA for STRs and ≥200pg for SNPs. A sample with ≥5% minor contributors was considered as a mixture by imbalanced allele coverage ratio distribution, and full profiles from minor contributors were easily detected between 9:1 and 1:9 mixtures with known reference profiles. The ForenSeq Kit tolerated considerable concentrations of inhibitors like ≤200μM hematin and ≤50μg/ml humic acid, and >56% STR profiles and >88% SNP profiles were obtained from ≥200-bp degraded samples. Also, it was adapted to case-type samples. As a whole, the ForenSeq Kit is a well-performing, robust, reliable, reproducible and highly informative assay, and it can fully meet requirements for human identification. Further, the sensitive QC indicator and automated sample comparison function in the ForenSeq™ Universal Analysis Software are quite helpful, so that we can concentrate on questionable genotypes and avoid tedious and time-consuming labor to maximize the time spent in data analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. The Allama Iqbal Open University Library.

    ERIC Educational Resources Information Center

    Ul-Hassan, Mahmud

    1984-01-01

    This profile of an open university library system in Pakistan covers its primary objective; functional organization; library collection and collection development; staffing, budgets and expenditures; library services (users, technical, reference); departmental libraries (on-campus branch, regional); and the library building. A list of 16 library…

  2. University Library Online Reference Service Program Plan, 1986/87.

    ERIC Educational Resources Information Center

    Koga, James S.

    This program plan for online reference service--the individualized assistance provided to a library patron using an online system--at California State Polytechnic University, Pomona, covers the areas of funding, eligibility for online services, search request eligibility, database eligibility, management of online services, reference faculty…

  3. Outstanding Reference Sources: The 1994 Selection of Recent Titles.

    ERIC Educational Resources Information Center

    Luchsinger, Dale F.

    1994-01-01

    Presents an annotated bibliography of 37 outstanding reference sources published in 1993 for small to medium-sized public and academic libraries as selected by the American Library Association's Reference Services Committee. Categories include culture and civilization, biography, general, language and literature, nature, law, religion, technology,…

  4. CD-ROM + Fax = Shared Reference Resources.

    ERIC Educational Resources Information Center

    Fitzwater, Diana; Fradkin, Bernard

    1988-01-01

    Describes the Reference by GammaFax Project, which joined nine area libraries to provide cooperative reference access to optical disk databases. The configuration of disk players, microcomputers, and facsimile equipment used by the libraries is explained, and the improvements in cost effectiveness, provision of service, and librarian expertise are…

  5. Going Prime Time with Live Chat Reference.

    ERIC Educational Resources Information Center

    Hoag, Tara J.; Cichanowicz, Edana McCaffrey

    2001-01-01

    Describes the development of the Suffolk Cooperative Library System's live, online chat reference service, a pilot project for public libraries in Suffolk County (New York). Topics include chat software selection; a virtual reference collection; marketing; funding; staffing; evaluation; expanded hours of service; email; and extracting data from…

  6. HIV/AIDS reference questions in an AIDS service organization special library.

    PubMed

    Deevey, Sharon; Behring, Michael

    2005-01-01

    Librarians in many venues may anticipate a wide range of reference questions related to HIV and AIDS. Information on HIV/ AIDS is now available in medical, academic, and public libraries and on the Internet, and ranges from the most complex science to the most private disclosures about personal behavior. In this article, the 913 reference questions asked between May 2002 and August 2004 in a special library in a mid-western community-based AIDS service organization are described and analyzed.

  7. The Region 4 collaborative virtual reference project.

    PubMed

    Parker, Sandi K; Johnson, E Diane

    2003-01-01

    In May 2002, the Denison Memorial Library at the University of Colorado Health Sciences Center and the J. Otto Lottes Health Sciences Library at the University of Missouri-Columbia, with funding from the National Network of Libraries of Medicine-Midcontinental Region, embarked on a collaborative, real-time reference project using the 24/7 Reference, Inc., software package. This paper describes how the project was conceived, and includes details on the service hours, staffing, training, marketing, lessons learned, and future plans for the service.

  8. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Torella, JP; Lienert, F; Boehm, CR

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  9. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  10. Next-Generation DNA Sequencing of VH/VL Repertoires: A Primer and Guide to Applications in Single-Domain Antibody Discovery.

    PubMed

    Henry, Kevin A

    2018-01-01

    Immunogenetic analyses of expressed antibody repertoires are becoming increasingly common experimental investigations and are critical to furthering our understanding of autoimmunity, infectious disease, and cancer. Next-generation DNA sequencing (NGS) technologies have now made it possible to interrogate antibody repertoires to unprecedented depths, typically by sequencing of cDNAs encoding immunoglobulin variable domains. In this chapter, we describe simple, fast, and reliable methods for producing and sequencing multiplex PCR amplicons derived from the variable regions (VH, VHH or VL) of rearranged immunoglobulin heavy and light chain genes using the Illumina MiSeq platform. We include complete protocols and primer sets for amplicon sequencing of VH/VHH/VL repertoires directly from human, mouse, and llama lymphocytes as well as from phage-displayed VH/VHH/VL libraries; these can easily be adapted to other types of amplicons with little modification. The resulting amplicons are diverse and representative, even using as few as 10^3 input B cells, and their generation is relatively inexpensive, requiring no special equipment and only a limited set of primers. In the absence of heavy-light chain pairing, single-domain antibodies are uniquely amenable to NGS analyses. We present a number of applications of NGS technology useful in discovery of single-domain antibodies from phage display libraries, including: (i) assessment of library functionality; (ii) confirmation of desired library randomization; (iii) estimation of library diversity; and (iv) monitoring the progress of panning experiments. While the case studies presented here are of phage-displayed single-domain antibody libraries, the principles extend to other types of in vitro display libraries.

  11. Viral morphogenesis is the dominant source of sequence censorship in M13 combinatorial peptide phage display.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rodi, D. J.; Soares, A. S.; Makowski, L.

    Novel statistical methods have been developed and used to quantitate and annotate the sequence diversity within combinatorial peptide libraries on the basis of small numbers (1-200) of sequences selected at random from commercially available M13 p3-based phage display libraries. These libraries behave statistically as though they correspond to populations containing roughly 4.0±1.6% of the random dodecapeptides and 7.9±2.6% of the random constrained heptapeptides that are theoretically possible within the phage populations. Analysis of amino acid residue occurrence patterns shows no demonstrable influence on sequence censorship by Escherichia coli tRNA isoacceptor profiles or either overall codon or Class II codon usage patterns, suggesting no metabolic constraints on recombinant p3 synthesis. There is an overall depression in the occurrence of cysteine, arginine and glycine residues and an overabundance of proline, threonine and histidine residues. The majority of position-dependent amino acid sequence bias is clustered at three positions within the inserted peptides of the dodecapeptide library, +1, +3 and +12 downstream from the signal peptidase cleavage site. Conformational tendency measures of the peptides indicate a significant preference for inserts favoring a β-turn conformation. The observed protein sequence limitations can primarily be attributed to genetic codon degeneracy and signal peptidase cleavage preferences. These data suggest that for applications in which maximal sequence diversity is essential, such as epitope mapping or novel receptor identification, combinatorial peptide libraries should be constructed using codon-corrected trinucleotide cassettes within vector-host systems designed to minimize morphogenesis-related censorship.

  12. Tandem mass spectrometry spectral libraries and library searching.

    PubMed

    Deutsch, Eric W

    2011-01-01

    Spectral library searching in the field of proteomics has been gaining visibility and use in the last few years, primarily due to the expansion of public proteomics data repositories and the large spectral libraries that can be generated from them. Spectral library searching has several advantages over conventional sequence searching: it is generally much faster, and has higher specificity and sensitivity. The speed increase is primarily due to having a smaller, fully indexable search space of real spectra that are known to be observable. The increase in specificity and sensitivity is primarily due to the ability of a search engine to utilize the known intensities of the fragment ions, rather than just comparing with theoretical spectra as is done with sequence searching. The main disadvantage of spectral library searching is that one can only identify peptide ions that have been seen before and are stored in the spectral library. In this chapter, an overview of spectral library searching and of the libraries currently available is presented.
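
    The idea of scoring a query spectrum against stored spectra by using the observed fragment-ion intensities can be illustrated with a minimal sketch. It assumes spectra are given as lists of (m/z, intensity) pairs, bins peaks at a fixed width, and ranks library entries by cosine similarity. The function names, the bin width and the peptide labels are hypothetical, and the scoring used by any actual search engine may differ.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin index -> intensity)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(a, b):
    """Cosine of the angle between two binned spectra; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_library(query_peaks, library, bin_width=1.0):
    """Return (score, name) of the best-matching entry in a non-empty library."""
    query = bin_spectrum(query_peaks, bin_width)
    scored = [
        (cosine_similarity(query, bin_spectrum(peaks, bin_width)), name)
        for name, peaks in library.items()
    ]
    return max(scored)


# Hypothetical two-entry library and a noisy re-observation of the first entry.
library = {
    "PEPTIDE_A": [(100.1, 10.0), (200.2, 50.0), (300.3, 5.0)],
    "PEPTIDE_B": [(150.0, 20.0), (250.0, 40.0)],
}
query = [(100.1, 9.0), (200.2, 55.0), (300.3, 4.0)]
best_score, best_name = search_library(query, library)
print(best_name, round(best_score, 3))
```

    Because only previously observed spectra are stored, a query with no counterpart in the library simply receives low scores against every entry, which mirrors the main disadvantage noted in the abstract.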

  13. Doing Virtual Reference along with Everything Else

    ERIC Educational Resources Information Center

    Garlish, Betsy Harper

    2009-01-01

    Virtual reference is a service that makes particular sense in a state like Montana. The fourth-largest state by geographic area but 44th by population in the U.S., it has about one library per 11,392 residents. Montana's libraries, including the libraries of Montana Tech of the University of Montana, where the author currently serves as a…

  14. The Teaching of Reference in American Library Schools. Historical Paper 4

    ERIC Educational Resources Information Center

    Cheney, Frances Neel

    2015-01-01

    In this article, the author proposes to present a superficial view of what is being taught in the 32 American Library Association accredited graduate library schools. She briefly discusses and answers the following three questions: (1) Who is teaching reference?; (2) What is being taught?; and (3) How is it being taught? [For the commentary on…

  15. Where Does that Electronic Resource Fit on the Library Web Page?

    ERIC Educational Resources Information Center

    Digby, Todd R.

    2004-01-01

    The author of this article is an automation librarian, but at times he also works at the reference desk, as well as teaching library instruction and literacy classes. Working at the reference desk, he learns how users handle their library's information technology. This article explores the conclusions that the author has reached regarding the…

  16. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    PubMed Central

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  17. Epitope selection from an uncensored peptide library displayed on avian leukosis virus.

    PubMed

    Khare, Pranay D; Rosales, Ana G; Bailey, Kent R; Russell, Stephen J; Federspiel, Mark J

    2003-10-25

    Phage display libraries have provided an extraordinarily versatile technology to facilitate the isolation of peptides, growth factors, single chain antibodies, and enzymes with desired binding specificities or enzymatic activities. The overall diversity of peptides in phage display libraries can be significantly limited by Escherichia coli protein folding and processing machinery, which results in sequence censorship. To achieve an optimal diversity of displayed eukaryotic peptides, the library should be produced in the endoplasmic reticulum of eukaryotic cells using a eukaryotic display platform. In the accompanying article, we presented experiments that demonstrate that polypeptides of various sizes could be efficiently displayed on the envelope glycoproteins of a eukaryotic virus, avian leukosis virus (ALV), and the displayed polypeptides could efficiently attach to cognate receptors without interfering with viral attachment and entry into susceptible cells. In this study, methods were developed to construct a model library of randomized eight amino acid peptides using the ALV eukaryotic display platform and screen the library for specific epitopes using immobilized antibodies. A virus library with approximately 2 × 10^6 different members was generated from a plasmid library of approximately 5 × 10^6 diversity. The sequences of the randomized 24 nucleotide/eight amino acid regions of representatives of the plasmid and virus libraries were analyzed. No significant sequence censorship was observed in producing the virus display library from the plasmid library. Different populations of peptide epitopes were selected from the virus library when different monoclonal antibodies were used as the target. The results of these two studies clearly demonstrate the potential of ALV as a eukaryotic platform for the display and selection of eukaryotic polypeptide libraries.

  18. Bibliography for the Hospitality Industry.

    ERIC Educational Resources Information Center

    Nelson, Elizabeth A.

    This annotated bibliography is a sample collection of reference materials in the hospitality industry suitable for a small academic library. It is assumed that the library has a general reference collection. Publication dates range from 1992-96, with two publication dates in the 1980s. No periodicals are included. The 41 reference materials are…

  19. ALA Library Schools and Subject Reference Coursework: A Short Communication

    ERIC Educational Resources Information Center

    Condic, Kristine

    2016-01-01

    Reference librarians are exposed to the literature of different disciplines in a number of ways including advanced degrees, on the job training, and intellectual inquisitiveness. As students, many reference librarians were also exposed to library science programs offering coursework specializing in information sources and research within other…

  20. Virtual Reference, Real Money: Modeling Costs in Virtual Reference Services

    ERIC Educational Resources Information Center

    Eakin, Lori; Pomerantz, Jeffrey

    2009-01-01

    Libraries nationwide are in yet another phase of belt tightening. Without an understanding of the economic factors that influence library operations, however, controlling costs and performing cost-benefit analyses on services is difficult. This paper describes a project to develop a cost model for collaborative virtual reference services. This…

  1. Librarians without Borders? Virtual Reference Service to Unaffiliated Users

    ERIC Educational Resources Information Center

    Kibbee, Jo

    2006-01-01

    The author investigates issues faced by academic research libraries in providing virtual reference services to unaffiliated users. These libraries generally welcome visitors who use on-site collections and reference services, but are these altruistic policies feasible in a virtual environment? This paper reviews the use of virtual reference…

  2. Distance Education and Virtual Reference: Where Are We Headed?

    ERIC Educational Resources Information Center

    Coffman, Steve

    2001-01-01

    Discusses changes in distance education and considers the resulting need for new types of library services. Topics include new Web-based contact center software; how to conduct virtual reference interviews; online reference service; the role of the physical library; staffing changes; and future possibilities, including impacts on costs of library…

  3. Jewish Studies: A Guide to Reference Sources.

    ERIC Educational Resources Information Center

    McGill Univ., Montreal (Quebec). McLennan Library.

    An annotated bibliography to the reference sources for Jewish Studies in the McLennan Library of McGill University (Canada) is presented. Any titles in Hebrew characters are listed by their transliterated equivalents. There is also a list of relevant Library of Congress Subject Headings. General reference sources listed are: encyclopedias,…

  4. They CAN and They SHOULD: Undergraduates Providing Peer Reference and Instruction

    ERIC Educational Resources Information Center

    Bodemer, Brett B.

    2014-01-01

    Peer learning dynamics have proven powerful in collegiate contexts. These dynamics should be leveraged at the undergraduate level in academic libraries for reference provision and basic information literacy instruction. Drawing on the literature of peer learning, documented examples of peer reference and instruction in academic libraries, and…

  5. Building Bridges for Collaborative Digital Reference between Libraries and Museums through an Examination of Reference in Special Collections

    ERIC Educational Resources Information Center

    Lavender, Kenneth; Nicholson, Scott; Pomerantz, Jeffrey

    2005-01-01

    While a growing number of the digital reference services in libraries have become part of collaborative reference networks, other entities that serve similar information-seeking needs such as special collections and museums have not joined these networks, even though they are answering an increasing number of questions from off-site patrons via…

  6. Molecular profiling of fungal communities in moisture damaged buildings before and after remediation--a comparison of culture-dependent and culture-independent methods.

    PubMed

    Pitkäranta, Miia; Meklin, Teija; Hyvärinen, Anne; Nevalainen, Aino; Paulin, Lars; Auvinen, Petri; Lignell, Ulla; Rintala, Helena

    2011-10-21

    Indoor microbial contamination due to excess moisture is an important contributor to human illness in both residential and occupational settings. However, the census of microorganisms in the indoor environment is limited by the use of selective, culture-based detection techniques. By using clone library sequencing of full-length internal transcribed spacer region combined with quantitative polymerase chain reaction (qPCR) for 69 fungal species or assay groups and cultivation, we have been able to generate a more comprehensive description of the total indoor mycoflora. Using this suite of methods, we assessed the impact of moisture damage on the fungal community composition of settled dust and building material samples (n = 8 and 16, correspondingly). Water-damaged buildings (n = 2) were examined pre- and post-remediation, and compared with undamaged reference buildings (n = 2). Culture-dependent and independent methods were consistent in the dominant fungal taxa in dust, but sequencing revealed a five to ten times higher diversity at the genus level than culture or qPCR. Previously unknown, verified fungal phylotypes were detected in dust, accounting for 12% of all diversity. Fungal diversity, especially within classes Dothideomycetes and Agaricomycetes, tended to be higher in the water-damaged buildings. Fungal phylotypes detected in building materials were present in dust samples, but their proportion of total fungi was similar for damaged and reference buildings. The quantitative correlation between clone library phylotype frequencies and qPCR counts was moderate (r = 0.59, p < 0.01). We examined a small number of target buildings and found indications of elevated fungal diversity associated with water damage. Some of the fungi in dust were attributable to building growth, but more information on the material-associated communities is needed in order to understand the dynamics of microbial communities between building structures and dust. The sequencing-based method proved indispensable for describing the true fungal diversity in indoor environments. However, making conclusions concerning the effect of building conditions on building mycobiota using this methodology was complicated by the wide natural diversity in the dust samples, the incomplete knowledge of material-associated fungi and the semiquantitative nature of sequencing-based methods.

  7. A Protocol for Functional Assessment of Whole-Protein Saturation Mutagenesis Libraries Utilizing High-Throughput Sequencing.

    PubMed

    Stiffler, Michael A; Subramanian, Subu K; Salinas, Victor H; Ranganathan, Rama

    2016-07-03

    Site-directed mutagenesis has long been used as a method to interrogate protein structure, function and evolution. Recent advances in massively-parallel sequencing technology have opened up the possibility of assessing the functional or fitness effects of large numbers of mutations simultaneously. Here, we present a protocol for experimentally determining the effects of all possible single amino acid mutations in a protein of interest utilizing high-throughput sequencing technology, using the 263 amino acid antibiotic resistance enzyme TEM-1 β-lactamase as an example. In this approach, a whole-protein saturation mutagenesis library is constructed by site-directed mutagenic PCR, randomizing each position individually to all possible amino acids. The library is then transformed into bacteria, and selected for the ability to confer resistance to β-lactam antibiotics. The fitness effect of each mutation is then determined by deep sequencing of the library before and after selection. Importantly, this protocol introduces methods which maximize sequencing read depth and permit the simultaneous selection of the entire mutation library, by mixing adjacent positions into groups of length accommodated by high-throughput sequencing read length and utilizing orthogonal primers to barcode each group. Representative results using this protocol are provided by assessing the fitness effects of all single amino acid mutations in TEM-1 at a clinically relevant dosage of ampicillin. The method should be easily extendable to other proteins for which a high-throughput selection assay is in place.
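
    The step of determining a fitness effect from deep sequencing before and after selection can be sketched as follows. This is a minimal illustration, assuming fitness is scored as the log2 change in a variant's read count across selection, normalized to the same quantity for wild type. The abstract does not give the exact formula, so the scoring rule, the pseudocount and the variant names below are assumptions.

```python
import math


def fitness_effects(pre_counts, post_counts, wild_type, pseudocount=0.5):
    """Log2 enrichment of each variant across selection, relative to wild type.

    pre_counts / post_counts map variant name -> read count before / after
    selection. The pseudocount (an assumption) avoids log(0) for variants
    that disappear after selection.
    """
    def log_ratio(variant):
        pre = pre_counts.get(variant, 0) + pseudocount
        post = post_counts.get(variant, 0) + pseudocount
        return math.log2(post / pre)

    wt_ratio = log_ratio(wild_type)
    return {
        variant: log_ratio(variant) - wt_ratio
        for variant in pre_counts
        if variant != wild_type
    }


# Hypothetical read counts: one neutral variant and one strongly depleted variant.
pre = {"WT": 1000, "variant_a": 1000, "variant_b": 1000}
post = {"WT": 1000, "variant_a": 1000, "variant_b": 10}
for name, score in fitness_effects(pre, post, wild_type="WT").items():
    print(name, round(score, 2))
```

    A score near zero indicates a variant that behaves like wild type under selection, while a strongly negative score indicates depletion. Read depth limits how negative a score can be resolved, which is one reason the protocol emphasizes maximizing sequencing depth.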

  8. A Blumeria graminis f.sp. hordei BAC library--contig building and microsynteny studies.

    PubMed

    Pedersen, Carsten; Wu, Boqian; Giese, Henriette

    2002-11-01

    A bacterial artificial chromosome (BAC) library of Blumeria graminis f.sp. hordei, containing 12,000 clones with an average insert size of 41 kb, was constructed. The library represents about three genome equivalents and BAC-end sequencing showed a high content of repetitive sequences, making contig-building difficult. To identify overlapping clones, several strategies were used: colony hybridisation, PCR screening, fingerprinting techniques and the use of single-copy expressed sequence tags. The latter proved to be the most efficient method for identification of overlapping clones. Two contigs, at or close to avirulence loci, were constructed. Single nucleotide polymorphism (SNP) markers were developed from BAC-end sequences to link the contigs to the genetic maps. Two other BAC contigs were used to study microsynteny between B. graminis and two other ascomycetes, Neurospora crassa and Aspergillus fumigatus. The library provides an invaluable tool for the isolation of avirulence genes from B. graminis and for the study of gene synteny between this fungus and other fungi.

  9. De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries.

    PubMed

    Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

    2015-09-21

    Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high-resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable for assessing functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validate JigsawSeq on small sub-pools and observe high precision and recall at various experimental settings. With extensive simulations, we then apply JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.

  10. A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

    PubMed

    Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

    2016-11-21

    Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop widely grown in the world. However, little is known about the genome of this species because it is a highly heterozygous hexaploid. Gaining a more in-depth knowledge of the sweetpotato genome is therefore necessary and imperative. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had a 7.93-10.82× coverage of the genome, and the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with an accumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome was known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% non-LTR retrotransposons and 1.42% Class II DNA transposons; 18.31% of the genome was identified as sweetpotato-unique repetitive DNA and 10.00% was predicted to be coding regions. In total, 3,846 simple sequence repeats (SSRs) were identified, with a density of one SSR per 1.93 kb, from which 288 SSR primers were designed and tested for length polymorphism using 20 sweetpotato accessions; 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed. The high-quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum. These resources will serve as a robust platform for high-resolution mapping, gene cloning, assembly of genome sequences, comparative genomics and evolution studies in sweetpotato.
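
    The relationship between clone number, insert size, genome coverage and the chance of recovering a given single-copy sequence can be checked with a short calculation. The sketch below assumes the standard Clarke-Carbon estimate, P = 1 - (1 - I/G)^N, which the abstract does not name explicitly. The genome size is not stated in the abstract, so it is back-calculated here from the lower reported coverage (7.93×); the function names are hypothetical.

```python
def genome_coverage(n_clones, insert_kb, genome_kb):
    """Library coverage in genome equivalents: total cloned DNA / genome size."""
    return n_clones * insert_kb / genome_kb


def recovery_probability(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon estimate of the probability that a given single-copy
    sequence is present in at least one clone: P = 1 - (1 - I/G)^N."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


# Figures from the abstract: 240,384 clones, 101 kb average insert.
N_CLONES = 240_384
INSERT_KB = 101

# Assumption: genome size implied by the lower coverage bound (7.93x).
genome_kb = N_CLONES * INSERT_KB / 7.93

print(round(genome_coverage(N_CLONES, INSERT_KB, genome_kb), 2))
print(recovery_probability(N_CLONES, INSERT_KB, genome_kb) > 0.99)
```

    Under these assumptions even the lower coverage bound gives a recovery probability above 99%, consistent with the figure reported in the abstract.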

  11. Genomic sequencing of Pleistocene cave bears

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  12. The Human EST Ontology Explorer: a tissue-oriented visualization system for ontologies distribution in human EST collections.

    PubMed

    Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara

    2009-10-15

    The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow investigation of library-specific gene expression levels and comparisons among libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the achieved annotation, library-specific distributions of ESTs in the GO Graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user-selected libraries can also be computed. The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview of healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.

  13. Sequence of Spider Aciniform and Piriform Silks

    DTIC Science & Technology

    2001-09-19

    Funding number: DAAD19-01-1-0569. ...aciniform glands from Argiope trifasciata were used to construct a cDNA library. The library was probed with various DNA probes based on known spider silk ...sequence in a number of other spider silks. The 5' end of the clone still appears to be repetitive sequence and thus it is unlikely to be a full-length

  14. Targeted Evolution of Embedded Librarian Services: Providing Mobile Reference and Instruction Services Using iPads

    PubMed Central

    Chiarella, Deborah

    2016-01-01

    The University at Buffalo Health Sciences Library provides reference and instructional services to support research, curricular, and clinical programs of the University at Buffalo. With funding from an NN/LM MAR Technology Improvement Award, the University at Buffalo Health Sciences Library (UBHSL) purchased iPads to develop embedded reference and educational services. Usage statistics were collected over a ten-month period to measure the frequency of iPad use for mobile services. While this experiment demonstrates that the iPad can be used to meet the library user's needs outside of the physical library space, this paper will also offer advice for others who are considering implementing their own program. PMID:26496394

  15. Targeted Evolution of Embedded Librarian Services: Providing Mobile Reference and Instruction Services Using iPads.

    PubMed

    Stellrecht, Elizabeth; Chiarella, Deborah

    2015-01-01

    The University at Buffalo Health Sciences Library provides reference and instructional services to support research, curricular, and clinical programs of the University at Buffalo. With funding from an NN/LM MAR Technology Improvement Award, the University at Buffalo Health Sciences Library (UBHSL) purchased iPads to develop embedded reference and educational services. Usage statistics were collected over a ten-month period to measure the frequency of iPad use for mobile services. While this experiment demonstrates that the iPad can be used to meet the library user's needs outside of the physical library space, this article will also offer advice for others who are considering implementing their own program.

  16. A basic analysis toolkit for biological sequences

    PubMed Central

    Giancarlo, Raffaele; Siragusa, Alessandro; Siragusa, Enrico; Utro, Filippo

    2007-01-01

    This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks: local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. It also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy-to-use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available under the GNU GPL. PMID:17877802
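
    As an illustration of one of the tasks named in this abstract, the sketch below computes the length of a longest common subsequence with the textbook dynamic-programming recurrence. It is not taken from BATS; the function name `lcs_length` is hypothetical and the code only shows the general technique.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b.

    Textbook dynamic programming, keeping only the previous row,
    so memory is O(len(b)) and time is O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence ending before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one character from either string, keep the better result.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("AGGTAB", "GXTXAYB"))
```

    Keeping a single previous row is enough when only the length is needed; recovering the subsequence itself would require the full table or a divide-and-conquer variant.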

  17. Scaling up the 454 Titanium Library Construction and Pooling of Barcoded Libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Phung, Wilson; Hack, Christopher; Shapiro, Harris

    2009-03-23

    We have been developing a high throughput 454 library construction process at the Joint Genome Institute to meet the needs of de novo sequencing a large number of microbial and eukaryote genomes, EST, and metagenome projects. We have been focusing efforts in three areas: (1) modifying the current process to allow the construction of 454 standard libraries on a 96-well format; (2) developing a robotic platform to perform the 454 library construction; and (3) designing molecular barcodes to allow pooling and sorting of many different samples. In the development of a high throughput process to scale up the number of libraries by adapting the process to a 96-well plate format, the key process change involves the replacement of gel electrophoresis for size selection with Solid Phase Reversible Immobilization (SPRI) beads. Although the standard deviation of the insert sizes increases, the overall quality sequence and distribution of the reads in the genome has not changed. The manual process of constructing 454 shotgun libraries on 96-well plates is a time-consuming, labor-intensive, and ergonomically hazardous process; we have been experimenting to program a BioMek robot to perform the library construction. This will not only enable library construction to be completed in a single day, but will also minimize any ergonomic risk. In addition, we have implemented a set of molecular barcodes (AKA Multiple Identifiers or MID) and a pooling process that allows us to sequence many targets simultaneously. Here we will present the testing of pooling a set of selected fosmids derived from the endomycorrhizal fungus Glomus intraradices. By combining the robotic library construction process and the use of molecular barcodes, it is now possible to sequence hundreds of fosmids that represent a minimal tiling path of this genome. Here we present the progress and the challenges of developing these scaled-up processes.
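
    The pooling-and-sorting idea in this abstract can be sketched as a simple demultiplexing step. The example assumes exact-match barcodes at the 5' end of each read and that no barcode is a prefix of another; the barcode sequences, sample names and the function `demultiplex` are hypothetical and are not taken from the JGI process.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Sort pooled reads into per-sample bins by an exact 5' barcode match.

    reads:    iterable of read sequences (strings)
    barcodes: dict mapping sample name -> barcode sequence (hypothetical tags)
    Reads with no matching barcode go to the "unassigned" bin.
    """
    bins = defaultdict(list)
    for read in reads:
        for sample, tag in barcodes.items():
            if read.startswith(tag):
                # Strip the barcode so only the insert sequence is kept.
                bins[sample].append(read[len(tag):])
                break
        else:
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    tags = {"sample_1": "AACCGG", "sample_2": "TTGGCC"}  # made-up barcodes
    pooled = ["AACCGGTTTT", "TTGGCCAAAA", "GGGGGGGG"]
    print(demultiplex(pooled, tags))
```

    Real barcode sets are usually designed so that a sequencing error cannot turn one tag into another; tolerating mismatches would need an edit-distance comparison in place of `startswith`.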

  18. Electronic Library: A TERI Experiment.

    ERIC Educational Resources Information Center

    Kar, Debal C.; Deb, Subrata; Kumar, Satish

    2003-01-01

    Discusses the development of Electronic Library at TERI (The Energy and Resources Institute, New Delhi). Highlights include: hardware and software used; the digital library/Virtual Electronic Library; directory of Internet journals; virtual reference resources; electronic collection/Physical Electronic Library; downloaded online full-length…

  19. The Planning, Implementation, and Movement of an Academic Library Collection.

    ERIC Educational Resources Information Center

    Kurkul, Donna Lee

    1983-01-01

    Discusses methodology, logistics, and time/cost study of planning, implementation, and relocation of 682,810 volume Smith College Library collection into its newly constructed and renovated facility. Call number sequence location, collection movement phasing and formulas for sequence distribution, and personnel requirements are noted. Elementary…

  20. Construction of naïve camelids VHH repertoire in phage display-based library.

    PubMed

    Sabir, Jamal S M; Atef, Ahmed; El-Domyati, Fotouh M; Edris, Sherif; Hajrah, Nahid; Alzohairy, Ahmed M; Bahieldin, Ahmed

    2014-04-01

    Camelids have unique antibodies, namely HCAbs (VHH) or commercially named Nanobodies® (Nb) that are composed only of a heavy-chain homodimer. As libraries based on immunized camelids are time-consuming, costly and likely redundant for certain antigens, we describe the construction of a naïve camelid VHHs library from blood serum of non-immunized camelids with affinity in the subnanomolar range and suitable for standard immune applications. This approach is rapid and recovers VHH repertoire with the advantages of being more diverse, non-specific and devoid of subpopulations of specific antibodies, which allows the identification of binders for any potential antigen (or pathogen). RNAs from a number of camelids from Saudi Arabia were isolated and cDNAs of the diverse vhh gene were amplified; the resulting amplicons were cloned in the phage display pSEX81 vector. The size of the library was found to be within the required range (10^7) suitable for subsequent applications in disease diagnosis and treatment. Two hundred clones were randomly selected and the inserted gene library was either estimated for redundancy or sequenced and aligned to the reference camelid vhh gene (acc. No. ADE99145). Results indicated complete non-specificity of this small library in which no single event of redundancy was detected. These results indicate the efficacy of following this approach in order to yield a large and diverse enough gene library to secure the presence of the required version encoding the required antibodies for any target antigen. This work is a first step towards the construction of phage display-based biosensors useful in disease (e.g., TB or tuberculosis) diagnosis and treatment. Copyright © 2014 Académie des sciences. Published by Elsevier SAS. All rights reserved.

  1. Designing oligo libraries taking alternative splicing into account

    NASA Astrophysics Data System (ADS)

    Shoshan, Avi; Grebinskiy, Vladimir; Magen, Avner; Scolnicov, Ariel; Fink, Eyal; Lehavi, David; Wasserman, Alon

    2001-06-01

    We have designed sequences for DNA microarrays and oligo libraries, taking alternative splicing into account. Alternative splicing is a common phenomenon, occurring in more than 25% of the human genes. In many cases, different splice variants have different functions, are expressed in different tissues or may indicate different stages of disease. When designing sequences for DNA microarrays or oligo libraries, it is very important to take into account the sequence information of all the mRNA transcripts. Therefore, when a gene has more than one transcript (as a result of alternative splicing, alternative promoter sites or alternative poly-adenylation sites), it is very important to take all of them into account in the design. We have used the LEADS transcriptome prediction system to cluster and assemble the human sequences in GenBank and design optimal oligonucleotides for all the human genes with a known mRNA sequence based on the LEADS predictions.

  2. Evolution of Reference: A New Service Model for Science and Engineering Libraries

    ERIC Educational Resources Information Center

    Bracke, Marianne Stowell; Chinnaswamy, Sainath; Kline, Elizabeth

    2008-01-01

    This article explores the different steps involved in adopting a new service model at the University of Arizona Science-Engineering Library. In a time of shrinking budgets and changing user behavior, the library was forced to rethink its reference services to be cost effective and provide quality service at the same time. The new model required…

  3. Patron Survey of User Satisfaction with Library Services: Relationship between Librarian Behaviors during the Reference Interview and User Satisfaction.

    ERIC Educational Resources Information Center

    Nichols, Mary Ellen

    This study examined whether user satisfaction with library services is affected by certain objective and subjective librarian behaviors exhibited during the reference interview. A patron survey was conducted during July 1993 in three branches of Cuyahoga County Public Library, located in northeastern Ohio. The sample was determined by the patrons…

  4. The Library as the Community Switchboard. Bay Area Reference Center Workshop, February 14 and 15, 1973.

    ERIC Educational Resources Information Center

    San Francisco Public Library, CA. Bay Area Reference Center.

    In a two-day workshop at the San Francisco Public Library, the staff of the Bay Area Reference Center (BARC) led discussions on how the library can serve as a community information and referral center. Various speakers reviewed the problems and possibilities of such a service and presented guidelines for the establishment, volunteer staffing, and…

  5. Constructing and detecting a cDNA library for mites.

    PubMed

    Hu, Li; Zhao, YaE; Cheng, Juan; Yang, YuanJun; Li, Chen; Lu, ZhaoHui

    2015-10-01

    RNA extraction and construction of complementary DNA (cDNA) library for mites have been quite challenging due to difficulties in acquiring tiny living mites and breaking their hard chitin. The present study explores a better method to construct cDNA library for mites that will lay the foundation for transcriptome and molecular pathogenesis research. We selected Psoroptes cuniculi as an experimental subject and took the following steps to construct and verify cDNA library. First, we combined liquid nitrogen grinding with TRIzol for total RNA extraction. Then, switching mechanism at 5' end of the RNA transcript (SMART) technique was used to construct full-length cDNA library. To evaluate the quality of cDNA library, the library titer and recombination rate were calculated. The reliability of cDNA library was detected by sequencing and analyzing positive clones and genes amplified by specific primers. The results showed that the RNA concentration was 836 ng/μl and the absorbance ratio at 260/280 nm was 1.82. The library titer was 5.31 × 10^5 plaque-forming unit (PFU)/ml and the recombination rate was 98.21%, indicating that the library was of good quality. In the 33 expressed sequence tags (ESTs) of P. cuniculi, two clones of 1656 and 1658 bp were almost identical with only three variable sites detected, which had an identity of 99.63% with that of Psoroptes ovis, indicating that the cDNA library was reliable. Further detection by specific primers demonstrated that the 553-bp Pso c II gene sequences of P. cuniculi had an identity of 98.56% with those of P. ovis, confirming that the cDNA library was not only reliable but also feasible.

  6. Sequence-based screening for self-sufficient P450 monooxygenase from a metagenome library.

    PubMed

    Kim, B S; Kim, S Y; Park, J; Park, W; Hwang, K Y; Yoon, Y J; Oh, W K; Kim, B Y; Ahn, J S

    2007-05-01

    Cytochrome P450 monooxygenases (CYPs) are useful catalysts for oxidation reactions. Self-sufficient CYPs harbour a reductive domain covalently connected to a P450 domain and are known for their robust catalytic activity with great potential as biocatalysts. In an effort to expand genetic sources of self-sufficient CYPs, we devised a sequence-based screening system to identify them in a soil metagenome. We constructed a soil metagenome library and performed sequence-based screening for self-sufficient CYP genes. A new CYP gene, syk181, was identified from the metagenome library. Phylogenetic analysis revealed that SYK181 formed a distinct phylogenic line with 46% amino-acid-sequence identity to CYP102A1 which has been extensively studied as a fatty acid hydroxylase. The heterologously expressed SYK181 showed significant hydroxylase activity towards naphthalene and phenanthrene as well as towards fatty acids. Sequence-based screening of metagenome libraries is expected to be a useful approach for searching self-sufficient CYP genes. The translated product of syk181 shows self-sufficient hydroxylase activity towards fatty acids and aromatic compounds. SYK181 is the first self-sufficient CYP obtained directly from a metagenome library. The genetic and biochemical information on SYK181 are expected to be helpful for engineering self-sufficient CYPs with broader catalytic activities towards various substrates, which would be useful for bioconversion of natural products and biodegradation of organic chemicals.

  7. Reference librarians' perceptions of the issues they face as academic health information professionals

    PubMed Central

    Scherrer, Carol S.

    2004-01-01

    Background: Leaders in the profession encourage academic health sciences librarians to assume new roles as part of the growth process for remaining vital professionals. Have librarians embraced these new roles? Objectives: This research sought to examine from the reference librarians' viewpoints how their roles have changed over the past ten years and what the challenges these changes present as viewed by both the librarians and library directors. Method: A series of eight focus groups was conducted with reference librarians from private and public academic health sciences libraries. Directors of these libraries were interviewed separately. Results: Reference librarians' activities have largely confirmed the role changes anticipated by their leaders. They are teaching more, engaging in outreach through liaison initiatives, and designing Web pages, in addition to providing traditional reference duties. Librarians offer insights into unanticipated issues encountered in each of these areas and offer some creative solutions. Directors discuss the issues from their unique perspective. Conclusion: Librarians have identified areas for focusing efforts in lifelong learning. Adult learning theory, specialized databases and resources needed by researchers, ever-evolving technology, and promotion and evaluation of the library are areas needing attention. Implications for library education and continuing professional development are presented. PMID:15098052

  8. Reference librarians' perceptions of the issues they face as academic health information professionals.

    PubMed

    Scherrer, Carol S

    2004-04-01

    Leaders in the profession encourage academic health sciences librarians to assume new roles as part of the growth process for remaining vital professionals. Have librarians embraced these new roles? This research sought to examine from the reference librarians' viewpoints how their roles have changed over the past ten years and what the challenges these changes present as viewed by both the librarians and library directors. A series of eight focus groups was conducted with reference librarians from private and public academic health sciences libraries. Directors of these libraries were interviewed separately. Reference librarians' activities have largely confirmed the role changes anticipated by their leaders. They are teaching more, engaging in outreach through liaison initiatives, and designing Web pages, in addition to providing traditional reference duties. Librarians offer insights into unanticipated issues encountered in each of these areas and offer some creative solutions. Directors discuss the issues from their unique perspective. Librarians have identified areas for focusing efforts in lifelong learning. Adult learning theory, specialized databases and resources needed by researchers, ever-evolving technology, and promotion and evaluation of the library are areas needing attention. Implications for library education and continuing professional development are presented.

  9. Public Libraries Section. Libraries Serving General Public Division. Papers.

    ERIC Educational Resources Information Center

    International Federation of Library Associations, The Hague (Netherlands).

    Papers on public libraries, which were presented at the 1983 International Federation of Library Associations (IFLA) conference, include: (1) "The Role of Public Libraries in Developing Countries with Particular Reference to the Gambia" by Sally P. C. N'Jie (The Gambia); (2) "Public Libraries in the Federal Republic of Germany…

  10. SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

    PubMed Central

    2010-01-01

    Background High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms. Results SeqTrim has been implemented both as a Web and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads and does not lead to over-trimming. Conclusions SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
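
    To make the idea of removing low-quality sequence concrete, the sketch below trims bases whose quality score falls under a threshold from both ends of a read. This is a generic end-trimming illustration, not SeqTrim's algorithm; the function name `trim_ends` and the default threshold of 20 are assumptions.

```python
def trim_ends(seq, quals, min_q=20):
    """Trim low-quality bases from both ends of a read.

    seq:   read sequence (string)
    quals: per-base quality scores, same length as seq
    min_q: assumed threshold; bases below it are removed from each end
    Returns the trimmed sequence and its quality scores.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    start, end = 0, len(seq)
    # Advance past low-quality bases at the 5' end.
    while start < end and quals[start] < min_q:
        start += 1
    # Retreat past low-quality bases at the 3' end.
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], list(quals[start:end])


if __name__ == "__main__":
    print(trim_ends("ACGTACGT", [5, 10, 30, 30, 30, 30, 8, 2]))
```

    Low-quality bases inside the read are left untouched here; a sliding-window average is a common alternative when internal dips should also end the usable region.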

  11. The Role of Microcomputers in Libraries.

    ERIC Educational Resources Information Center

    Lundeen, Gerald

    1980-01-01

    Describes the functions and characteristics of the microcomputer and discusses library applications including cataloging, circulation, acquisitions, serials control, reference and database systems, administration, current and future trends, and computers as media. Twenty references are listed. (CHC)

  12. Reference as an Access Service: Collaboration between Reference and Interlibrary Loan Departments

    ERIC Educational Resources Information Center

    Kern, M. Kathleen; Weible, Cherie L.

    2005-01-01

    Academic libraries increasingly rely on Interlibrary Loan (ILL) departments to obtain research materials. This adds to the workload of ILL at a time when many libraries are experiencing budget cuts and dwindling staff. Collaboration between ILL and Reference can assist ILL by providing searching expertise. Collaboration is facilitated by the…

  13. Marketing: A Bibliography of Marketing Reference Sources. The University of Rhode Island University Library.

    ERIC Educational Resources Information Center

    Masten, Lisa

    This annotated bibliography provides a selected list of marketing reference sources for undergraduate and graduate business students interested in marketing and related topics. All sources listed are available in the Reference Department at the University Library at the University of Rhode Island Kingston campus. Most sources, with the exception…

  14. Recommended Reference Books for Small and Medium-Sized Libraries and Media Centers, 1998.

    ERIC Educational Resources Information Center

    Wynar, Bohdan S., Ed.

    This annual review source is designed to assist smaller libraries in the systematic selection of suitable reference materials for their collections. Approximately 500 unabridged reviews, written by over 250 practicing librarians and subject specialists, identify and describe the most useful and affordable reference sources available for small and…

  15. Models of Reference Services in Australian Academic Libraries

    ERIC Educational Resources Information Center

    Burke, Liz

    2008-01-01

    This article reports on a project which was undertaken in 2006 to investigate the current modes and methods for delivering reference services in Australian academic libraries. The project included a literature review to assist in providing a definition of reference services as well as a snapshot of statistics showing staff and patron numbers from…

  16. Reference Sources on CD-ROM at Indiana University.

    ERIC Educational Resources Information Center

    Bristow, Ann

    1988-01-01

    Describes the use of several CD-ROM products to provide access to reference sources in a large academic research library. Equipment and staffing problems and solutions, user reaction, and the impact of optical technologies on the library's fee-based searching service and planning for the future are discussed. (3 references) (Author/CLB)

  17. The Problem Patron and the Academic Library Web Site as Virtual Reference Desk.

    ERIC Educational Resources Information Center

    Taylor, Daniel; Porter, George S.

    2002-01-01

    Considers problem library patrons in a virtual environment based on experiences at California Institute of Technology's Web site and its use for virtual reference. Discusses the virtual reference desk concept; global visibility and access to the World Wide Web; problematic email; and advantages in the electronic environment. (LRW)

  18. Rethinking Reference: Consistent Values, New Methods, and Different Tools

    ERIC Educational Resources Information Center

    Kendrick, Kaetrena Davis

    2011-01-01

    The core duties of the reference librarian inherently mandate that the work environment is not unlike a kaleidoscope: Students and faculty revolve within and around the library, and reference and public services workers do the same; every move temporarily redesigning the library, its collections, and even its very role on campus into something…

  19. shRNA target prediction informed by comprehensive enquiry (SPICE): a supporting system for high-throughput screening of shRNA library.

    PubMed

    Kamatuka, Kenta; Hattori, Masahiro; Sugiyama, Tomoyasu

    2016-12-01

    RNA interference (RNAi) screening is extensively used in the field of reverse genetics. RNAi libraries constructed using random oligonucleotides have made this technology affordable. However, the new methodology requires exploration of the RNAi target gene information after screening because the RNAi library includes non-natural sequences that are not found in genes. Here, we developed a web-based tool to support RNAi screening. The system performs short hairpin RNA (shRNA) target prediction that is informed by comprehensive enquiry (SPICE). SPICE automates several tasks that are laborious but indispensable to evaluate the shRNAs obtained by RNAi screening. SPICE has four main functions: (i) sequence identification of shRNA in the input sequence (the sequence might be obtained by sequencing clones in the RNAi library), (ii) searching the target genes in the database, (iii) demonstrating biological information obtained from the database, and (iv) preparation of search result files that can be utilized in a local personal computer (PC). Using this system, we demonstrated that genes targeted by random oligonucleotide-derived shRNAs were not different from those targeted by organism-specific shRNA. The system facilitates RNAi screening, which requires sequence analysis after screening. The SPICE web application is available at http://www.spice.sugysun.org/.

  20. Maize Endophytic Bacterial Diversity as Affected by Soil Cultivation History.

    PubMed

    Correa-Galeote, David; Bedmar, Eulogio J; Arone, Gregorio J

    2018-01-01

    The bacterial endophytic communities residing within roots of maize (Zea mays L.) plants cultivated by a sustainable management in soils from the Quechua maize belt (Peruvian Andes) were examined using tags pyrosequencing spanning the V4 and V5 hypervariable regions of the 16S rRNA. Across four replicate libraries, two corresponding to sequences of endophytic bacteria from long-time maize-cultivated soils and the other two obtained from fallow soils, 793 bacterial sequences were found that grouped into 188 bacterial operational taxonomic units (OTUs, 97% genetic similarity). The numbers of OTUs in the libraries from the maize-cultivated soils were significantly higher than those found in the libraries from fallow soils. A mean of 30 genera were found in the fallow soil libraries and 47 were in those from the maize-cultivated soils. Both alpha and beta diversity indexes showed clear differences between bacterial endophytic populations from plants with different soil cultivation history and that soils cultivated for a long time require a higher diversity of endophytes. Sequences corresponding to the main genera Sphingomonas, Herbaspirillum, Bradyrhizobium and Methylophilus were significantly more abundant in the maize-cultivated libraries than in those from the fallow soils. Sequences of genera Dyella and Streptococcus were significantly more abundant in the libraries from the fallow soils. Relative abundance of genera Burkholderia, Candidatus Glomeribacter, Staphylococcus, Variovorax, Bacillus and Chitinophaga were similar among libraries. A canonical correspondence analysis of the relative abundance of the main genera showed that the four libraries distributed in two clearly separated groups. Our results suggest that cultivation history is an important driver of endophytic colonization of maize and that after a long time of cultivation of the soil the maize plants need to increase the richness of the bacterial endophyte communities.

  1. Maize Endophytic Bacterial Diversity as Affected by Soil Cultivation History

    PubMed Central

    Correa-Galeote, David; Bedmar, Eulogio J.; Arone, Gregorio J.

    2018-01-01

    The bacterial endophytic communities residing within roots of maize (Zea mays L.) plants cultivated by a sustainable management in soils from the Quechua maize belt (Peruvian Andes) were examined using tags pyrosequencing spanning the V4 and V5 hypervariable regions of the 16S rRNA. Across four replicate libraries, two corresponding to sequences of endophytic bacteria from long-time maize-cultivated soils and the other two obtained from fallow soils, 793 bacterial sequences were found that grouped into 188 bacterial operational taxonomic units (OTUs, 97% genetic similarity). The numbers of OTUs in the libraries from the maize-cultivated soils were significantly higher than those found in the libraries from fallow soils. A mean of 30 genera were found in the fallow soil libraries and 47 were in those from the maize-cultivated soils. Both alpha and beta diversity indexes showed clear differences between bacterial endophytic populations from plants with different soil cultivation history and that soils cultivated for a long time require a higher diversity of endophytes. Sequences corresponding to the main genera Sphingomonas, Herbaspirillum, Bradyrhizobium and Methylophilus were significantly more abundant in the maize-cultivated libraries than in those from the fallow soils. Sequences of genera Dyella and Streptococcus were significantly more abundant in the libraries from the fallow soils. Relative abundance of genera Burkholderia, Candidatus Glomeribacter, Staphylococcus, Variovorax, Bacillus and Chitinophaga were similar among libraries. A canonical correspondence analysis of the relative abundance of the main genera showed that the four libraries distributed in two clearly separated groups. Our results suggest that cultivation history is an important driver of endophytic colonization of maize and that after a long time of cultivation of the soil the maize plants need to increase the richness of the bacterial endophyte communities. PMID:29662471
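
    The abstract does not say which alpha diversity indexes were used. As one common example, the sketch below computes the Shannon index from per-OTU sequence counts; the choice of index, the function name `shannon_index` and the counts in the usage line are all assumptions for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts.

    counts: number of sequences assigned to each OTU in one library.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one sequence is required")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total  # relative abundance of this OTU
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Hypothetical OTU counts, not data from the study.
    print(shannon_index([10, 10, 10, 10]))
```

    For a library with equal counts in every OTU the index equals the natural logarithm of the number of OTUs, which gives a quick sanity check on any implementation.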

  2. Phylogenetic comparison of the methanogenic communities from an acidic, oligotrophic fen and an anaerobic digester treating municipal wastewater sludge.

    PubMed

    Steinberg, Lisa M; Regan, John M

    2008-11-01

    Methanogens play a critical role in the decomposition of organics under anaerobic conditions. The methanogenic consortia in saturated wetland soils are often subjected to large temperature fluctuations and acidic conditions, imposing a selective pressure for psychro- and acidotolerant community members; however, methanogenic communities in engineered digesters are frequently maintained within a narrow range of mesophilic and circumneutral conditions to retain system stability. To investigate the hypothesis that these two disparate environments have distinct methanogenic communities, the methanogens in an oligotrophic acidic fen and a mesophilic anaerobic digester treating municipal wastewater sludge were characterized by creating clone libraries for the 16S rRNA and methyl coenzyme M reductase alpha subunit (mcrA) genes. A quantitative framework was developed to assess the differences between these two communities by calculating the average sequence similarity for 16S rRNA genes and mcrA within a genus and family using sequences of isolated and characterized methanogens within the approved methanogen taxonomy. The average sequence similarities for 16S rRNA genes within a genus and family were 96.0 and 93.5%, respectively, and the average sequence similarities for mcrA within a genus and family were 88.9 and 79%, respectively. The clone libraries of the bog and digester environments showed no overlap at the species level and almost no overlap at the family level. Both libraries were dominated by clones related to uncultured methanogen groups within the Methanomicrobiales, although members of the Methanosarcinales and Methanobacteriales were also found in both libraries. Diversity indices for the 16S rRNA gene library of the bog and both mcrA libraries were similar, but these indices indicated much lower diversity in the 16S digester library than in the other three libraries.
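
    The averaging step described in this abstract can be sketched as a mean of pairwise percent identities within a group of sequences. The example assumes the sequences have already been aligned to equal length and counts every column, gaps included; the function names and the toy sequences are hypothetical, and the authors' exact similarity measure is not given in the abstract.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of alignment columns where two pre-aligned sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_identity(seqs):
    """Average percent identity over all unordered pairs in a group."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy aligned sequences standing in for members of one genus.
    print(mean_pairwise_identity(["ACGT", "ACGA", "ACGT"]))
```

    Running the same function over all sequences in a genus and then over all sequences in a family would give the two levels of threshold the abstract describes.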

  3. Feasibility of physical map construction from fingerprinted bacterial artificial chromosome libraries of polyploid plant species

    PubMed Central

    2010-01-01

    Background The presence of closely related genomes in polyploid species makes the assembly of total genomic sequence from shotgun sequence reads produced by the current sequencing platforms exceedingly difficult, if not impossible. Genomes of polyploid species could be sequenced following the ordered-clone sequencing approach employing contigs of bacterial artificial chromosome (BAC) clones and BAC-based physical maps. Although BAC contigs can currently be constructed for virtually any diploid organism with the SNaPshot high-information-content-fingerprinting (HICF) technology, it is currently unknown if this is also true for polyploid species. It is possible that BAC clones from orthologous regions of homoeologous chromosomes would share numerous restriction fragments and be therefore included into common contigs. Because of this and other concerns, physical mapping utilizing the SNaPshot HICF of BAC libraries of polyploid species has not been pursued and the possibility of doing so has not been assessed. The sole exception has been in common wheat, an allohexaploid in which it is possible to construct single-chromosome or single-chromosome-arm BAC libraries from DNA of flow-sorted chromosomes and bypass the obstacles created by polyploidy. Results The potential of the SNaPshot HICF technology for physical mapping of polyploid plants utilizing global BAC libraries was evaluated by assembling contigs of fingerprinted clones in an in silico merged BAC library composed of single-chromosome libraries of two wheat homoeologous chromosome arms, 3AS and 3DS, and complete chromosome 3B. Because the chromosome arm origin of each clone was known, it was possible to estimate the fidelity of contig assembly. On average 97.78% or more clones, depending on the library, were from a single chromosome arm. 
A large portion of the remaining clones was shown to be library contamination from other chromosomes, a feature that is unavoidable during the construction of single-chromosome BAC libraries. Conclusions The negligibly low level of incorporation of clones from homoeologous chromosome arms into a contig during contig assembly suggested that it is feasible to construct contigs and physical maps using global BAC libraries of wheat and almost certainly also of other plant polyploid species with genome sizes comparable to that of wheat. Because of the high purity of the resulting assembled contigs, they can be directly used for genome sequencing. It is currently unknown but possible that equally good BAC contigs can be also constructed for polyploid species containing smaller, more gene-rich genomes. PMID:20170511
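
    The fidelity estimate described in this abstract (the share of clones in a contig that come from a single chromosome arm) can be illustrated with a minimal sketch. The contig names and clone counts below are hypothetical and are not data from the study; the function name is likewise an assumption made for illustration.

```python
from collections import Counter

def contig_fidelity(clone_origins):
    """Return (dominant_arm, fraction of clones from that arm) for one contig."""
    if not clone_origins:
        raise ValueError("contig has no clones")
    counts = Counter(clone_origins)
    arm, n = counts.most_common(1)[0]
    return arm, n / len(clone_origins)

# Hypothetical contigs: each list holds the known arm of origin of every clone.
contigs = {
    "ctg1": ["3AS"] * 48 + ["3DS"] * 2,
    "ctg2": ["3B"] * 30,
}

for name, origins in contigs.items():
    arm, frac = contig_fidelity(origins)
    print(f"{name}: dominant arm {arm}, fidelity {frac:.2%}")
```

    A contig whose fidelity is close to 1.0 contains almost no clones from homoeologous arms, which is the property the abstract reports for the merged wheat library.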

  4. Arduino-based automation of a DNA extraction system.

    PubMed

    Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won

    2015-01-01

    There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile.
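
    The abstract mentions a stepper library that provides a linear acceleration profile. A minimal sketch of such a profile (a trapezoidal velocity curve that degrades to a triangle for short moves) is given below. It is written in Python for illustration only; it is not the controller's Arduino code, and the function names, units (steps, steps/s, steps/s^2) and example values are assumptions.

```python
import math

def trapezoid_profile(distance, v_max, accel):
    """Return (t_accel, t_cruise, v_peak) for a move with linear acceleration.

    distance is in steps, v_max in steps/s, accel in steps/s^2.
    """
    if distance <= 0 or v_max <= 0 or accel <= 0:
        raise ValueError("distance, v_max and accel must be positive")
    # Distance covered by ramping up to v_max and back down to rest.
    d_ramps = v_max ** 2 / accel
    if d_ramps >= distance:
        # Triangular profile: the move is too short to reach v_max.
        v_peak = math.sqrt(distance * accel)
        return v_peak / accel, 0.0, v_peak
    t_accel = v_max / accel
    t_cruise = (distance - d_ramps) / v_max
    return t_accel, t_cruise, v_max

def velocity_at(t, distance, v_max, accel):
    """Commanded speed (steps/s) at time t seconds after the move starts."""
    t_accel, t_cruise, v_peak = trapezoid_profile(distance, v_max, accel)
    total = 2 * t_accel + t_cruise
    if t <= 0 or t >= total:
        return 0.0
    if t < t_accel:
        return accel * t              # ramp up
    if t <= t_accel + t_cruise:
        return v_peak                 # constant-speed cruise
    return accel * (total - t)        # ramp down

# Hypothetical move: 2000 steps, at most 1000 steps/s, 1000 steps/s^2.
print(trapezoid_profile(2000, 1000, 1000))
```

    Sampling velocity_at over the duration of a move gives the kind of velocity profile that the study reports evaluating.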

  5. Effects of short read quality and quantity on a de novo vertebrate transcriptome assembly.

    PubMed

    Garcia, T I; Shen, Y; Catchen, J; Amores, A; Schartl, M; Postlethwait, J; Walter, R B

    2012-01-01

    For many researchers, next generation sequencing data holds the key to answering a category of questions previously unassailable. One of the important and challenging steps in achieving these goals is accurately assembling the massive quantity of short sequencing reads into full nucleic acid sequences. For research groups working with non-model or wild systems, short read assembly can pose a significant challenge due to the lack of pre-existing EST or genome reference libraries. While many publications describe the overall process of sequencing and assembly, few address the topic of how many and what types of reads are best for assembly. The goal of this project was to use real-world data to explore the effects of read quantity and short read quality scores on the resulting de novo assemblies. Using several samples of short reads of various sizes and qualities we produced many assemblies in an automated manner. We observe how the properties of read length, read quality, and read quantity affect the resulting assemblies and provide some general recommendations based on our real-world data set. Published by Elsevier Inc.

  6. Making Strategic Decisions: Conducting and Using Research on the Impact of Sequenced Library Instruction

    ERIC Educational Resources Information Center

    Lundstrom, Kacy; Martin, Pamela; Cochran, Dory

    2016-01-01

    This study explores the relationship between course grades and sequenced library instruction interventions throughout psychology students' curriculum. Researchers conducted this study to inform decisions about sustaining and improving program integrations for first- and second-year composition courses and to improve discipline-level integrations.…

  7. A new single-nucleotide polymorphisms database for rainbow trout generated through whole genome resequencing of selected samples

    USDA-ARS's Scientific Manuscript database

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...

  8. Use of Computer-Based Reference Services in Texas Information Exchange Libraries.

    ERIC Educational Resources Information Center

    Menges, Gary L.

    The Texas Information Exchange (TIE) is a state-wide library network organized in 1967 for the purpose of sharing resources among Texas libraries. Its membership includes 37 college and university libraries, the Texas State Library, and ten public libraries that serve as Major Resource Centers in the Texas State Library Communications Network. In…

  9. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daily, Jeffrey A.

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. As a result, a faster intra-sequence pairwise alignment implementation is described and benchmarked. Using a 375 residue query sequence a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 12-core processor system, the highest reported for an implementation based on Farrar’s ‘striped’ approach. When using only a single thread, parasail was 1.7 times faster than Rognes’s SWIPE. For many score matrices, parasail is faster than BLAST. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE4.1, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. In conclusion, applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.

  10. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments

    DOE PAGES

    Daily, Jeffrey A.

    2016-02-10

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. As a result, a faster intra-sequence pairwise alignment implementation is described and benchmarked. Using a 375 residue query sequence a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 12-core processor system, the highest reported for an implementation based on Farrar’s ‘striped’ approach. When using only a single thread, parasail was 1.7 times faster than Rognes’s SWIPE. For many score matrices, parasail is faster than BLAST. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE4.1, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. In conclusion, applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.
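
    For readers unfamiliar with the Smith-Waterman recurrence named in this abstract, a minimal scalar sketch is given below. It is plain Python with a linear gap penalty; it is not parasail's API and does not use SIMD or Farrar's striped layout. The scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming table
    best = 0
    for ca in a:
        curr = [0]              # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

print(smith_waterman_score("ACGT", "ACGT"))  # 4 matches * 2 = 8
```

    Each inner-loop iteration is one "cell update", so aligning sequences of lengths m and n costs m * n updates; GCUPS figures such as the one quoted above count billions of these updates per second.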

  11. Microbial Communities in the Surface Mucopolysaccharide Layer and the Black Band Microbial Mat of Black Band-Diseased Siderastrea siderea

    PubMed Central

    Sekar, Raju; Mills, DeEtta K.; Remily, Elizabeth R.; Voss, Joshua D.; Richardson, Laurie L.

    2006-01-01

    Microbial community profiles and species composition associated with two black band-diseased colonies of the coral Siderastrea siderea were studied by 16S rRNA-targeted gene cloning, sequencing, and amplicon-length heterogeneity PCR (LH-PCR). Bacterial communities associated with the surface mucopolysaccharide layer (SML) of apparently healthy tissues of the infected colonies, together with samples of the black band disease (BBD) infections, were analyzed using the same techniques for comparison. Gene sequences, ranging from 424 to 1,537 bp, were retrieved from all positive clones (n = 43 to 48) in each of the four clone libraries generated and used for comparative sequence analysis. In addition to LH-PCR community profiling, all of the clone sequences were aligned with LH-PCR primer sequences, and the theoretical lengths of the amplicons were determined. Results revealed that the community profiles were significantly different between BBD and SML samples. The SML samples were dominated by γ-proteobacteria (53 to 64%), followed by β-proteobacteria (18 to 21%) and α-proteobacteria (5 to 11%). In contrast, both BBD clone libraries were dominated by α-proteobacteria (58 to 87%), followed by verrucomicrobia (2 to 10%) and 0 to 6% each of δ-proteobacteria, bacteroidetes, firmicutes, and cyanobacteria. Alphaproteobacterial sequence types related to the bacteria associated with toxin-producing dinoflagellates were observed in BBD clone libraries but were not found in the SML libraries. Similarly, sequences affiliated with the family Desulfobacteraceae and toxin-producing cyanobacteria, both believed to be involved in BBD pathogenesis, were found only in BBD libraries. These data provide evidence for an association of numerous toxin-producing heterotrophic microorganisms with BBD of corals. PMID:16957217

  12. Characterization of the Prokaryotic Diversity in Cold Saline Perennial Springs of the Canadian High Arctic

    PubMed Central

    Perreault, Nancy N.; Andersen, Dale T.; Pollard, Wayne H.; Greer, Charles W.; Whyte, Lyle G.

    2007-01-01

    The springs at Gypsum Hill and Colour Peak on Axel Heiberg Island in the Canadian Arctic originate from deep salt aquifers and are among the few known examples of cold springs in thick permafrost on Earth. The springs discharge cold anoxic brines (7.5 to 15.8% salts), with a mean oxidoreduction potential of −325 mV, and contain high concentrations of sulfate and sulfide. We surveyed the microbial diversity in the sediments of seven springs by denaturing gradient gel electrophoresis (DGGE) and analyzing clone libraries of 16S rRNA genes amplified with Bacteria and Archaea-specific primers. Dendrogram analysis of the DGGE banding patterns divided the springs into two clusters based on their geographic origin. Bacterial 16S rRNA clone sequences from the Gypsum Hill library (spring GH-4) were classified into seven phyla (Actinobacteria, Bacteroidetes, Firmicutes, Gemmatimonadetes, Proteobacteria, Spirochaetes, and Verrucomicrobia); Deltaproteobacteria and Gammaproteobacteria sequences represented half of the clone library. Sequences related to Proteobacteria (82%), Firmicutes (9%), and Bacteroidetes (6%) constituted 97% of the bacterial clone library from Colour Peak (spring CP-1). Most GH-4 archaeal clone sequences (79%) were related to the Crenarchaeota while half of the CP-1 sequences were related to orders Halobacteriales and Methanosarcinales of the Euryarchaeota. Sequences related to the sulfur-oxidizing bacterium Thiomicrospira psychrophila dominated both the GH-4 (19%) and CP-1 (45%) bacterial libraries, and 56 to 76% of the bacterial sequences were from potential sulfur-metabolizing bacteria. These results suggest that the utilization and cycling of sulfur compounds may play a major role in the energy production and maintenance of microbial communities in these unique, cold environments. PMID:17220254

  13. X-MATE: a flexible system for mapping short read data

    PubMed Central

    Pearson, John V.; Cloonan, Nicole; Grimmond, Sean M.

    2011-01-01

    Summary: Accurate and complete mapping of short-read sequencing to a reference genome greatly enhances the discovery of biological results and improves statistical predictions. We recently presented RNA-MATE, a pipeline for the recursive mapping of RNA-Seq datasets. With the rapid increase in genome re-sequencing projects, progression of available mapping software and the evolution of file formats, we now present X-MATE, an updated version of RNA-MATE, capable of mapping both RNA-Seq and DNA datasets and with improved performance, output file formats, configuration files, and flexibility in core mapping software. Availability: Executables, source code, junction libraries, test data and results and the user manual are available from http://grimmond.imb.uq.edu.au/X-MATE/. Contact: n.cloonan@uq.edu.au; s.grimmond@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:21216778

  14. Construction of Chinese adult male phantom library and its application in the virtual calibration of in vivo measurement.

    PubMed

    Chen, Yizheng; Qiu, Rui; Li, Chunyan; Wu, Zhen; Li, Junli

    2016-03-07

    In vivo measurement is a main method of internal contamination evaluation, particularly for large numbers of people after a nuclear accident. Before the practical application, it is necessary to obtain the counting efficiency of the detector by calibration. The virtual calibration based on Monte Carlo simulation usually uses the reference human computational phantom, and the morphological difference between the monitored personnel with the calibrated phantom may lead to the deviation of the counting efficiency. Therefore, a phantom library containing a wide range of heights and total body masses is needed. In this study, a Chinese reference adult male polygon surface (CRAM_S) phantom was constructed based on the CRAM voxel phantom, with the organ models adjusted to match the Chinese reference data. CRAM_S phantom was then transformed to sitting posture for convenience in practical monitoring. Referring to the mass and height distribution of the Chinese adult male, a phantom library containing 84 phantoms was constructed by deforming the reference surface phantom. Phantoms in the library have 7 different heights ranging from 155 cm to 185 cm, and there are 12 phantoms with different total body masses in each height. As an example of application, organ specific and total counting efficiencies of Ba-133 were calculated using the MCNPX code, with two series of phantoms selected from the library. The influence of morphological variation on the counting efficiency was analyzed. The results show only using the reference phantom in virtual calibration may lead to an error of 68.9% for total counting efficiency. Thus the influence of morphological difference on virtual calibration can be greatly reduced using the phantom library with a wide range of masses and heights instead of a single reference phantom.

  15. Construction of Chinese adult male phantom library and its application in the virtual calibration of in vivo measurement

    NASA Astrophysics Data System (ADS)

    Chen, Yizheng; Qiu, Rui; Li, Chunyan; Wu, Zhen; Li, Junli

    2016-03-01

    In vivo measurement is a main method of internal contamination evaluation, particularly for large numbers of people after a nuclear accident. Before the practical application, it is necessary to obtain the counting efficiency of the detector by calibration. The virtual calibration based on Monte Carlo simulation usually uses the reference human computational phantom, and the morphological difference between the monitored personnel with the calibrated phantom may lead to the deviation of the counting efficiency. Therefore, a phantom library containing a wide range of heights and total body masses is needed. In this study, a Chinese reference adult male polygon surface (CRAM_S) phantom was constructed based on the CRAM voxel phantom, with the organ models adjusted to match the Chinese reference data. CRAM_S phantom was then transformed to sitting posture for convenience in practical monitoring. Referring to the mass and height distribution of the Chinese adult male, a phantom library containing 84 phantoms was constructed by deforming the reference surface phantom. Phantoms in the library have 7 different heights ranging from 155 cm to 185 cm, and there are 12 phantoms with different total body masses in each height. As an example of application, organ specific and total counting efficiencies of Ba-133 were calculated using the MCNPX code, with two series of phantoms selected from the library. The influence of morphological variation on the counting efficiency was analyzed. The results show only using the reference phantom in virtual calibration may lead to an error of 68.9% for total counting efficiency. Thus the influence of morphological difference on virtual calibration can be greatly reduced using the phantom library with a wide range of masses and heights instead of a single reference phantom.

  16. Reflections on Reference Services.

    ERIC Educational Resources Information Center

    Brandt, Kerryn A.; And Others

    1996-01-01

    Describes programmatic changes in reference services at the Johns Hopkins University (Maryland) medical library and speculates on the future. Topics include institutional restructuring and consolidation; improvements in technology infrastructure; external economic pressure; and fiscal accountability, including library funding and cost center…

  17. Comparative Analysis of Expressed Genes from Cacao Meristems Infected by Moniliophthora perniciosa

    PubMed Central

    Gesteira, Abelmon S.; Micheli, Fabienne; Carels, Nicolas; Da Silva, Aline C.; Gramacho, Karina P.; Schuster, Ivan; Macêdo, Joci N.; Pereira, Gonçalo A. G.; Cascardo, Júlio C. M.

    2007-01-01

    Background and Aims Witches' broom disease is caused by the hemibiotrophic basidiomycete Moniliophthora perniciosa, and is one of the most important diseases of cacao in the western hemisphere. Because very little is known about the global process of such disease development, expressed sequence tags (ESTs) were used to identify genes expressed during the Theobroma cacao–Moniliophthora perniciosa interaction. Methods Two cDNA libraries corresponding to the resistant (RT) and susceptible (SP) cacao–M. perniciosa interactions were constructed from total RNA, using the DB SMART Creator cDNA library kit (Clontech). Clones were randomly selected, sequenced from the 5′ end and analysed using bioinformatics tools including in silico analysis of the differential gene expression. Key Results A total of 6884 ESTs were generated from the RT and SP cDNA libraries. These ESTs were composed of 2585 singlets and 341 contigs for a total of 2926 non-redundant sequences. The redundancy of the libraries was low and their specificity high when compared with the few other cacao libraries already published. Sequence analysis allowed the assignment of a putative functional category for 54 % of sequences, whereas approx. 22 % of sequences corresponded to unknown function and approx. 24 % of sequences did not show any significant similarity with other proteins present in the database. Despite the similar overall distribution of the sequences in functional categories between the two libraries, qualitative differences were observed. Genes involved during the defence response to pathogen infection or in programmed cell death were identified, such as pathogenesis related-proteins, trypsin inhibitor or oxalate oxidase, and some of them showed an in silico differential expression between the resistant and the susceptible interactions. Conclusions As far as is known this is the first EST resource from the cacao–M. perniciosa interaction and it is believed that it will provide a significant contribution to the understanding of the molecular mechanisms of the resistance and susceptibility of cacao to M. perniciosa, to develop strategies to control witches broom, and as a source of polymorphism for molecular marker development and marker-assisted selection. PMID:17557832

  18. Less Words, More Action: Using On-the-Fly Videos and Screenshots in Your Library's IM/Chat and Email Reference Transactions

    ERIC Educational Resources Information Center

    Sekyere, Kwabena

    2010-01-01

    Email and chat/IM reference services have become a convenient and easily accessible option for the online community and libraries, particularly with increasing amounts of library resources now available electronically. This article gives an overview of Jing, which can be used to produce videos and screenshots on-the-fly, and demonstrates how to…

  19. Hybrid sequencing approach applied to human fecal metagenomic clone libraries revealed clones with potential biotechnological applications.

    PubMed

    Džunková, Mária; D'Auria, Giuseppe; Pérez-Villarroya, David; Moya, Andrés

    2012-01-01

    Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be "domesticated" for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7-15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

  20. Engaging a Wider Community: The Academic Library as a Center for Creativity, Discovery, and Collaboration

    ERIC Educational Resources Information Center

    Shapiro, Steven D.

    2016-01-01

    Academic libraries have reported long-term declines in circulation, reference transactions, reserves, and in-house library materials usage. Increasingly, libraries are perceived as being less critical to the academic enterprise. Are these trends irreversible? Perhaps public libraries and some innovative academic libraries can provide us with some…

  1. Medical Libraries of the Soviet Union

    PubMed Central

    Morozov, A.

    1964-01-01

    Medical libraries are a part of the Soviet library system. The total number of medical libraries in the country is more than 4,000, with a collection of over 42,000,000 volumes that are used by over a million readers. The State Central Medical Library occupies a special place among these libraries. It carries out the functions of a methodological, bibliographic, and coordinating center. It has a collection of over 1,000,000 units of books and periodicals. All the bibliographic work of medical libraries as well as any other work is provided to help medical institutions. Libraries prepare special bibliographies of medical literature for publication according to a plan. The libraries also conduct reference and information work. The methodological work helps to solve the most important problems that arise in libraries with reference to the specific character of their work and tasks. The chief means of rendering methodological guidance are to analyze the work of various libraries, to hold conferences, to exchange visits with libraries, to give both field and correspondence consultations, and to organize qualification courses for librarians and bibliographers. PMID:14119299

  2. DNA Barcode Analysis of Thrips (Thysanoptera) Diversity in Pakistan Reveals Cryptic Species Complexes.

    PubMed

    Iftikhar, Romana; Ashfaq, Muhammad; Rasool, Akhtar; Hebert, Paul D N

    2016-01-01

    Although thrips are globally important crop pests and vectors of viral disease, species identifications are difficult because of their small size and inconspicuous morphological differences. Sequence variation in the mitochondrial COI-5' (DNA barcode) region has proven effective for the identification of species in many groups of insect pests. We analyzed barcode sequence variation among 471 thrips from various plant hosts in north-central Pakistan. The Barcode Index Number (BIN) system assigned these sequences to 55 BINs, while the Automatic Barcode Gap Discovery detected 56 partitions, a count that coincided with the number of monophyletic lineages recognized by Neighbor-Joining analysis and Bayesian inference. Congeneric species showed an average of 19% sequence divergence (range = 5.6% - 27%) at COI, while intraspecific distances averaged 0.6% (range = 0.0% - 7.6%). BIN analysis suggested that all intraspecific divergence >3.0% actually involved a species complex. In fact, sequences for three major pest species (Haplothrips reuteri, Thrips palmi, Thrips tabaci), and one predatory thrips (Aeolothrips intermedius) showed deep intraspecific divergences, providing evidence that each is a cryptic species complex. The study compiles the first barcode reference library for the thrips of Pakistan, and examines global haplotype diversity in four important pest thrips.
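
    The divergence figures in this abstract are pairwise distances between aligned barcode sequences. A minimal sketch of the simplest such measure, the uncorrected p-distance, is given below. The abstract does not state which distance model the authors used, so the choice of p-distance is an assumption, and the toy sequences are hypothetical.

```python
from itertools import combinations

def p_distance(s1, s2):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(s1.upper(), s2.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared

def mean_pairwise_distance(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

# Hypothetical aligned fragments standing in for barcodes of one species.
print(mean_pairwise_distance(["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC"]))
```

    Averaging within a species gives an intraspecific distance; averaging across species of one genus gives a congeneric divergence of the kind reported above.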

  3. Sequence and pattern of expression of a bovine homologue of a human mitochondrial transport protein associated with Grave's disease.

    PubMed

    Fiermonte, G; Runswick, M J; Walker, J E; Palmieri, F

    1992-01-01

    A human cDNA has been isolated previously from a thyroid library with the aid of serum from a patient with Grave's disease. It encodes a protein belonging to the mitochondrial metabolite carrier family, referred to as the Grave's disease carrier protein (GDC). Using primers based on this sequence, overlapping cDNAs encoding the bovine homologue of the GDC have been isolated from total bovine heart poly(A)+ cDNA. The bovine protein is 18 amino acids shorter than the published human sequence, but if a frame shift requiring the removal of one nucleotide is introduced into the human cDNA sequence, the human and bovine proteins become identical in their C-terminal regions, and 308 out of 330 amino acids are conserved over their entire sequences. The bovine cDNA has been used to investigate the expression of the GDC in various bovine tissues. In the tissues that were examined, the GDC is most strongly expressed in the thyroid, but substantial amounts of its mRNA were also detected in liver, lung and kidney, and lesser amounts in heart and skeletal muscle.

  4. Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

    PubMed

    Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

    2016-02-27

    In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequence were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. 
A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.

  5. A Survey of Telephone Inquiries: Case Study and Operational Impact in an Academic Library Reference Department.

    ERIC Educational Resources Information Center

    Allen, Frank R.; Smith, Rita H.

    1993-01-01

    Describes a survey that was conducted at the University of Tennessee at Knoxville library to count and categorize the types of questions coming into the reference department from telephone calls. Informational and directional calls are examined, implications for staffing are considered, and queuing theory is applied. (seven references) (LRW)

  6. Death of Reference or Birth of a New Marketing Age?

    ERIC Educational Resources Information Center

    Henry, Jo

    2011-01-01

    Reference transactions in academic libraries have been on the decline since mid-1990. The Academic Library Survey from the National Center for Education Statistics shows an average drop of 25% in reference use from 1996-2004 with higher numbers at some institutions such as the University of Maryland which plummeted 47% (Martell, 2008). The…

  7. Making Decisions: Using Electronic Data Collection to Re-Envision Reference Services at the USF Tampa Libraries

    ERIC Educational Resources Information Center

    Todorinova, Lily; Huse, Andy; Lewis, Barbara; Torrence, Matt

    2011-01-01

    Declining reference statistics, diminishing human resources, and the desire to be more proactive and embedded in academic departments prompted the University of South Florida Library to create a taskforce for re-envisioning reference services. The taskforce was charged with examining the staffing patterns at the desk and developing…

  8. Distance Education and Virtual Reference: Implementing a Marketing Plan at Texas A&M University

    ERIC Educational Resources Information Center

    MacDonald, Karen I.; vanDuinkerken, Wyoma

    2005-01-01

    Texas A&M University Libraries has been testing virtual reference services since February 2004, but during the fall semester 2005, the Libraries began implementing and actively promoting the services to various target groups. Distance education students were identified as a primary target group for virtual reference services, and as of the…

  9. Sequencing thousands of single-cell genomes with combinatorial indexing.

    PubMed

    Vitak, Sarah A; Torkenczy, Kristof A; Rosenkrantz, Jimi L; Fields, Andrew J; Christiansen, Lena; Wong, Melissa H; Carbone, Lucia; Steemers, Frank J; Adey, Andrew

    2017-03-01

    Single-cell genome sequencing has proven valuable for the detection of somatic variation, particularly in the context of tumor evolution. Current technologies suffer from high library construction costs, which restrict the number of cells that can be assessed and thus impose limitations on the ability to measure heterogeneity within a tissue. Here, we present single-cell combinatorial indexed sequencing (SCI-seq) as a means of simultaneously generating thousands of low-pass single-cell libraries for detection of somatic copy-number variants. We constructed libraries for 16,698 single cells from a combination of cultured cell lines, primate frontal cortex tissue and two human adenocarcinomas, and obtained a detailed assessment of subclonal variation within a pancreatic tumor.

  10. Testing DNA barcode performance in 1000 species of European lepidoptera: large geographic distances have small genetic impacts.

    PubMed

    Huemer, Peter; Mutanen, Marko; Sefc, Kristina M; Hebert, Paul D N

    2014-01-01

    This study examines the performance of DNA barcodes (mt cytochrome c oxidase 1 gene) in the identification of 1004 species of Lepidoptera shared by two localities (Finland, Austria) that are 1600 km apart. Maximum intraspecific distances for the pooled data were less than 2% for 880 species (87.6%), while deeper divergence was detected in 124 species. Despite such variation, the overall DNA barcode library possessed diagnostic COI sequences for 98.8% of the taxa. Because a reference library based on Finnish specimens was highly effective in identifying specimens from Austria, we conclude that barcode libraries based on regional sampling can often be effective for a much larger area. Moreover, dispersal ability (poor, good) and distribution patterns (disjunct, fragmented, continuous, migratory) had little impact on levels of intraspecific geographic divergence. Furthermore, the present study revealed that, despite the intensity of past taxonomic work on European Lepidoptera, nearly 20% of the species shared by Austria and Finland require further work to clarify their status. Particularly discordant BIN (Barcode Index Number) cases should be checked to ascertain possible explanatory factors such as incorrect taxonomy, hybridization, introgression, and Wolbachia infections.
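The 2% threshold on maximum intraspecific distance mentioned above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' pipeline: it assumes pre-aligned COI sequences of equal length and uses the uncorrected p-distance (barcode studies often use a corrected model instead). The function names and the toy records are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable positions")
    return diffs / compared


def max_intraspecific(records):
    """Map each species to the largest pairwise p-distance among its sequences.

    `records` is an iterable of (species_name, aligned_sequence) pairs.
    Species with a single sequence get 0.0.
    """
    by_species = {}
    for species, seq in records:
        by_species.setdefault(species, []).append(seq)
    return {
        species: max(
            (p_distance(a, b) for a, b in combinations(seqs, 2)), default=0.0
        )
        for species, seqs in by_species.items()
    }


# Toy data (hypothetical): two specimens of one species, one of another.
records = [
    ("species A", "ACGTACGTAC"),
    ("species A", "ACGTACGTAT"),
    ("species B", "AAAAAAAAAA"),
]
distances = max_intraspecific(records)
deep = [sp for sp, d in distances.items() if d >= 0.02]
```

In this toy case "species A" would be flagged as showing deeper divergence, because one difference in ten sites exceeds the 2% cutoff; real barcodes are roughly 650 bp, so the same cutoff corresponds to about 13 differing sites.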

  11. An annotated cDNA library of juvenile Euprymna scolopes with and without colonization by the symbiont Vibrio fischeri

    PubMed Central

    Chun, Carlene K; Scheetz, Todd E; Bonaldo, Maria de Fatima; Brown, Bartley; Clemens, Anik; Crookes-Goodson, Wendy J; Crouch, Keith; DeMartini, Tad; Eyestone, Mari; Goodson, Michael S; Janssens, Bernadette; Kimbell, Jennifer L; Koropatnick, Tanya A; Kucaba, Tamara; Smith, Christina; Stewart, Jennifer J; Tong, Deyan; Troll, Joshua V; Webster, Sarahrose; Winhall-Rice, Jane; Yap, Cory; Casavant, Thomas L; McFall-Ngai, Margaret J; Soares, M Bento

    2006-01-01

    Background: Biologists are becoming increasingly aware that the interaction of animals, including humans, with their coevolved bacterial partners is essential for health. This growing awareness has been a driving force for the development of models for the study of beneficial animal-bacterial interactions. In the squid-vibrio model, symbiotic Vibrio fischeri induce dramatic developmental changes in the light organ of host Euprymna scolopes over the first hours to days of their partnership. We report here the creation of a juvenile light-organ specific EST database. Results: We generated eleven cDNA libraries from the light organ of E. scolopes at developmentally significant time points with and without colonization by V. fischeri. Single pass 3' sequencing efforts generated 42,564 expressed sequence tags (ESTs) of which 35,421 passed our quality criteria and were then clustered via the UIcluster program into 13,962 nonredundant sequences. The cDNA clones representing these nonredundant sequences were sequenced from the 5' end of the vector and 58% of these resulting sequences overlapped significantly with the associated 3' sequence to generate 8,067 contigs with an average sequence length of 1,065 bp. All sequences were annotated with BLASTX (E-value < 1e-03) and Gene Ontology (GO). Conclusion: Both the number of ESTs generated from each library and GO categorizations are reflective of the activity state of the light organ during these early stages of symbiosis. Future analyses of the sequences identified in these libraries promise to provide valuable information not only about pathways involved in colonization and early development of the squid light organ, but also about pathways conserved in response to bacterial colonization across the animal kingdom. PMID:16780587

  12. De novo characterization of microRNAs in oriental fruit moth Grapholita molesta and selection of reference genes for normalization of microRNA expression

    PubMed Central

    Zhang, Jing; Zhang, Qingwen; Liu, Xiaoxia; Li, Zhen

    2017-01-01

    MicroRNAs (miRNAs) are a group of endogenous non-coding small RNAs that have critical regulatory functions in almost all known biological processes at the post-transcriptional level in a variety of organisms. The oriental fruit moth Grapholita molesta is one of the most serious pests in orchards worldwide and threatens the production of Rosaceae fruits. In this study, a de novo small RNA library constructed from mixed stages of G. molesta was sequenced on the Illumina sequencing platform and a total of 536 mature miRNAs consisting of 291 conserved and 245 novel miRNAs were identified. Most of the conserved and novel miRNAs were detected with moderate abundance. The miRNAs in the same cluster normally showed correlated expression profiles. A comparative analysis of the 79 conserved miRNA families within 31 arthropod species indicated that these miRNA families were more conserved among insects and within orders of closer phylogenetic relationships. The KEGG pathway analysis and network prediction of target genes indicated that the complex composed of miRNAs, clock genes and developmental regulation genes may play vital roles in regulating the developmental circadian rhythm of G. molesta. Furthermore, based on the sRNA library of G. molesta, suitable reference genes were selected and validated for study of the miRNA transcriptional profile in G. molesta under two biotic and six abiotic experimental conditions. This study systematically documented the miRNA profile in G. molesta, which could lay a foundation for further understanding of the regulatory roles of miRNAs in the development and metabolism of this pest and might also suggest clues to the development of genetic-based techniques for agricultural pest control. PMID:28158242

  13. GigaTON: an extensive publicly searchable database providing a new reference transcriptome in the pacific oyster Crassostrea gigas.

    PubMed

    Riviere, Guillaume; Klopp, Christophe; Ibouniyamine, Nabihoudine; Huvet, Arnaud; Boudry, Pierre; Favrel, Pascal

    2015-12-02

    The Pacific oyster, Crassostrea gigas, is one of the most important aquaculture shellfish resources worldwide. Important efforts have been undertaken towards a better knowledge of its genome and transcriptome, which is now making C. gigas a model organism among lophotrochozoans, the under-described sister clade of ecdysozoans within protostomes. These massive sequencing efforts offer the opportunity to assemble gene expression data and make this resource accessible and exploitable for the scientific community. Therefore, we undertook this assembly into an up-to-date publicly available transcriptome database: the GigaTON (Gigas TranscriptOme pipeliNe) database. We assembled 2204 million sequences obtained from 114 publicly available RNA-seq libraries covering all embryo-larval development stages, adult organs, and different environmental stressors including heavy metals, temperature, salinity and exposure to air, most of which were generated as part of the Crassostrea gigas genome project. These data were analyzed in silico, resulting in 56621 newly assembled contigs that were deposited into a publicly available database, the GigaTON database. This database also provides powerful and user-friendly query tools to browse and retrieve information about annotation, expression level, UTRs, splicing and polymorphism, and gene ontology associated with all the contigs within each library and between all libraries. The GigaTON database provides a convenient, potent and versatile interface to browse, retrieve, confront and compare massive transcriptomic information in an extensive range of conditions, tissues and developmental stages in Crassostrea gigas. To our knowledge, the GigaTON database constitutes the most extensive transcriptomic database to date in marine invertebrates, and thereby a new reference transcriptome in the oyster, a highly valuable resource for physiologists and evolutionary biologists.

  14. Analysis of expressed sequence tags from the four main developmental stages of Trypanosoma congolense

    PubMed Central

    Helm, Jared R.; Hertz-Fowler, Christiane; Aslett, Martin; Berriman, Matthew; Sanders, Mandy; Quail, Michael A.; Soares, Marcelo B.; Bonaldo, Maria F.; Sakurai, Tatsuya; Inoue, Noboru; Donelson, John E.

    2009-01-01

    Trypanosoma congolense is one of the most economically important pathogens of livestock in Africa. Culture-derived parasites of each of the three main insect stages of the T. congolense life cycle, i.e., the procyclic, epimastigote and metacyclic stages, and bloodstream stage parasites isolated from infected mice, were used to construct stage-specific cDNA libraries and expressed sequence tags (ESTs or cDNA clones) in each library were sequenced. Thirteen EST clusters encoding different variant surface glycoproteins (VSGs) were detected in the metacyclic library and twenty-six VSG EST clusters were found in the bloodstream library, six of which are shared by the metacyclic library. Rare VSG ESTs are present in the epimastigote library, and none were detected in the procyclic library. ESTs encoding enzymes that catalyze oxidative phosphorylation and amino acid metabolism are about twice as abundant in the procyclic and epimastigote stages as in the metacyclic and bloodstream stages. In contrast, ESTs encoding enzymes involved in glycolysis, the citric acid cycle and nucleotide metabolism are about the same in all four developmental stages. Cysteine proteases, kinases and phosphatases are the most abundant enzyme groups represented by the ESTs. All four libraries contain T. congolense-specific expressed sequences not present in the T. brucei and T. cruzi genomes. Normalized cDNA libraries were constructed from the metacyclic and bloodstream stages, and found to be further enriched for T. congolense-specific ESTs. Given that cultured T. congolense offers an experimental advantage over other African trypanosome species, these ESTs provide a basis for further investigation of the molecular properties of these four developmental stages, especially the epimastigote and metacyclic stages for which it is difficult to obtain large quantities of organisms. The T. congolense EST databases are available at: http://www.sanger.ac.uk/Projects/T_congolense/EST_index.shtml. 
    PMID:19559733

  15. Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics.

    PubMed

    Dutheil, Julien; Gaillard, Sylvain; Bazin, Eric; Glémin, Sylvain; Ranwez, Vincent; Galtier, Nicolas; Belkhir, Khalid

    2006-04-04

    A large number of bioinformatics applications in the fields of bio-sequence analysis, molecular evolution and population genetics typically share input/output methods, data storage requirements and data analysis algorithms. Such common features may be conveniently bundled into re-usable libraries, which enable the rapid development of new methods and robust applications. We present Bio++, a set of Object Oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, Markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus. Implementation of methods aims at being both efficient and user-friendly. Special attention was given to the library design to enable easy extension and new methods development. We defined a general hierarchy of classes that allows developers to implement their own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.

  16. A target-unrelated peptide in an M13 phage display library traced to an advantageous mutation in the gene II ribosome-binding site.

    PubMed

    Brammer, Leighanne A; Bolduc, Benjamin; Kass, Jessica L; Felice, Kristin M; Noren, Christopher J; Hall, Marilena Fitzsimons

    2008-02-01

    Screening of the commercially available Ph.D.-7 phage-displayed heptapeptide library for peptides that bind immobilized Zn2+ resulted in the repeated selection of the peptide HAIYPRH, although binding assays indicated that HAIYPRH is not a zinc-binding peptide. HAIYPRH has also been selected in several other laboratories using completely different targets, and its ubiquity suggests that it is a target-unrelated peptide. We demonstrated that phage displaying HAIYPRH are enriched after serial amplification of the library without exposure to target. The amplification of phage displaying HAIYPRH was found to be dramatically faster than that of the library itself. DNA sequencing uncovered a mutation in the Shine-Dalgarno (SD) sequence for gIIp, a protein involved in phage replication, imparting to the SD sequence better complementarity to the 16S ribosomal RNA (rRNA). Introducing this mutation into phage lacking a displayed peptide resulted in accelerated propagation, whereas phage displaying HAIYPRH with a wild-type SD sequence were found to amplify normally. The SD mutation may alter gIIp expression and, consequently, the rate of propagation of phage. In the Ph.D.-7 library, the mutation is coincident with the displayed peptide HAIYPRH, accounting for the target-unrelated selection of this peptide in multiple reported panning experiments.

  17. High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency

    PubMed Central

    Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J.; Burnett, John C.; Zhou, Jiehua

    2016-01-01

    The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct “biased sequences” and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the “biased sequences” was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy. PMID:27652575

  18. Quality control of next-generation sequencing library through an integrative digital microfluidic platform.

    PubMed

    Thaitrong, Numrin; Kim, Hanyoup; Renzi, Ronald F; Bartsch, Michael S; Meagher, Robert J; Patel, Kamlesh D

    2012-12-01

    We have developed an automated quality control (QC) platform for next-generation sequencing (NGS) library characterization by integrating a droplet-based digital microfluidic (DMF) system with a capillary-based reagent delivery unit and a quantitative CE module. Using an in-plane capillary-DMF interface, a prepared sample droplet was actuated into position between the ground electrode and the inlet of the separation capillary to complete the circuit for an electrokinetic injection. Using a DNA ladder as an internal standard, the CE module with a compact LIF detector was capable of detecting dsDNA in the range of 5-100 pg/μL, suitable for the amount of DNA required by the Illumina Genome Analyzer sequencing platform. This DMF-CE platform consumes tenfold less sample volume than the current Agilent BioAnalyzer QC technique, preserving precious sample while providing necessary sensitivity and accuracy for optimal sequencing performance. The ability of this microfluidic system to validate NGS library preparation was demonstrated by examining the effects of limited-cycle PCR amplification on the size distribution and the yield of Illumina-compatible libraries, demonstrating that as few as ten cycles of PCR bias the size distribution of the library toward undesirable larger fragments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP).

    PubMed

    Hoskins, Roger A; Stapleton, Mark; George, Reed A; Yu, Charles; Wan, Kenneth H; Carlson, Joseph W; Celniker, Susan E

    2005-12-02

    cDNA cloning is a central technology in molecular biology. cDNA sequences are used to determine mRNA transcript structures, including splice junctions, open reading frames (ORFs) and 5'- and 3'-untranslated regions (UTRs). cDNA clones are valuable reagents for functional studies of genes and proteins. Expressed Sequence Tag (EST) sequencing is the method of choice for recovering cDNAs representing many of the transcripts encoded in a eukaryotic genome. However, EST sequencing samples a cDNA library at random, and it recovers transcripts with low expression levels inefficiently. We describe a PCR-based method for directed screening of plasmid cDNA libraries. We demonstrate its utility in a screen of libraries used in our Drosophila EST projects for 153 transcription factor genes that were not represented by full-length cDNA clones in our Drosophila Gene Collection. We recovered high-quality, full-length cDNAs for 72 genes and variously compromised clones for an additional 32 genes. The method can be used at any scale, from the isolation of cDNA clones for a particular gene of interest, to the improvement of large gene collections in model organisms and the human. Finally, we discuss the relative merits of directed cDNA library screening and RT-PCR approaches.

  20. Essential Genes for In Vitro Growth of the Endophyte Herbaspirillum seropedicae SmR1 as Revealed by Transposon Insertion Site Sequencing

    PubMed Central

    Rosconi, Federico; de Vries, Stefan P. W.; Baig, Abiyad; Fabiano, Elena

    2016-01-01

    ABSTRACT The interior of plants contains microorganisms (referred to as endophytes) that are distinct from those present at the root surface or in the surrounding soil. Herbaspirillum seropedicae strain SmR1, belonging to the betaproteobacteria, is an endophyte that colonizes crops, including rice, maize, sugarcane, and sorghum. Different approaches have revealed genes and pathways regulated during the interactions of H. seropedicae with its plant hosts. However, functional genomic analysis of transposon (Tn) mutants has been hampered by the lack of genetic tools. Here we successfully employed a combination of in vivo high-density mariner Tn mutagenesis and targeted Tn insertion site sequencing (Tn-seq) in H. seropedicae SmR1. The analysis of multiple gene-saturating Tn libraries revealed that 395 genes are essential for the growth of H. seropedicae SmR1 in tryptone-yeast extract medium. A comparative analysis with the Database of Essential Genes (DEG) showed that 25 genes are uniquely essential in H. seropedicae SmR1. The Tn mutagenesis protocol developed and the gene-saturating Tn libraries generated will facilitate elucidation of the genetic mechanisms of the H. seropedicae endophytic lifestyle. IMPORTANCE A focal point in the study of endophytes is the development of effective biofertilizers that could help to reduce the input of agrochemicals in croplands. Besides the ability to promote plant growth, a good biofertilizer should be successful in colonizing its host and competing against the native microbiota. By using a systematic Tn-based gene-inactivation strategy and massively parallel sequencing of Tn insertion sites (Tn-seq), it is possible to study the fitness of thousands of Tn mutants in a single experiment. We have applied the combination of these techniques to the plant-growth-promoting endophyte Herbaspirillum seropedicae SmR1. The Tn mutant libraries generated will enable studies into the genetic mechanisms of H. seropedicae-plant interactions. 
    The approach that we have taken is applicable to other plant-interacting bacteria. PMID:27590816

  1. Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

    USDA-ARS's Scientific Manuscript database

    Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...

  2. Identification of antigenic regions on VP2 of African horsesickness virus serotype 3 by using phage-displayed epitope libraries.

    PubMed

    Bentley, L; Fehrsen, J; Jordaan, F; Huismans, H; du Plessis, D H

    2000-04-01

    VP2 is an outer capsid protein of African horsesickness virus (AHSV) and is recognized by serotype-discriminatory neutralizing antibodies. With the objective of locating its antigenic regions, a filamentous phage library was constructed that displayed peptides derived from the fragmentation of a cDNA copy of the gene encoding VP2. Peptides ranging in size from approximately 30 to 100 amino acids were fused with pIII, the attachment protein of the display vector, fUSE2. To ensure maximum diversity, the final library consisted of three sub-libraries. The first utilized enzymatically fragmented DNA encoding only the VP2 gene, the second included plasmid sequences, while the third included a PCR step designed to allow different peptide-encoding sequences to recombine before ligation into the vector. The resulting composite library was subjected to immunoaffinity selection with AHSV-specific polyclonal chicken IgY, polyclonal horse immunoglobulins and a monoclonal antibody (MAb) known to neutralize AHSV. Antigenic peptides were located by sequencing the DNA of phages bound by the antibodies. Most antigenic determinants capable of being mapped by this method were located in the N-terminal half of VP2. Important binding areas were mapped with high resolution by identifying the minimum overlapping areas of the selected peptides. The MAb was also used to screen a random 17-mer epitope library. Sequences that may be part of a discontinuous neutralization epitope were identified. The amino acid sequences of the antigenic regions on VP2 of serotype 3 were compared with corresponding regions on three other serotypes, revealing regions with the potential to discriminate AHSV serotypes serologically.

  3. Computers and Libraries-A Reply

    ERIC Educational Resources Information Center

    Salton, Gerard

    1971-01-01

    A consideration of the various possibilities which are available for improving library operations and collection control makes it plain that the response of the library community ought to be towards greater cooperation among library centers and more extensive standardization. (31 references) (Author)

  4. Large-scale DNA Barcode Library Generation for Biomolecule Identification in High-throughput Screens.

    PubMed

    Lyons, Eli; Sheridan, Paul; Tremmel, Georg; Miyano, Satoru; Sugano, Sumio

    2017-10-24

    High-throughput screens allow for the identification of specific biomolecules with characteristics of interest. In barcoded screens, DNA barcodes are linked to target biomolecules in a manner allowing for the target molecules making up a library to be identified by sequencing the DNA barcodes using Next Generation Sequencing. To be useful in experimental settings, the DNA barcodes in a library must satisfy certain constraints related to GC content, homopolymer length, Hamming distance, and blacklisted subsequences. Here we report a novel framework to quickly generate large-scale libraries of DNA barcodes for use in high-throughput screens. We show that our framework dramatically reduces the computation time required to generate large-scale DNA barcode libraries, compared with a naïve approach to DNA barcode library generation. As a proof of concept, we demonstrate that our framework is able to generate a library consisting of one million DNA barcodes for use in a fragment antibody phage display screening experiment. We also report generating a general purpose one billion DNA barcode library, the largest such library yet reported in the literature. Our results demonstrate the value of our novel large-scale DNA barcode library generation framework for use in high-throughput screening applications.
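The four constraints named in this abstract (GC content, homopolymer length, Hamming distance, blacklisted subsequences) can be made concrete with a short sketch. This is an illustration of a naive greedy filter with pairwise comparisons, which is closer to the baseline the abstract contrasts against than to the authors' framework. All function names, thresholds, and example sequences are assumptions chosen for the example, and sequences are assumed to be non-empty.

```python
def gc_fraction(seq):
    """Fraction of bases in a non-empty sequence that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer(seq):
    """Length of the longest run of one repeated base in a non-empty sequence."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def passes_constraints(seq, accepted, *, gc_min=0.4, gc_max=0.6,
                       max_run=2, min_dist=3, blacklist=()):
    """Check one candidate against all four constraint types.

    The default thresholds are illustrative, not taken from the paper.
    """
    if not gc_min <= gc_fraction(seq) <= gc_max:
        return False
    if max_homopolymer(seq) > max_run:
        return False
    if any(bad in seq for bad in blacklist):
        return False
    # Naive step: compare against every barcode accepted so far.
    return all(hamming(seq, other) >= min_dist for other in accepted)


def greedy_filter(candidates, **thresholds):
    """Keep each candidate that is compatible with those already kept."""
    accepted = []
    for seq in candidates:
        if passes_constraints(seq, accepted, **thresholds):
            accepted.append(seq)
    return accepted


candidates = ["ACGTACGT", "ACGTACGA", "AAAACGCG", "TGCATGCA", "GGCCGGCC"]
library = greedy_filter(candidates)
```

With these toy inputs the second candidate is rejected for being one mismatch away from the first, the third for a run of four A's, and the last for 100% GC content. The pairwise Hamming check makes this approach quadratic in library size, which is the kind of cost a dedicated framework would need to avoid at the million-barcode scale.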

  5. T7 lytic phage-displayed peptide libraries exhibit less sequence bias than M13 filamentous phage-displayed peptide libraries.

    PubMed

    Krumpe, Lauren R H; Atkinson, Andrew J; Smythers, Gary W; Kandel, Andrea; Schumacher, Kathryn M; McMahon, James B; Makowski, Lee; Mori, Toshiyuki

    2006-08-01

    We investigated whether the T7 system of phage display could produce peptide libraries of greater diversity than the M13 system of phage display due to the differing processes of lytic and filamentous phage morphogenesis. Using a bioinformatics-assisted computational approach, collections of random peptide sequences obtained from a T7 12-mer library (X(12)) and a T7 7-mer disulfide-constrained library (CX(7)C) were analyzed and compared with peptide populations obtained from New England BioLabs' M13 Ph.D.-12 and Ph.D.-C7C libraries. Based on this analysis, peptide libraries constructed with the T7 system have fewer amino acid biases, increased peptide diversity, and more normal distributions of peptide net charge and hydropathy than the M13 libraries. The greater diversity of T7-displayed libraries provides a potential resource of novel binding peptides for new as well as previously studied molecular targets. To demonstrate their utility, several of the T7-displayed peptide libraries were screened for streptavidin- and neutravidin-binding phage. Novel binding motifs were identified for each protein.
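The kind of comparison described above (amino acid bias and peptide net charge across a library) can be sketched in a few lines. This is a simplified illustration, not the authors' computational approach: the net charge here counts only K/R as +1 and D/E as -1, ignoring histidine and the termini, and the function names and example peptides are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_frequencies(peptides):
    """Observed frequency of each of the 20 standard residues across a library."""
    counts = Counter(res for pep in peptides for res in pep.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def net_charge(peptide):
    """Crude net charge: +1 per K or R, -1 per D or E."""
    return sum((res in "KR") - (res in "DE") for res in peptide.upper())


# Toy library (hypothetical peptides).
library = ["AAK", "DE"]
freqs = residue_frequencies(library)
charges = [net_charge(pep) for pep in library]
```

Comparing `freqs` between two libraries, or against the frequencies expected from the codon scheme used to build them, gives a rough picture of residue bias; a flat 1/20 expectation would be a simplification, since randomized codon schemes do not encode all residues equally often.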

  6. Peregrine

    PubMed Central

    Langevin, Stanley A.; Bent, Zachary W.; Solberg, Owen D.; Curtis, Deanna J.; Lane, Pamela D.; Williams, Kelly P.; Schoeniger, Joseph S.; Sinha, Anupama; Lane, Todd W.; Branda, Steven S.

    2013-01-01

    Use of second generation sequencing (SGS) technologies for transcriptional profiling (RNA-Seq) has revolutionized transcriptomics, enabling measurement of RNA abundances with unprecedented specificity and sensitivity and the discovery of novel RNA species. Preparation of RNA-Seq libraries requires conversion of the RNA starting material into cDNA flanked by platform-specific adaptor sequences. Each of the published methods and commercial kits currently available for RNA-Seq library preparation suffers from at least one major drawback, including long processing times, large starting material requirements, uneven coverage, loss of strand information and high cost. We report the development of a new RNA-Seq library preparation technique that produces representative, strand-specific RNA-Seq libraries from small amounts of starting material in a fast, simple and cost-effective manner. Additionally, we have developed a new quantitative PCR-based assay for precisely determining the number of PCR cycles to perform for optimal enrichment of the final library, a key step in all SGS library preparation workflows. PMID:23558773

  7. Library Research Handbook for Faculty and Graduate Students at the College of Engineering. A Selective Guide to Engineering Reference Materials at the Library of Science & Medicine.

    ERIC Educational Resources Information Center

    Wu, Connie

    This handbook identifies the major reference sources for various engineering fields at the Library of Science and Medicine (LSM) at Rutgers University. The resources are divided into 15 categories and provide full call numbers and locations for every title. These categories include: (1) Card Catalog and Online Catalog; (2) Guide to the Literature…

  8. Building a DNA barcode reference library for the true butterflies (Lepidoptera) of Peninsula Malaysia: what about the subspecies?

    PubMed

    Wilson, John-James; Sing, Kong-Wah; Sofian-Azirun, Mohd

    2013-01-01

    The objective of this study was to build a DNA barcode reference library for the true butterflies of Peninsula Malaysia and assess the value of attaching subspecies names to DNA barcode records. A new DNA barcode library was constructed with butterflies from the Museum of Zoology, University of Malaya collection. The library was analysed in conjunction with publicly available DNA barcodes from other Asia-Pacific localities to test the ability of the DNA barcodes to discriminate species and subspecies. Analyses confirmed the capacity of the new DNA barcode reference library to distinguish the vast majority of species (92%) and revealed that most subspecies possessed unique DNA barcodes (84%). In some cases conspecific subspecies exhibited genetic distances between their DNA barcodes that are typically seen between species, and these were often taxa that have previously been regarded as full species. Subspecies designations as shorthand for geographically and morphologically differentiated groups provide a useful heuristic for assessing how such groups correlate with clustering patterns of DNA barcodes, especially as the number of DNA barcodes per species in reference libraries increases. Our study demonstrates the value in attaching subspecies names to DNA barcode records as they can reveal a history of taxonomic concepts and expose important units of biodiversity.

  9. Building a DNA Barcode Reference Library for the True Butterflies (Lepidoptera) of Peninsula Malaysia: What about the Subspecies?

    PubMed Central

    Wilson, John-James; Sing, Kong-Wah; Sofian-Azirun, Mohd

    2013-01-01

    The objective of this study was to build a DNA barcode reference library for the true butterflies of Peninsula Malaysia and assess the value of attaching subspecies names to DNA barcode records. A new DNA barcode library was constructed with butterflies from the Museum of Zoology, University of Malaya collection. The library was analysed in conjunction with publicly available DNA barcodes from other Asia-Pacific localities to test the ability of the DNA barcodes to discriminate species and subspecies. Analyses confirmed the capacity of the new DNA barcode reference library to distinguish the vast majority of species (92%) and revealed that most subspecies possessed unique DNA barcodes (84%). In some cases conspecific subspecies exhibited genetic distances between their DNA barcodes that are typically seen between species, and these were often taxa that have previously been regarded as full species. Subspecies designations as shorthand for geographically and morphologically differentiated groups provide a useful heuristic for assessing how such groups correlate with clustering patterns of DNA barcodes, especially as the number of DNA barcodes per species in reference libraries increases. Our study demonstrates the value in attaching subspecies names to DNA barcode records as they can reveal a history of taxonomic concepts and expose important units of biodiversity. PMID:24282514

  10. The Microbial Ferrous Wheel in a Neutral pH Groundwater Seep

    PubMed Central

    Roden, Eric E.; McBeth, Joyce M.; Blöthe, Marco; Percak-Dennett, Elizabeth M.; Fleming, Emily J.; Holyoke, Rebecca R.; Luther, George W.; Emerson, David; Schieber, Juergen

    2012-01-01

    Evidence for microbial Fe redox cycling was documented in a circumneutral pH groundwater seep near Bloomington, Indiana. Geochemical and microbiological analyses were conducted at two sites, a semi-consolidated microbial mat and a floating puffball structure. In situ voltammetric microelectrode measurements revealed steep opposing gradients of O2 and Fe(II) at both sites, similar to other groundwater seep and sedimentary environments known to support microbial Fe redox cycling. The puffball structure showed an abrupt increase in dissolved Fe(II) just at its surface (∼5 cm depth), suggesting an internal Fe(II) source coupled to active Fe(III) reduction. Most probable number enumerations detected microaerophilic Fe(II)-oxidizing bacteria (FeOB) and dissimilatory Fe(III)-reducing bacteria (FeRB) at densities of 10² to 10⁵ cells mL⁻¹ in samples from both sites. In vitro Fe(III) reduction experiments revealed the potential for immediate reduction (no lag period) of native Fe(III) oxides. Conventional full-length 16S rRNA gene clone libraries were compared with high throughput barcode sequencing of the V1, V4, or V6 variable regions of 16S rRNA genes in order to evaluate the extent to which new sequencing approaches could provide enhanced insight into the composition of Fe redox cycling microbial community structure. The composition of the clone libraries suggested a lithotroph-dominated microbial community centered around taxa related to known FeOB (e.g., Gallionella, Sideroxydans, Aquabacterium). Sequences related to recognized FeRB (e.g., Rhodoferax, Aeromonas, Geobacter, Desulfovibrio) were also well-represented. Overall, sequences related to known FeOB and FeRB accounted for 88 and 59% of total clone sequences in the mat and puffball libraries, respectively. Taxa identified in the barcode libraries showed partial overlap with the clone libraries, but were not always consistent across different variable regions and sequencing platforms. However, the barcode libraries provided confirmation of key clone library results (e.g., the predominance of Betaproteobacteria) and an expanded view of lithotrophic microbial community composition. PMID:22783228

  11. Meeting the Information Needs of Interdisciplinary Scholars: Issues for Administrators of Large University Libraries.

    ERIC Educational Resources Information Center

    Searing, Susan E.

    1996-01-01

    Provides an overview of administrative issues in supporting interdisciplinary library use at large universities. Topics include information resources; cataloging and classification; library services to users, including library use education and reference services; library organization; the campus context; and the politics of interdisciplinarity.…

  12. Management Preparation and Training of Department Heads in ARL Libraries.

    ERIC Educational Resources Information Center

    Wittenbach, Stefanie A.; And Others

    1992-01-01

    A survey of cataloging and reference department heads in ARL (Association for Research Libraries) member libraries showed that both professional experience and management coursework play a part in a librarian's promotion to department head. It is suggested that library management will improve if libraries require formal management preparation and…

  13. Cost Accounting and Analysis for University Libraries

    ERIC Educational Resources Information Center

    Leimkuhler, Ferdinand F.; Cooper, Michael D.

    1971-01-01

    The approach to library planning studied in this paper is the use of accounting models to measure library costs and implement program budgets. A cost-flow model for a university library is developed and tested with historical data from the General Library at the University of California, Berkeley. (4 references) (Author)

  14. Microbial community analysis of a coastal hot spring in Kagoshima, Japan, using molecular- and culture-based approaches.

    PubMed

    Nishiyama, Minako; Yamamoto, Shuichi; Kurosawa, Norio

    2013-08-01

    Ibusuki hot spring is located on the coastline of Kagoshima Bay, Japan. The hot spring water is characterized by high salinity, high temperature, and neutral pH. The hot spring is covered by the sea during high tide, which leads to severe fluctuations in several environmental variables. A combination of molecular- and culture-based techniques was used to determine the bacterial and archaeal diversity of the hot spring. A total of 48 thermophilic bacterial strains were isolated from two sites (Site 1: 55.6°C; Site 2: 83.1°C) and they were categorized into six groups based on their 16S rRNA gene sequence similarity. Two groups (including 32 isolates) demonstrated low sequence similarity with published species, suggesting that they might represent novel taxa. The 148 clones from the Site 1 bacterial library included 76 operational taxonomy units (OTUs; 97% threshold), while 132 clones from the Site 2 bacterial library included 31 OTUs. Proteobacteria, Bacteroidetes, and Firmicutes were frequently detected in both clone libraries. The clones were related to thermophilic, mesophilic and psychrophilic bacteria. Approximately half of the sequences in bacterial clone libraries shared <92% sequence similarity with their closest sequences in a public database, suggesting that the Ibusuki hot spring may harbor a unique and novel bacterial community. By contrast, 77 clones from the Site 2 archaeal library contained only three OTUs, most of which were affiliated with Thaumarchaeota.
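
    The abstract above groups clones into operational taxonomic units (OTUs) at a 97% sequence-identity threshold. A minimal Python sketch of that idea follows. It assumes the sequences are already aligned to equal length and uses a simple greedy centroid scheme; the function names are hypothetical, and this is not necessarily the procedure the authors used.

```python
def identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first OTU whose centroid it matches at >= threshold.

    The first member of each OTU acts as its centroid (an assumption of this sketch).
    """
    otus = []  # each OTU is a list of sequences; otus[i][0] is the centroid
    for s in seqs:
        for members in otus:
            if identity(s, members[0]) >= threshold:
                members.append(s)
                break
        else:
            # No existing centroid is close enough, so this sequence starts a new OTU.
            otus.append([s])
    return otus


# Hypothetical toy data: 100-nt "sequences" differing at 0, 2, and 10 positions.
base = "A" * 100
near = "A" * 98 + "CC"        # 98% identical to base
far = "A" * 90 + "C" * 10     # 90% identical to base
clusters = greedy_otus([base, near, far])
```

    With this toy input, the first two sequences share an OTU and the third forms its own. Greedy assignment depends on input order, which is one reason dedicated clustering tools are normally preferred for real clone libraries.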

  15. Primer-Free Aptamer Selection Using A Random DNA Library

    PubMed Central

    Pan, Weihua; Xin, Ping; Patrick, Susan; Dean, Stacey; Keating, Christine; Clawson, Gary

    2010-01-01

    Aptamers are highly structured oligonucleotides (DNA or RNA) that can bind to targets with affinities comparable to antibodies 1. They are identified through an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX) to recognize a wide variety of targets, from small molecules to proteins and other macromolecules 2-4. Aptamers have properties that are well suited for in vivo diagnostic and/or therapeutic applications: Besides good specificity and affinity, they are easily synthesized, survive more rigorous processing conditions, they are poorly immunogenic, and their relatively small size can result in facile penetration of tissues. Aptamers that are identified through the standard SELEX process usually comprise ~80 nucleotides (nt), since they are typically selected from nucleic acid libraries with ~40 nt long randomized regions plus fixed primer sites of ~20 nt on each side. The fixed primer sequences thus can comprise nearly 50% of the library sequences, and therefore may positively or negatively compromise identification of aptamers in the selection process 3, although bioinformatics approaches suggest that the fixed sequences do not contribute significantly to aptamer structure after selection 5. To address these potential problems, primer sequences have been blocked by complementary oligonucleotides or switched to different sequences midway during the rounds of SELEX 6, or they have been trimmed to 6-9 nt 7, 8. Wen and Gray 9 designed a primer-free genomic SELEX method, in which the primer sequences were completely removed from the library before selection and were then regenerated to allow amplification of the selected genomic fragments. However, to employ the technique, a unique genomic library has to be constructed, which possesses limited diversity, and regeneration after rounds of selection relies on a linear reamplification step. Alternatively, efforts to circumvent problems caused by fixed primer sequences using high efficiency partitioning are met with problems regarding PCR amplification 10. We have developed a primer-free (PF) selection method that significantly simplifies SELEX procedures and effectively eliminates primer-interference problems 11, 12. The protocols work in a straightforward manner. The central random region of the library is purified without extraneous flanking sequences and is bound to a suitable target (for example to a purified protein or complex mixtures such as cell lines). Then the bound sequences are obtained, reunited with flanking sequences, and re-amplified to generate selected sub-libraries. As an example, here we selected aptamers to S100B, a protein marker for melanoma. Binding assays showed Kd values in the 10⁻⁷ to 10⁻⁸ M range after a few rounds of selection, and we demonstrate that the aptamers function effectively in a sandwich binding format. PMID:20689511

  16. The Unrecognized Crisis: Library Reference Service at the Crossroads.

    ERIC Educational Resources Information Center

    Hernon, Peter

    1986-01-01

    Briefly describes fundamental problems with library reference service based on the findings of 20 separate studies using unobtrusive tests. An emphasis on quality of service, realistic goals, and effective managerial strategies are identified as possible resolutions of these problems. (CLB)

  17. Achieving Excellence in Library Instruction.

    ERIC Educational Resources Information Center

    Gilbert, Betty; And Others

    The materials included in this document supporting library instruction are divided into two chapters. The first chapter contains bibliographies of instructional materials, professional periodicals, general reference tools, and religious reference tools. Throughout the bibliographies, the approximate cost of the materials is indicated by dollar…

  18. Improved protocols to accelerate the assembly of DNA barcode reference libraries for freshwater zooplankton.

    PubMed

    Elías-Gutiérrez, Manuel; Valdez-Moreno, Martha; Topan, Janet; Young, Monica R; Cohuo-Colli, José Angel

    2018-03-01

    Freshwater zooplankton sampling and identification methodologies have remained virtually unchanged since they were first established at the beginning of the twentieth century. One major contributing factor to this slow progress is the limited success of modern genetic methodologies, such as DNA barcoding, in several of the main groups. This study demonstrates improved protocols which enable the rapid assessment of most animal taxa inhabiting any freshwater system by combining the use of light traps, careful fixation at low temperatures using ethanol, and zooplankton-specific primers. We DNA-barcoded 2,136 specimens from a diverse array of taxonomic assemblages (rotifers, mollusks, mites, crustaceans, insects, and fishes) from several Canadian and Mexican lakes with an average sequence success rate of 85.3%. In total, 325 Barcode Index Numbers (BINs) were detected with only three BINs (two cladocerans and one copepod) shared between Canada and Mexico, suggesting a much narrower distribution range of freshwater zooplankton than previously thought. This study is the first to broadly explore the metazoan biodiversity of freshwater systems with DNA barcodes to construct a reference library that represents the first step for future programs which aim to monitor ecosystem health, track invasive species, or improve knowledge of the ecology and distribution of freshwater zooplankton.

  19. Peanut gene expression profiling in developing seeds at different reproduction stages during Aspergillus parasiticus infection

    PubMed Central

    Guo, Baozhu; Chen, Xiaoping; Dang, Phat; Scully, Brian T; Liang, Xuanqiang; Holbrook, C Corley; Yu, Jiujiang; Culbreath, Albert K

    2008-01-01

    Background Peanut (Arachis hypogaea L.) is an important crop economically and nutritionally, and is one of the most susceptible host crops to colonization of Aspergillus parasiticus and subsequent aflatoxin contamination. Knowledge from molecular genetic studies could help to devise strategies in alleviating this problem; however, few peanut DNA sequences are available in the public database. In order to understand the molecular basis of host resistance to aflatoxin contamination, a large-scale project was conducted to generate expressed sequence tags (ESTs) from developing seeds to identify resistance-related genes involved in defense response against Aspergillus infection and subsequent aflatoxin contamination. Results We constructed six different cDNA libraries derived from developing peanut seeds at three reproduction stages (R5, R6 and R7) from a resistant and a susceptible cultivated peanut genotype, 'Tifrunner' (susceptible to Aspergillus infection with higher aflatoxin contamination and resistant to TSWV) and 'GT-C20' (resistant to Aspergillus with reduced aflatoxin contamination and susceptible to TSWV). The developing peanut seed tissues were challenged by A. parasiticus and drought stress in the field. A total of 24,192 randomly selected cDNA clones from six libraries were sequenced. After removing vector sequences and quality trimming, 21,777 high-quality EST sequences were generated. Sequence clustering and assembling resulted in 8,689 unique EST sequences with 1,741 tentative consensus EST sequences (TCs) and 6,948 singleton ESTs. Functional classification was performed according to MIPS functional catalogue criteria. The unique EST sequences were divided into twenty-two categories. A similarity search against the non-redundant protein database available from NCBI indicated that 84.78% of total ESTs showed significant similarity to known proteins, of which 165 genes had been previously reported in peanuts. There were differences in overall expression patterns in different libraries and genotypes. A number of sequences were expressed throughout all of the libraries, representing constitutively expressed sequences. In order to identify resistance-related genes with significantly differential expression, a statistical analysis to estimate the relative abundance (R) was used to compare the relative abundance of each gene transcript in each cDNA library. Thirty-six and forty-seven unique EST sequences with a threshold of R > 4 from libraries of 'GT-C20' and 'Tifrunner', respectively, were selected for examination of temporal gene expression patterns according to EST frequencies. Nine and eight resistance-related genes with significant up-regulation were obtained in 'GT-C20' and 'Tifrunner' libraries, respectively. Among them, three genes were common in both genotypes. Furthermore, a comparison of our EST sequences with other plant sequences in the TIGR Gene Indices libraries showed that the percentage of peanut ESTs matched to Arabidopsis thaliana, maize (Zea mays), Medicago truncatula, rapeseed (Brassica napus), rice (Oryza sativa), soybean (Glycine max) and wheat (Triticum aestivum) ESTs ranged from 33.84% to 79.46% with the sequence identity ≥ 80%. These results revealed that peanut ESTs are more closely related to legume species than to cereal crops, and more homologous to dicot than to monocot plant species. Conclusion The developed ESTs can be used to discover novel sequences or genes, to identify resistance-related genes and to detect the differences among alleles or markers between these resistant and susceptible peanut genotypes. Additionally, this large collection of cultivated peanut EST sequences will make it possible to construct microarrays for gene expression studies and for further characterization of host resistance mechanisms. It will be a valuable genomic resource for the peanut community. The 21,777 ESTs have been deposited in the NCBI GenBank database with accession numbers ES702769 to ES724546. PMID:18248674

  20. Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome

    PubMed Central

    2010-01-01

    Background Salmonids are one of the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate. PMID:20433749

  1. Peregrine: A rapid and unbiased method to produce strand-specific RNA-Seq libraries from small quantities of starting material.

    PubMed

    Langevin, Stanley A; Bent, Zachary W; Solberg, Owen D; Curtis, Deanna J; Lane, Pamela D; Williams, Kelly P; Schoeniger, Joseph S; Sinha, Anupama; Lane, Todd W; Branda, Steven S

    2013-04-01

    Use of second generation sequencing (SGS) technologies for transcriptional profiling (RNA-Seq) has revolutionized transcriptomics, enabling measurement of RNA abundances with unprecedented specificity and sensitivity and the discovery of novel RNA species. Preparation of RNA-Seq libraries requires conversion of the RNA starting material into cDNA flanked by platform-specific adaptor sequences. Each of the published methods and commercial kits currently available for RNA-Seq library preparation suffers from at least one major drawback, including long processing times, large starting material requirements, uneven coverage, loss of strand information and high cost. We report the development of a new RNA-Seq library preparation technique that produces representative, strand-specific RNA-Seq libraries from small amounts of starting material in a fast, simple and cost-effective manner. Additionally, we have developed a new quantitative PCR-based assay for precisely determining the number of PCR cycles to perform for optimal enrichment of the final library, a key step in all SGS library preparation workflows.

  2. GBParsy: a GenBank flatfile parser library with high speed.

    PubMed

    Lee, Tae-Ho; Kim, Yeon-Ki; Nahm, Baek Hie

    2008-07-25

    GenBank flatfile (GBF) format is one of the most popular sequence file formats because of its detailed sequence features and ease of readability. To use the data in the file by a computer, a parsing process is required and is performed according to a given grammar for the sequence and the description in a GBF. Currently, several parser libraries for the GBF have been developed. However, with the accumulation of DNA sequence information from eukaryotic chromosomes, parsing a eukaryotic genome sequence with these libraries inevitably takes a long time, due to the large GBF file and its correspondingly large genomic nucleotide sequence and related feature information. Thus, there is significant need to develop a parsing program with high speed and efficient use of system memory. We developed a library, GBParsy, which was C language-based and parses GBF files. The parsing speed was maximized by using content-specified functions in place of regular expressions that are flexible but slow. In addition, we optimized an algorithm related to memory usage so that it also increased parsing performance and efficiency of memory usage. GBParsy is at least 5-100x faster than current parsers in benchmark tests. GBParsy is estimated to extract annotated information from almost 100 Mb of a GenBank flatfile for chromosomal sequence information within a second. Thus, it should be used for a variety of applications such as on-time visualization of a genome at a web site.
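
    The design point in the abstract above, replacing flexible regular expressions with content-specific functions, can be illustrated with a small Python sketch. This is not GBParsy's code (GBParsy is a C library): the function names are hypothetical, and the LOCUS layout handled here is a simplified one that omits the optional topology field.

```python
import re

# Simplified LOCUS layout assumed by this sketch:
#   LOCUS  <name>  <length> bp  <molecule>  <division>  <date>
SAMPLE_LOCUS = "LOCUS       SCU49845     5028 bp    DNA             PLN       21-JUN-1999"

LOCUS_RE = re.compile(r"^LOCUS\s+(\S+)\s+(\d+)\s+bp\s+(\S+)\s+(\S+)\s+(\S+)\s*$")


def parse_locus_regex(line):
    """General-purpose approach: one regular expression captures every field."""
    m = LOCUS_RE.match(line)
    if m is None:
        raise ValueError("not a LOCUS line in the simplified layout")
    name, length, molecule, division, date = m.groups()
    return {"name": name, "length": int(length), "molecule": molecule,
            "division": division, "date": date}


def parse_locus_split(line):
    """Content-specific approach: rely on the known field order and plain splitting."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError("not a LOCUS line in the simplified layout")
    return {"name": fields[1], "length": int(fields[2]), "molecule": fields[4],
            "division": fields[5], "date": fields[6]}


by_regex = parse_locus_regex(SAMPLE_LOCUS)
by_split = parse_locus_split(SAMPLE_LOCUS)
```

    Both functions return the same fields for the sample line. The second trades generality for a fixed, cheap sequence of string operations, which is the kind of trade-off the abstract describes; any actual speed difference would need to be measured.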

  3. Reference and Information Services: An Introduction. Third Edition. Library and Information Science Text Series.

    ERIC Educational Resources Information Center

    Bopp, Richard E., Ed.; Smith, Linda C., Ed.

    Like the first two editions, this third edition is designed primarily to provide the beginning student of library and information science with an overview both of the concepts and processes behind today's reference services and of the most important sources consulted in answering common types of reference questions. The first 12 chapters deal with…

  4. Mission IM-Possible: Starting an Instant Message Reference Service Using Trillian

    ERIC Educational Resources Information Center

    Ciocco, Ronalee; Huff, Alice

    2007-01-01

    The authors, both of whom are working at the Musselman library, relate how they are always looking for ways to improve and update their reference service. When they introduced a chat reference service to their online library services, they received only four inquiries in the entire year 2002-2003. The authors found out later that the paltry number…

  5. Chat Reference Training after One Decade: The Results of a National Survey of Academic Libraries

    ERIC Educational Resources Information Center

    Devine, Christopher; Paladino, Emily Bounds; Davis, John A.

    2011-01-01

    The first comprehensive national survey of all academic libraries in the United States which were conducting chat reference service was carried out to determine: what practices were being used to prepare personnel for chat reference service, what competencies were being taught, how and why training practices may have changed over time, and what…

  6. Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies.

    PubMed

    DeMaere, Matthew Z; Darling, Aaron E

    2018-02-01

    Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing.

  7. The Gap in Standards for Special Libraries.

    ERIC Educational Resources Information Center

    Dodd, James Beaupre

    1982-01-01

    The issue of standards for special libraries is discussed, highlighting surveys conducted concerning the diversity of special libraries and salaries of members of the Special Libraries Association (SLA). Efforts of SLA's Standards and Statistics Committee are noted. Twenty references are listed. (EJS)

  8. RNA-Seq analysis to capture the transcriptome landscape of a single cell

    PubMed Central

    Tang, Fuchou; Barbacioru, Catalin; Nordman, Ellen; Xu, Nanlan; Bashkirov, Vladimir I; Lao, Kaiqin; Surani, M. Azim

    2013-01-01

    We describe here a protocol for digital transcriptome analysis in a single mouse blastomere using a deep sequencing approach. An individual blastomere was first isolated and put into lysate buffer by mouth pipette. Reverse transcription was then performed directly on the whole cell lysate. After this, the free primers were removed by Exonuclease I and a poly(A) tail was added to the 3′ end of the first-strand cDNA by Terminal Deoxynucleotidyl Transferase. Then the single cell cDNAs were amplified by 20 plus 9 cycles of PCR. Then 100-200 ng of these amplified cDNAs were used to construct a sequencing library. The sequencing library can be used for deep sequencing using the SOLiD system. Compared with the cDNA microarray technique, our assay can capture up to 75% more genes expressed in early embryos. The protocol can generate deep sequencing libraries within 6 days for 16 single cell samples. PMID:20203668

  9. FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.

    PubMed

    Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver

    2014-06-14

    Advances in sequencing technologies challenge the efficient importing and validation of FASTA formatted sequence data which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy is hampered. FastaValidator represents a platform-independent, standardized, light-weight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software which needs to parse quickly and accurately large amounts of sequence data. For end-users FastaValidator includes an interactive out-of-the-box validation of FASTA formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualifies it for large data sets such as those commonly produced by massive parallel (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA formatted sequence data.
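
    To make concrete what parsing and validating FASTA input involves, here is a minimal Python sketch. It is not FastaValidator's implementation or API (that library is written in Java): the function name, the accepted alphabet, and the specific rules are assumptions chosen for illustration.

```python
# Assumed alphabet for this sketch: IUPAC nucleotide codes plus the gap character.
NUCLEOTIDES = frozenset("ACGTUNRYKMSWBDHV-")


def parse_fasta(lines, alphabet=NUCLEOTIDES):
    """Parse FASTA lines into (header, sequence) pairs, raising ValueError on bad input.

    Rules enforced here (assumptions): every record starts with a non-empty '>' header,
    has at least one sequence character, and uses only symbols from `alphabet`.
    """
    records = []
    header, chunks = None, []

    def flush():
        # Close the record currently being built.
        if not chunks:
            raise ValueError(f"record {header!r} has no sequence")
        records.append((header, "".join(chunks)))

    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are tolerated in this sketch
        if line.startswith(">"):
            if header is not None:
                flush()
            header, chunks = line[1:].strip(), []
            if not header:
                raise ValueError(f"line {lineno}: empty header")
        else:
            if header is None:
                raise ValueError(f"line {lineno}: sequence data before first header")
            seq = line.upper()
            bad = set(seq) - alphabet
            if bad:
                raise ValueError(f"line {lineno}: invalid symbols {sorted(bad)}")
            chunks.append(seq)

    if header is None:
        raise ValueError("no FASTA records found")
    flush()
    return records


records = parse_fasta([">seq1 demo", "ACGT", "acgn", "", ">seq2", "TTTT"])
```

    Because the function accepts any iterable of lines, an open file handle can be passed directly, so large files are read line by line rather than loaded whole.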

  10. Generation of a total of 6483 expressed sequence tags from 60 day-old bovine whole fetus and fetal placenta.

    PubMed

    Oishi, M; Gohma, H; Lejukole, H Y; Taniguchi, Y; Yamada, T; Suzuki, K; Shinkai, H; Uenishi, H; Yasue, H; Sasaki, Y

    2004-05-01

    Expressed sequence tags (ESTs) generated based on characterization of clones isolated randomly from cDNA libraries are used to study gene expression profiles in specific tissues and to provide useful information for characterizing tissue physiology. In this study, two directionally cloned cDNA libraries were constructed from 60 day-old bovine whole fetus and fetal placenta. We have characterized 5357 and 1126 clones, and then identified 3464 and 795 unique sequences for the fetus and placenta cDNA libraries: 1851 and 504 showed homology to already identified genes, and 1613 and 291 showed no significant matches to any of the sequences in DNA databases, respectively. Further, we found 94 unique sequences overlapping in both the fetus and the placenta, leading to a catalog of 4165 genes expressed in 60 day-old fetus and placenta. The catalog is used to examine expression profile of genes in 60 day-old bovine fetus and placenta.

  11. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nierman, William C.

    At TIGR, human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were carried out with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp, for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.

  12. Planning Hospital Library Quarters: References to Help the Librarian

    PubMed Central

    Hayne, Frances

    1965-01-01

    When a hospital planned an addition that would allow library expansion, the librarian looked into relevant literature for information on what improvements she should request for the new library. Shortly she was reading not for self-instruction alone but also to strengthen her credentials for membership on the planning team. The bibliography which resulted has been annotated and, by means of an index, classified. Topics examined in the twenty-five references range from library standards and the writing of a significant building program to attainment of happy collaboration between librarian and architect, space relationships designed to facilitate work flow, planned flexibility for the sake of the future, and heating, lighting, decoration, library equipment, and furniture. PMID:14306022

  13. Pepitome: evaluating improved spectral library search for identification complementarity and quality assessment

    PubMed Central

    Dasari, Surendra; Chambers, Matthew C.; Martinez, Misti A.; Carpenter, Kristin L.; Ham, Amy-Joan L.; Vega-Montoto, Lorenzo J.; Tabb, David L.

    2012-01-01

    Spectral libraries have emerged as a viable alternative to protein sequence databases for peptide identification. These libraries contain previously detected peptide sequences and their corresponding tandem mass spectra (MS/MS). Search engines can then identify peptides by comparing experimental MS/MS scans to those in the library. Many of these algorithms employ the dot product score for measuring the quality of a spectrum-spectrum match (SSM). This scoring system does not offer a clear statistical interpretation and ignores fragment ion m/z discrepancies in the scoring. We developed a new spectral library search engine, Pepitome, which employs statistical systems for scoring SSMs. Pepitome outperformed the leading library search tool, SpectraST, when analyzing data sets acquired on three different mass spectrometry platforms. We characterized the reliability of spectral library searches by confirming shotgun proteomics identifications through RNA-Seq data. Applying spectral library and database searches on the same sample revealed their complementary nature. Pepitome identifications enabled the automation of quality analysis and quality control (QA/QC) for shotgun proteomics data acquisition pipelines. PMID:22217208

  14. A method for high-throughput production of sequence-verified DNA libraries and strain collections.

    PubMed

    Smith, Justin D; Schlecht, Ulrich; Xu, Weihong; Suresh, Sundari; Horecka, Joe; Proctor, Michael J; Aiyar, Raeka S; Bennett, Richard A O; Chu, Angela; Li, Yong Fuga; Roy, Kevin; Davis, Ronald W; Steinmetz, Lars M; Hyman, Richard W; Levy, Sasha F; St Onge, Robert P

    2017-02-13

    The low costs of array-synthesized oligonucleotide libraries are empowering rapid advances in quantitative and synthetic biology. However, high synthesis error rates, uneven representation, and lack of access to individual oligonucleotides limit the true potential of these libraries. We have developed a cost-effective method called Recombinase Directed Indexing (REDI), which involves integration of a complex library into yeast, site-specific recombination to index library DNA, and next-generation sequencing to identify desired clones. We used REDI to generate a library of ~3,300 DNA probes that exhibited > 96% purity and remarkable uniformity (> 95% of probes within twofold of the median abundance). Additionally, we created a collection of ~9,000 individually accessible CRISPR interference yeast strains for > 99% of genes required for either fermentative or respiratory growth, demonstrating the utility of REDI for rapid and cost-effective creation of strain collections from oligonucleotide pools. Our approach is adaptable to any complex DNA library, and fundamentally changes how these libraries can be parsed, maintained, propagated, and characterized. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.

  15. Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity

    PubMed Central

    Hurst, Gregory D.D.

    2017-01-01

    High throughput (or ‘next generation’) sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology, and (through comparison of genomes), reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and ‘contaminating’ material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these ‘contaminations’ provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries of honey bee (Apis sp.) DNA sequencing project for the presence of microbial sequences, and find sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that ‘contamination’ in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses. PMID:28717593

  16. Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity.

    PubMed

    Gerth, Michael; Hurst, Gregory D D

    2017-01-01

    High throughput (or 'next generation') sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology, and (through comparison of genomes), reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and 'contaminating' material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these 'contaminations' provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries of honey bee (Apis sp.) DNA sequencing project for the presence of microbial sequences, and find sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that 'contamination' in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses.

  17. Transcriptome Analysis of Gene Expression during Chinese Water Chestnut Storage Organ Formation

    PubMed Central

    Chen, Sainan; Wang, Yan; Yu, Meizhen; Chen, Xuehao; Li, Liangjun; Yin, Jingjing

    2016-01-01

    The product organ (storage organ; corm) of the Chinese water chestnut has become a very popular food in Asian countries because of its unique nutritional value. Corm formation is a complex biological process, and extensive whole genome analysis of transcripts during corm development has not been carried out. In this study, four corm libraries at different developmental stages were constructed, and gene expression was identified using a high-throughput tag sequencing technique. Approximately 4.9 million tags were sequenced, and 4,371,386, 4,372,602, 4,782,494, and 5,276,540 clean tags, including 119,676, 110,701, 100,089, and 101,239 distinct tags, respectively, were obtained after removal of low-quality tags from each library. More than 39% of the distinct tags were unambiguous and could be mapped to reference genes, while 40% were unambiguous tag-mapped genes. After mapping their functions in existing databases, a total of 11,592, 10,949, 10,585, and 7,111 genes were annotated from the B1, B2, B3, and B4 libraries, respectively. Analysis of the differentially expressed genes (DEGs) in B1/B2, B2/B3, and B3/B4 libraries showed that most of the DEGs at the B1/B2 stages were involved in carbohydrate and hormone metabolism, while the majority of DEGs were involved in energy metabolism and carbohydrate metabolism at the B2/B3 and B3/B4 stages. All of the upregulated transcription factors and 9 important genes related to product organ formation in the above four stages were also identified. The expression changes of nine of the identified DEGs were validated using a quantitative PCR approach. This study provides a comprehensive understanding of gene expression during corm formation in the Chinese water chestnut. PMID:27716802

  18. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections.

    PubMed

    Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe; Avarre, Jean-Christophe

    2016-01-01

    Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.

  19. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections

    PubMed Central

    Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe

    2016-01-01

    Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3. PMID:27703859

  20. Solution-Phase Synthesis of a Tricyclic Pyrrole-2-Carboxamide Discovery Library Applying a Stetter-Paal-Knorr Reaction Sequence

    PubMed Central

    Iyer, Pravin S.; Fodor, Matthew D.; Coleman, Claire M.; Twining, Leslie A.; Mitasev, Branko

    2012-01-01

    The solution phase synthesis of a discovery library of 178 tricyclic pyrrole-2-carboxamides was accomplished in nine steps and seven purifications starting with three benzoyl protected amino acid methyl esters. Further diversity was introduced by two glyoxaldehydes and forty-one primary amines. The combination of Pauson-Khand, Stetter and microwave assisted Paal Knorr reactions was applied as a key sequence. The discovery library was designed with the help of QikProp 2.1 and physicochemical data are presented for all pyrroles. Library members were synthesized and purified in parallel and analyzed by LC-MS. Selected compounds were fully characterized. PMID:16677007

  1. Generation of Synthetic Copolymer Libraries by Combinatorial Assembly on Nucleic Acid Templates.

    PubMed

    Kong, Dehui; Yeung, Wayland; Hili, Ryan

    2016-07-11

    Recent advances in nucleic acid-templated copolymerization have expanded the scope of sequence-controlled synthetic copolymers beyond the molecular architectures witnessed in nature. This has enabled the power of molecular evolution to be applied to synthetic copolymer libraries to evolve molecular function ranging from molecular recognition to catalysis. This Review seeks to summarize different approaches available to generate sequence-defined monodispersed synthetic copolymer libraries using nucleic acid-templated polymerization. Key concepts and principles governing nucleic acid-templated polymerization, as well as the fidelity of various copolymerization technologies, will be described. The Review will focus on methods that enable the combinatorial generation of copolymer libraries and their molecular evolution for desired function.

  2. Preparation of fosmid libraries and functional metagenomic analysis of microbial community DNA.

    PubMed

    Martínez, Asunción; Osburne, Marcia S

    2013-01-01

    One of the most important challenges in contemporary microbial ecology is to assign a functional role to the large number of novel genes discovered through large-scale sequencing of natural microbial communities that lack similarity to genes of known function. Functional screening of metagenomic libraries, that is, screening environmental DNA clones for the ability to confer an activity of interest to a heterologous bacterial host, is a promising approach for bridging the gap between metagenomic DNA sequencing and functional characterization. Here, we describe methods for isolating environmental DNA and constructing metagenomic fosmid libraries, as well as methods for designing and implementing successful functional screens of such libraries. © 2013 Elsevier Inc. All rights reserved.

  3. DNA-Encoded Solid-Phase Synthesis: Encoding Language Design and Complex Oligomer Library Synthesis.

    PubMed

    MacConnell, Andrew B; McEnaney, Patrick J; Cavett, Valerie J; Paegel, Brian M

    2015-09-14

    The promise of exploiting combinatorial synthesis for small molecule discovery remains unfulfilled due primarily to the "structure elucidation problem": the back-end mass spectrometric analysis that significantly restricts one-bead-one-compound (OBOC) library complexity. The very molecular features that confer binding potency and specificity, such as stereochemistry, regiochemistry, and scaffold rigidity, are conspicuously absent from most libraries because isomerism introduces mass redundancy and diverse scaffolds yield uninterpretable MS fragmentation. Here we present DNA-encoded solid-phase synthesis (DESPS), comprising parallel compound synthesis in organic solvent and aqueous enzymatic ligation of unprotected encoding dsDNA oligonucleotides. Computational encoding language design yielded 148 thermodynamically optimized sequences with Hamming string distance ≥ 3 and total read length <100 bases for facile sequencing. Ligation is efficient (70% yield), specific, and directional over 6 encoding positions. A series of isomers served as a testbed for DESPS's utility in split-and-pool diversification. Single-bead quantitative PCR detected 9 × 10^4 molecules/bead and sequencing allowed for elucidation of each compound's synthetic history. We applied DESPS to the combinatorial synthesis of a 75,645-member OBOC library containing scaffold, stereochemical and regiochemical diversity using mixed-scale resin (160-μm quality control beads and 10-μm screening beads). Tandem DNA sequencing/MALDI-TOF MS analysis of 19 quality control beads showed excellent agreement (<1 ppt) between DNA sequence-predicted mass and the observed mass. DESPS synergistically unites the advantages of solid-phase synthesis and DNA encoding, enabling single-bead structural elucidation of complex compounds and synthesis using reactions normally considered incompatible with unprotected DNA. The widespread availability of inexpensive oligonucleotide synthesis, enzymes, DNA sequencing, and PCR makes implementation of DESPS straightforward, and may prompt the chemistry community to revisit the synthesis of more complex and diverse libraries.
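
    The Hamming-distance constraint mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' design procedure: it only shows a greedy filter that keeps candidate codes differing from every previously kept code in at least three positions, and it ignores the thermodynamic optimization the abstract also describes. The function names and the toy candidate list are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def greedy_code_set(candidates, min_dist=3):
    """Keep each candidate that is at least min_dist away from all kept codes."""
    chosen = []
    for cand in candidates:
        if all(hamming(cand, kept) >= min_dist for kept in chosen):
            chosen.append(cand)
    return chosen


def min_pairwise_distance(codes):
    """Smallest Hamming distance over all pairs in a code set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))


# Toy candidates (hypothetical, far shorter than real encoding sequences).
codes = greedy_code_set(["AAAA", "AAAT", "TTTA", "CCCC"], min_dist=3)
```

    With a minimum distance of 3, any single substitution error in a read still leaves it closer to the intended code than to any other code in the set, which is the usual motivation for such a constraint.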

  4. Generation of Aptamers from A Primer-Free Randomized ssDNA Library Using Magnetic-Assisted Rapid Aptamer Selection

    NASA Astrophysics Data System (ADS)

    Tsao, Shih-Ming; Lai, Ji-Ching; Horng, Horng-Er; Liu, Tu-Chen; Hong, Chin-Yih

    2017-04-01

    Aptamers are oligonucleotides that can bind to specific target molecules. Most aptamers are generated using random libraries in the standard systematic evolution of ligands by exponential enrichment (SELEX). Each random library contains oligonucleotides with a randomized central region and two fixed primer regions at both ends. The fixed primer regions are necessary for amplifying target-bound sequences by PCR. However, these extra-sequences may cause non-specific bindings, which potentially interfere with good binding for random sequences. The Magnetic-Assisted Rapid Aptamer Selection (MARAS) is a newly developed protocol for generating single-strand DNA aptamers. No repeat selection cycle is required in the protocol. This study proposes and demonstrates a method to isolate aptamers for C-reactive proteins (CRP) from a randomized ssDNA library containing no fixed sequences at 5′ and 3′ termini using the MARAS platform. Furthermore, the isolated primer-free aptamer was sequenced and binding affinity for CRP was analyzed. The specificity of the obtained aptamer was validated using blind serum samples. The result was consistent with monoclonal antibody-based nephelometry analysis, which indicated that a primer-free aptamer has high specificity toward targets. MARAS is a feasible platform for efficiently generating primer-free aptamers for clinical diagnoses.

  5. Specifications of a Mechanized Center for Information Services for a Public Library Reference Center. Final Report. Part 1, Preliminary Specification: Mechanized Information Services in Public Library Reference Centers.

    ERIC Educational Resources Information Center

    California Univ., Los Angeles. Inst. of Library Research.

    This document presents preliminary specifications for a library-based Center for Information Services (CIS). Four sets of issues are covered: (1) data base inventory, providing a listing of magnetic tape data bases now available from national sources or soon to be so; (2) administrative issues, including the organization of the CIS within the…

  6. The National Electronic Library: A Guide to the Future for Library Managers. The Greenwood Library Management Collection.

    ERIC Educational Resources Information Center

    Pitkin, Gary M., Ed.

    As a reference guide for library professionals, this volume helps librarians prepare for the future in the growing electronic environment by examining the historical and theoretical background of the National Electronic Library and assessing the role of libraries in the past, present, and future. The book is divided in two sections: "The…

  7. The Consequences of Extending a Country's Library Legislation to the Inclusion of Academic Libraries, with Special Reference to Nigeria.

    ERIC Educational Resources Information Center

    Nwoye, S. C.

    An aspect of library legislation which is generally ignored is legislation that promotes the utilization of academic libraries to maximize the potential of a nation's resources. From the available literature it would seem that library legislation in developing nations still conforms strictly to the traditional view that library legislation should…

  8. Rapid and Easy Protocol for Quantification of Next-Generation Sequencing Libraries.

    PubMed

    Hawkins, Steve F C; Guest, Paul C

    2018-01-01

    The emergence of next-generation sequencing (NGS) over the last 10 years has increased the efficiency of DNA sequencing in terms of speed, ease, and price. However, the exact quantification of a NGS library is crucial in order to obtain good data on sequencing platforms developed by the current market leader Illumina. Different approaches for DNA quantification are available currently and the most commonly used are based on analysis of the physical properties of the DNA through spectrophotometric or fluorometric methods. Although these methods are technically simple, they do not allow exact quantification as can be achieved using a real-time quantitative PCR (qPCR) approach. A qPCR protocol for DNA quantification with applications in NGS library preparation studies is presented here. This can be applied in various fields of study such as medical disorders resulting from nutritional programming disturbances.

  9. Who, What and How? Commentary on Cheney, F. N. (1963) the Teaching of Reference in American Library Schools. (Journal of Education for Librarianship, 3(3), 188-198)

    ERIC Educational Resources Information Center

    Smith, Linda C.

    2015-01-01

    Cheney's paper was the first major article on the subject of teaching reference in American library schools. In the article, she addressed three questions: (1) Who is teaching reference?; (2) What is being taught?; and (3) How is it being taught? In this article, Linda Smith discusses how the teaching of reference has changed in the more than 50…

  10. This I Believe...All Libraries Should Be Teaching Libraries

    ERIC Educational Resources Information Center

    Palmer, Catherine

    2011-01-01

    In this article, I imagine a library that prioritizes teaching users how to find, evaluate, and use information over the traditional library public service activities of collection development, access to materials, and reference services. If I ran the library, all services would support end-user education. (Contains 1 graph and 1 note.)

  11. Law Libraries as Special Libraries: An Educational Model.

    ERIC Educational Resources Information Center

    Hazelton, Penny A.

    1993-01-01

    Summarizes the history of the law library profession and the development of the educational model for law librarians in light of the particular demands and needs of corporate and law firm libraries. Guidelines of the American Association of Law Libraries for graduate programs in law librarianship are discussed. (Contains 17 references.) (LRW)

  12. The Emergence of Libraries in the Sultanate of Oman.

    ERIC Educational Resources Information Center

    Karim, Bakri Musa A.

    1991-01-01

    Describes developments in library services that took place in Oman from 1970-90 and discusses the current status of library development. Topics discussed include the rapid social and economic development in Oman, the lack of human and physical resources, the lack of a national library, and deficiencies in school libraries. (five references) (LRW)

  13. Federal Library Programs for Acquisition of Foreign Materials.

    ERIC Educational Resources Information Center

    Cylke, Frank Kurt

    Sixteen libraries representing those agencies holding membership on the Federal Library Committee were surveyed to determine library foreign language or imprint holdings, acquisitions techniques, procedures and/or problems. Specific questions, relating to holdings, staff, budget and the acquisition, processing, reference and translation services…

  14. Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

    PubMed Central

    Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their proteins products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392

  15. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    PubMed

    Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their proteins products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  16. Population structure of pigs determined by single nucleotide polymorphisms observed in assembled expressed sequence tags.

    PubMed

    Matsumoto, Toshimi; Okumura, Naohiko; Uenishi, Hirohide; Hayashi, Takeshi; Hamasima, Noriyuki; Awata, Takashi

    2012-01-01

    We have collected more than 190000 porcine expressed sequence tags (ESTs) from full-length complementary DNA (cDNA) libraries and identified more than 2800 single nucleotide polymorphisms (SNPs). In this study, we tentatively chose 222 SNPs observed in assembled ESTs to study pigs of different breeds; 104 were selected by comparing the cDNA sequences of a Meishan pig and samples of three-way cross pigs (Landrace, Large White, and Duroc: LWD), and 118 were selected from LWD samples. To evaluate the genetic variation between the chosen SNPs from pig breeds, we determined the genotypes for 192 pig samples (11 pig groups) from our DNA reference panel with matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Of the 222 reference SNPs, 186 were successfully genotyped. A neighbor-joining tree showed that the pig groups were classified into two large clusters, namely, Euro-American and East Asian pig populations. F-statistics and the analysis of molecular variance of Euro-American pig groups revealed that approximately 25% of the genetic variations occurred because of intergroup differences. As the F_IS values were less than the F_ST values, the clustering, based on the Bayesian inference, implied that there was strong genetic differentiation among pig groups and less divergence within the groups in our samples. © 2011 The Authors. Animal Science Journal © 2011 Japanese Society of Animal Science.

  17. Terminal sequence importance of de novo proteins from binary-patterned library: stable artificial proteins with 11- or 12-amino acid alphabet.

    PubMed

    Okura, Hiromichi; Takahashi, Tsuyoshi; Mihara, Hisakazu

    2012-06-01

    Successful approaches of de novo protein design suggest a great potential to create novel structural folds and to understand natural rules of protein folding. For these purposes, smaller and simpler de novo proteins have been developed. Here, we constructed smaller proteins by removing the terminal sequences from stable de novo vTAJ proteins and compared stabilities between mutant and original proteins. vTAJ proteins were screened from an α3β3 binary-patterned library which was designed with polar/nonpolar periodicities of α-helix and β-sheet. vTAJ proteins have the additional terminal sequences due to the method of constructing the genetically repeated library sequences. By removing the parts of the sequences, we successfully obtained the stable smaller de novo protein mutants with fewer amino acid alphabets than the originals. However, these mutants showed the differences on ANS binding properties and stabilities against denaturant and pH change. The terminal sequences, which were designed just as flexible linkers not as secondary structure units, sufficiently affected these physicochemical details. This study showed implications for adjusting protein stabilities by designing N- and C-terminal sequences.

  18. A DNA barcode library for ground beetles of Germany: the genus Amara Bonelli, 1810 (Insecta, Coleoptera, Carabidae)

    PubMed Central

    Raupach, Michael J.; Hannig, Karsten; Moriniére, Jérôme; Hendrich, Lars

    2018-01-01

    Abstract The genus Amara Bonelli, 1810 is a very speciose and taxonomically difficult genus of the Carabidae. The identification of many of the species is accomplished with considerable difficulty, in particular for females and immature stages. In this study the effectiveness of DNA barcoding, the most popular method for molecular species identification, was examined to discriminate various species of this genus from Central Europe. DNA barcodes from 690 individuals and 47 species were analysed, including sequences from previous studies and more than 350 newly generated DNA barcodes. Our analysis revealed unique BINs for 38 species (81%). Interspecific K2P distances below 2.2% were found for three species pairs and one species trio, including haplotype sharing between Amara alpina/Amara torrida and Amara communis/Amara convexior/Amara makolskii. This study represents another step in generating an extensive reference library of DNA barcodes for carabids, highly valuable bioindicators for characterizing disturbances in various habitats. PMID:29853775
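    The K2P (Kimura two-parameter) distances quoted in the abstract above can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and skips any site that is not A, C, G, or T in both; the function name `k2p_distance` is hypothetical and is not taken from the study, which does not describe its software here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration: sites with gaps or ambiguity
    codes are ignored, and the sequences must already be aligned.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); math.log or math.sqrt
    # raises ValueError when the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    Under these assumptions, a pair of barcodes would fall below the 2.2% figure mentioned above when `k2p_distance(a, b) < 0.022`.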

  19. An in vivo library-versus-library selection of optimized protein-protein interactions.

    PubMed

    Pelletier, J N; Arndt, K M; Plückthun, A; Michnick, S W

    1999-07-01

    We describe a rapid and efficient in vivo library-versus-library screening strategy for identifying optimally interacting pairs of heterodimerizing polypeptides. Two leucine zipper libraries, semi-randomized at the positions adjacent to the hydrophobic core, were genetically fused to either one of two designed fragments of the enzyme murine dihydrofolate reductase (mDHFR), and cotransformed into Escherichia coli. Interaction between the library polypeptides reconstituted enzymatic activity of mDHFR, allowing bacterial growth. Analysis of the resulting colonies revealed important biases in the zipper sequences relative to the original libraries, which are consistent with selection for stable, heterodimerizing pairs. Using more weakly associating mDHFR fragments, we increased the stringency of selection. We enriched the best-performing leucine zipper pairs by multiple passaging of the pooled, selected colonies in liquid culture, as the best pairs allowed for better bacterial propagation. This competitive growth allowed small differences among the pairs to be amplified, and different sequence positions were enriched at different rates. We applied these selection processes to a library-versus-library sample of 2.0 x 10(6) combinations and selected a novel leucine zipper pair that may be appropriate for use in further in vivo heterodimerization strategies.

  20. Towards a full reference library of MS(n) spectra. Testing of a library containing 3126 MS2 spectra of 1743 compounds.

    PubMed

    Milman, Boris L

    2005-01-01

    A library consisting of 3766 MS(n) spectra of 1743 compounds, including 3126 MS2 spectra acquired mainly using ion trap (IT) and triple-quadrupole (QqQ) instruments, was composed of numerous collections/sources. Ionization techniques were mainly electrospray ionization and also atmospheric pressure chemical ionization and chemical ionization. The library was tested for the performance in identification of unknowns, and in this context this work is believed to be the largest of all known tests of product-ion mass spectral libraries. The MS2 spectra of the same compounds from different collections were in turn divided into spectra of 'unknown' and reference compounds. For each particular compound, library searches were performed resulting in selection by taking into account the best matches for each spectral collection/source. Within each collection/source, replicate MS2 spectra differed in the collision energy used. Overall, there were up to 950 search results giving the best match factors and their ranks in corresponding hit lists. In general, the correct answers were obtained as the 1st rank in up to 60% of the search results when retrieved with (on average) 2.2 'unknown' and 6.2 reference replicates per compound. With two or more replicates of both 'unknown' and reference spectra (the average numbers of replicates were 4.0 and 7.8, respectively), the fraction of correct answers in the 1st rank increased to 77%. This value is close to the performance of established electron ionization mass spectra libraries (up to 79%) found by other workers. The hypothesis that MS2 spectra better match reference spectra acquired using the same type of tandem mass spectrometer (IT or QqQ) was neither strongly proved nor rejected here. The present work shows that MS2 spectral libraries containing sufficiently numerous different entries for each compound are sufficiently efficient for identification of unknowns and suitable for use with different tandem mass spectrometers. 
    Copyright 2005 John Wiley & Sons, Ltd.

  1. Library Information Resource Book For Staff.

    ERIC Educational Resources Information Center

    Potts, Ken; And Others

    This guide is the Northern Illinois University (NIU) Libraries' quick reference tool for providing information about its collections, facilities, and services. The articles are arranged in an alphabetic, dictionary format with numerous cross-references, and highlight information on the following: administrative offices; company annual reports;…

  2. Interbreeding among deeply divergent mitochondrial lineages in the American cockroach (Periplaneta americana)

    NASA Astrophysics Data System (ADS)

    von Beeren, Christoph; Stoeckle, Mark Y.; Xia, Joyce; Burke, Griffin; Kronauer, Daniel J. C.

    2015-02-01

    DNA barcoding promises to be a useful tool to identify pest species assuming adequate representation of genetic variants in a reference library. Here we examined mitochondrial DNA barcodes in a global urban pest, the American cockroach (Periplaneta americana). Our sampling effort generated 284 cockroach specimens, most from New York City, plus 15 additional U.S. states and six other countries, enabling the first large-scale survey of P. americana barcode variation. Periplaneta americana barcode sequences (n = 247, including 24 GenBank records) formed a monophyletic lineage separate from other Periplaneta species. We found three distinct P. americana haplogroups with relatively small differences within (<=0.6%) and larger differences among groups (2.4%-4.7%). This could be interpreted as indicative of multiple cryptic species. However, nuclear DNA sequences (n = 77 specimens) revealed extensive gene flow among mitochondrial haplogroups, confirming a single species. This unusual genetic pattern likely reflects multiple introductions from genetically divergent source populations, followed by interbreeding in the invasive range. Our findings highlight the need for comprehensive reference databases in DNA barcoding studies, especially when dealing with invasive populations that might be derived from multiple genetically distinct source populations.

  3. The Libraries of Rio.

    ERIC Educational Resources Information Center

    Foster, Barbara

    1988-01-01

    Describes aspects of several libraries in Rio de Janeiro. Topics covered include library policies, budgets, periodicals and books in the collections, classification schemes used, and literary areas of interest to patrons. (6 references) (CLB)

  4. A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries

    PubMed Central

    2011-01-01

    Genome targeting methods enable cost-effective capture of specific subsets of the genome for sequencing. We present here an automated, highly scalable method for carrying out the Solution Hybrid Selection capture approach that provides a dramatic increase in scale and throughput of sequence-ready libraries produced. Significant process improvements and a series of in-process quality control checkpoints are also added. These process improvements can also be used in a manual version of the protocol. PMID:21205303

  5. Bacteremia due to Moraxella atlantae in a cancer patient.

    PubMed

    De Baere, Thierry; Muylaert, An; Everaert, Els; Wauters, Georges; Claeys, Geert; Verschraegen, Gerda; Vaneechoutte, Mario

    2002-07-01

    A gram-negative alkaline phosphatase- and pyrrolidone peptidase-positive rod-shaped bacterium (CCUG 45702) was isolated from two aerobic blood cultures from a female cancer patient. No identification could be reached using phenotypic techniques. Amplification of the tRNA intergenic spacers revealed fragments with lengths of 116, 133, and 270 bp, but no such pattern was present in our reference library. Sequencing of the 16S rRNA gene revealed its identity as Moraxella atlantae, a species isolated only rarely and published only once as causing infection. In retrospect, the phenotypic characteristics fit the identification as M. atlantae (formerly known as CDC group M-3). Comparative 16S rRNA sequence analysis indicates that M. atlantae, M. lincolnii, and M. osloensis might constitute three separate genera within the Moraxellaceae. After treatment with amoxicillin-clavulanic acid for 2 days, fever subsided and the patient was dismissed.

  6. Bacteremia Due to Moraxella atlantae in a Cancer Patient

    PubMed Central

    De Baere, Thierry; Muylaert, An; Everaert, Els; Wauters, Georges; Claeys, Geert; Verschraegen, Gerda; Vaneechoutte, Mario

    2002-01-01

    A gram-negative alkaline phosphatase- and pyrrolidone peptidase-positive rod-shaped bacterium (CCUG 45702) was isolated from two aerobic blood cultures from a female cancer patient. No identification could be reached using phenotypic techniques. Amplification of the tRNA intergenic spacers revealed fragments with lengths of 116, 133, and 270 bp, but no such pattern was present in our reference library. Sequencing of the 16S rRNA gene revealed its identity as Moraxella atlantae, a species isolated only rarely and published only once as causing infection. In retrospect, the phenotypic characteristics fit the identification as M. atlantae (formerly known as CDC group M-3). Comparative 16S rRNA sequence analysis indicates that M. atlantae, M. lincolnii, and M. osloensis might constitute three separate genera within the Moraxellaceae. After treatment with amoxicillin-clavulanic acid for 2 days, fever subsided and the patient was dismissed. PMID:12089312

  7. A simple and novel method for RNA-seq library preparation of single cell cDNA analysis by hyperactive Tn5 transposase.

    PubMed

    Brouilette, Scott; Kuersten, Scott; Mein, Charles; Bozek, Monika; Terry, Anna; Dias, Kerith-Rae; Bhaw-Rosun, Leena; Shintani, Yasunori; Coppen, Steven; Ikebe, Chiho; Sawhney, Vinit; Campbell, Niall; Kaneko, Masahiro; Tano, Nobuko; Ishida, Hidekazu; Suzuki, Ken; Yashiro, Kenta

    2012-10-01

    Deep sequencing of single cell-derived cDNAs offers novel insights into oncogenesis and embryogenesis. However, traditional library preparation for RNA-seq analysis requires multiple steps with consequent sample loss and stochastic variation at each step significantly affecting output. Thus, a simpler and better protocol is desirable. The recently developed hyperactive Tn5-mediated library preparation, which brings high quality libraries, is likely one of the solutions. Here, we tested the applicability of hyperactive Tn5-mediated library preparation to deep sequencing of single cell cDNA, optimized the protocol, and compared it with the conventional method based on sonication. This new technique does not require any expensive or special equipment, which secures wider availability. A library was constructed from only 100 ng of cDNA, which enables the saving of precious specimens. Only a few steps of robust enzymatic reaction resulted in saved time, enabling more specimens to be prepared at once, and with a more reproducible size distribution among the different specimens. The obtained RNA-seq results were comparable to the conventional method. Thus, this Tn5-mediated preparation is applicable for anyone who aims to carry out deep sequencing for single cell cDNAs. Copyright © 2012 Wiley Periodicals, Inc.

  8. A part toolbox to tune genetic expression in Bacillus subtilis

    PubMed Central

    Guiziou, Sarah; Sauveplane, Vincent; Chang, Hung-Ju; Clerté, Caroline; Declerck, Nathalie; Jules, Matthieu; Bonnet, Jerome

    2016-01-01

    Libraries of well-characterised components regulating gene expression levels are essential to many synthetic biology applications. While widely available for the Gram-negative model bacterium Escherichia coli, such libraries are lacking for the Gram-positive model Bacillus subtilis, a key organism for basic research and biotechnological applications. Here, we engineered a genetic toolbox comprising libraries of promoters, Ribosome Binding Sites (RBS), and protein degradation tags to precisely tune gene expression in B. subtilis. We first designed a modular Expression Operating Unit (EOU) facilitating parts assembly and modifications and providing a standard genetic context for gene circuits implementation. We then selected native, constitutive promoters of B. subtilis and efficient RBS sequences from which we engineered three promoters and three RBS sequence libraries exhibiting ∼14 000-fold dynamic range in gene expression levels. We also designed a collection of SsrA proteolysis tags of variable strength. Finally, by using fluorescence fluctuation methods coupled with two-photon microscopy, we quantified the absolute concentration of GFP in a subset of strains from the library. Our complete promoters and RBS sequences library comprising over 135 constructs enables tuning of GFP concentration over five orders of magnitude, from 0.05 to 700 μM. This toolbox of regulatory components will support many research and engineering applications in B. subtilis. PMID:27402159

  9. Virtual Reference Canada (VRC): A Canadian Service in a Multicultural Environment.

    ERIC Educational Resources Information Center

    Gaudet, Franceen; Savard, Nicolas

    Virtual Reference Canada (VRC) is a digital reference service using World Wide Web technology. It was initiated by the National Library of Canada (NLC) in spring 2001 and went into test mode at the start of 2002. It draws on the contribution of a wide range of Canadian libraries and allied institutions. The development of VRC owes a great deal to…

  10. Fatty acid-oxidizing consortia along a nutrient gradient in the Florida Everglades.

    PubMed

    Chauhan, Ashvini; Ogram, Andrew

    2006-04-01

    The Florida Everglades is one of the largest freshwater marshes in North America and has been subject to eutrophication for decades. A gradient in P concentrations extends for several kilometers into the interior of the northern regions of the marsh, and the structure and function of soil microbial communities vary along the gradient. In this study, stable isotope probing was employed to investigate the fate of carbon from the fermentation products propionate and butyrate in soils from three sites along the nutrient gradient. For propionate microcosms, 16S rRNA gene clone libraries from eutrophic and transition sites were dominated by sequences related to previously described propionate oxidizers, such as Pelotomaculum spp. and Syntrophobacter spp. Significant representation was also observed for sequences related to Smithella propionica, which dismutates propionate to butyrate. Sequences of dominant phylotypes from oligotrophic samples did not cluster with known syntrophs but with sulfate-reducing prokaryotes (SRP) and Pelobacter spp. In butyrate microcosms, sequences clustering with Syntrophospora spp. and Syntrophomonas spp. dominated eutrophic microcosms, and sequences related to Pelospora dominated the transition microcosm. Sequences related to Pelospora spp. and SRP dominated clone libraries from oligotrophic microcosms. Sequences from diverse bacterial phyla and primary fermenters were also present in most libraries. Archaeal sequences from eutrophic microcosms included sequences characteristic of Methanomicrobiaceae, Methanospirillaceae, and Methanosaetaceae. Oligotrophic microcosms were dominated by acetotrophs, including sequences related to Methanosarcina, suggesting accumulation of acetate.

  11. Novel encoding methods for DNA-templated chemical libraries.

    PubMed

    Li, Gang; Zheng, Wenlu; Liu, Ying; Li, Xiaoyu

    2015-06-01

    Among various types of DNA-encoded chemical libraries, DNA-templated library takes advantage of the sequence-specificity of DNA hybridization, enabling not only highly effective DNA-templated chemical reactions, but also high fidelity in library encoding. This brief review summarizes recent advances that have been made on the encoding strategies for DNA-templated libraries, and it also highlights their respective advantages and limitations for the preparation of DNA-encoded libraries. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. West Asian Special Libraries and Information Centers.

    ERIC Educational Resources Information Center

    Harvey, John F.

    Special libraries are defined in this paper as those libraries serving such institutions as government offices, private corporations, associations, and university departments. Information centers are similar to special libraries but provide personalized, high quality reference service, usually in science and technology, and often using mechanical…

  13. Evaluating High School Libraries: Service Is Top Priority.

    ERIC Educational Resources Information Center

    Baldwin, Margaret

    1988-01-01

    Discusses the need for ongoing evaluation within high school libraries to ensure adequate library services, which, in turn, enhance the total educational program. The evaluation of library facilities, collections, and staff is discussed, and an annotated bibliography of evaluation tools is provided. (6 references) (CLB)

  14. A novel library-independent approach based on high-throughput cultivation in Bioscreen and fingerprinting by FTIR spectroscopy for microbial source tracking in food industry.

    PubMed

    Shapaval, V; Møretrø, T; Wold Åsli, A; Suso, H P; Schmitt, J; Lillehaug, D; Kohler, A

    2017-05-01

    Microbiological source tracking (MST) for food industry is a rapidly growing area of research and technology development. In this paper, a new library-independent approach for MST is presented. It is based on a high-throughput liquid microcultivation and FTIR spectroscopy. In this approach, FTIR spectra obtained from micro-organisms isolated along the production line and a product are compared to each other. We tested and evaluated the new source tracking approach by simulating a source tracking situation. In this simulation study, a selection of 20 spoilage mould strains from a total of six genera (Alternaria, Aspergillus, Mucor, Paecilomyces, Peyronellaea and Phoma) was used. The simulation of the source tracking situation showed that 80-100% of the sources could be correctly identified with respect to genus/species level. When performing source tracking simulations, the FTIR identification diverged for a Phoma glomerata strain in the reference collection. When reidentifying the strain by sequencing, it turned out that the strain was a Peyronellaea arachidicola. The obtained results demonstrated that the proposed approach is a versatile tool for identifying sources of microbial contamination. Thus, it has a high potential for routine control in the food industry due to low costs and analysis time. The source tracking of fungal contamination in the food industry is an important aspect of food safety. Currently, all available methods are time consuming and require the use of a reference library that may limit the accuracy of the identification. In this study, we report for the first time, a library-independent FTIR spectroscopic approach for MST of fungal contamination along the food production line. It combines high-throughput microcultivation and FTIR spectroscopy and is specific on the genus and species level. Therefore, such an approach possesses great importance for food safety control in food industry. © 2016 The Society for Applied Microbiology.

  15. NERMLS: The First Year *

    PubMed Central

    Colby, Charles C.; Bloomquist, Harold; Hodges, T. Mark

    1969-01-01

    The Countway Library, Boston, was the nation's first Regional Medical Library under the Regional Medical Library Program of the NLM. New England Regional Medical Library Service (NERMLS) began in October 1967 and is the outgrowth of traditional extramural services of the Harvard and Boston Medical Libraries (constituents of the Countway). During the first year over 27,000 requests were received of which 84 percent were filled. Some problems of document delivery (and their solution) are recounted. Other activities were: a limited amount of reference work; distribution of a Serials List; and planning for a region-wide medical library service. Proposals call for consultation and education, regional reference service, and improved document delivery service. Emphasis is placed on the role of the Community Hospital as a center for continuing education and the need to strengthen and assist hospital medical libraries. With the Postgraduate Medical Institute, Boston, NERMLS assisted in the compilation of a small physician-selected medical Core Collection which would serve as a minimum standard collection for community hospital libraries. PMID:5823504

  16. Using bibliometrics to demonstrate the value of library journal collections

    PubMed Central

    Belter, Christopher W; Kaske, Neal K

    2016-01-01

    Although cited reference studies are common in the library and information science literature, they are rarely performed in non-academic institutions or in the atmospheric and oceanic sciences. In this paper, we analyze over 400,000 cited references made by authors affiliated with the National Oceanic and Atmospheric Administration between 2009 and 2013. Our results suggest that these methods can be applied to research libraries in a variety of institutions, that the results of analyses performed at one institution may not be applicable to other institutions, and that cited reference analyses should be periodically updated to reflect changes in authors’ referencing behavior. PMID:27453584

  17. Peer education in the commons: a new approach to reference services.

    PubMed

    Neal, Ruth E; Ajamie, Lauren F; Harmon, Karen D; Kellerby, Carissa D; Schweikhard, April J

    2010-10-01

    In planning for a new library construction project for the University of Oklahoma-Tulsa, graduate students enrolled in the University of Oklahoma (OU) School of Library and Information Studies collaborated in an innovative effort to develop a commons-based reference service. By first considering a philosophical approach to the need for a commons, blending in the experiences of other libraries that have created similar spaces, and focusing on the workflow issues likely to be encountered by the graduate assistants staffing the commons itself, this planning team developed an uncommon peer-to-peer approach to reference and education services, one focused on the patron as student.

  18. Kmerind: A Flexible Parallel Library for K-mer Indexing of Biological Sequences on Distributed Memory Systems.

    PubMed

    Pan, Tony; Flick, Patrick; Jain, Chirag; Liu, Yongchao; Aluru, Srinivas

    2017-10-09

    Counting and indexing fixed length substrings, or k-mers, in biological sequences is a key step in many bioinformatics tasks including genome alignment and mapping, genome assembly, and error correction. While advances in next generation sequencing technologies have dramatically reduced the cost and improved latency and throughput, few bioinformatics tools can efficiently process the datasets at the current generation rate of 1.8 terabases every 3 days. We present Kmerind, a high performance parallel k-mer indexing library for distributed memory environments. The Kmerind library provides a set of simple and consistent APIs with sequential semantics and parallel implementations that are designed to be flexible and extensible. Kmerind's k-mer counter performs similarly or better than the best existing k-mer counting tools even on shared memory systems. In a distributed memory environment, Kmerind counts k-mers in a 120 GB sequence read dataset in less than 13 seconds on 1024 Xeon CPU cores, and fully indexes their positions in approximately 17 seconds. Querying for 1% of the k-mers in these indices can be completed in 0.23 seconds and 28 seconds, respectively. Kmerind is the first k-mer indexing library for distributed memory environments, and the first extensible library for general k-mer indexing and counting. Kmerind is available at https://github.com/ParBLiSS/kmerind.
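    The two operations named in the abstract above, counting k-mers and indexing their positions, can be sketched on a single sequence in a few lines. This is a single-process illustration only, not the Kmerind API (Kmerind is a parallel library for distributed memory); the function names `count_kmers` and `index_kmers` are hypothetical.

```python
from collections import Counter, defaultdict


def count_kmers(sequence, k):
    """Count every overlapping substring of length k (k-mer) in a sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence shorter than k yields an empty range and an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def index_kmers(sequence, k):
    """Map each k-mer to the sorted list of 0-based positions where it starts."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return dict(positions)
```

    For example, `count_kmers("ACGTACGT", 3)` reports the 3-mer `ACG` twice, and `index_kmers("ACGTACGT", 3)["ACG"]` gives its start positions 0 and 4. A distributed implementation would additionally have to partition the k-mers across processes, which this sketch does not attempt.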

  19. Single Cell Transcriptomics of Hypothalamic Warm Sensitive Neurons that Control Core Body Temperature and Fever Response

    PubMed Central

    Eberwine, James; Bartfai, Tamas

    2011-01-01

    We report on an ‘unbiased’ molecular characterization of individual, adult neurons, active in a central, anterior hypothalamic neuronal circuit, by establishing cDNA libraries from each individual, electrophysiologically identified warm sensitive neuron (WSN). The cDNA libraries were analyzed by Affymetrix microarray. The presence and frequency of cDNAs was confirmed and enhanced with Illumina sequencing of each single cell cDNA library. cDNAs encoding the GABA biosynthetic enzyme GAD1 and adrenomedullin, galanin, prodynorphin, somatostatin, and tachykinin were found in the WSNs. The functional cellular and in vivo studies on dozens of the more than 500 neurotransmitter and hormone receptors and ion channels, whose cDNA was identified and sequence confirmed, suggest little or no discrepancy between the transcriptional and functional data in WSNs; whenever agonists were available for a receptor whose cDNA was identified, a functional response was found. Sequencing single neuron libraries permitted identification of rarely expressed receptors like the insulin receptor, adiponectin receptor 2 and of receptor heterodimers; information that is lost when pooling cells leads to dilution of signals and mixing signals. Despite the common electrophysiological phenotype and uniform GAD1 expression, WSN transcriptomes show heterogeneity, suggesting strong epigenetic influence on the transcriptome. Our study suggests that it is well worth interrogating the cDNA libraries of single neurons by sequencing and chipping. PMID:20970451

  20. Metagenomics: Probing pollutant fate in natural and engineered ecosystems.

    PubMed

    Bouhajja, Emna; Agathos, Spiros N; George, Isabelle F

    2016-12-01

    Polluted environments are a reservoir of microbial species able to degrade or to convert pollutants to harmless compounds. The proper management of microbial resources requires a comprehensive characterization of their genetic pool to assess the fate of contaminants and increase the efficiency of bioremediation processes. Metagenomics offers appropriate tools to describe microbial communities in their whole complexity without lab-based cultivation of individual strains. After a decade of use of metagenomics to study microbiomes, the scientific community has made significant progress in this field. In this review, we survey the main steps of metagenomics applied to environments contaminated with organic compounds or heavy metals. We emphasize technical solutions proposed to overcome encountered obstacles. We then compare two metagenomic approaches, i.e. library-based targeted metagenomics and direct sequencing of metagenomes. In the former, environmental DNA is cloned inside a host, and then clones of interest are selected based on (i) their expression of biodegradative functions or (ii) sequence homology with probes and primers designed from relevant, already known sequences. The highest score for the discovery of novel genes and degradation pathways has been achieved so far by functional screening of large clone libraries. On the other hand, direct sequencing of metagenomes without a cloning step has been more often applied to polluted environments for characterization of the taxonomic and functional composition of microbial communities and their dynamics. In this case, the analysis has focused on 16S rRNA genes and marker genes of biodegradation. Advances in next generation sequencing and in bioinformatic analysis of sequencing data have opened up new opportunities for assessing the potential of biodegradation by microbes, but annotation of collected genes is still hampered by a limited number of available reference sequences in databases. 
    Although metagenomics is still facing technical and computational challenges, our review of the recent literature highlights its value as an aid to efficiently monitor the clean-up of contaminated environments and develop successful strategies to mitigate the impact of pollutants on ecosystems. Copyright © 2016 Elsevier Inc. All rights reserved.
