Lyons, Eli; Sheridan, Paul; Tremmel, Georg; Miyano, Satoru; Sugano, Sumio
2017-10-24
High-throughput screens allow for the identification of specific biomolecules with characteristics of interest. In barcoded screens, DNA barcodes are linked to target biomolecules so that the target molecules making up a library can be identified by sequencing the DNA barcodes with next-generation sequencing. To be useful in experimental settings, the DNA barcodes in a library must satisfy certain constraints related to GC content, homopolymer length, Hamming distance, and blacklisted subsequences. Here we report a novel framework to quickly generate large-scale libraries of DNA barcodes for use in high-throughput screens. We show that our framework dramatically reduces the computation time required to generate large-scale DNA barcode libraries, compared with a naïve approach to DNA barcode library generation. As a proof of concept, we demonstrate that our framework is able to generate a library consisting of one million DNA barcodes for use in a fragment antibody phage display screening experiment. We also report generating a general-purpose library of one billion DNA barcodes, the largest such library yet reported in the literature. Our results demonstrate the value of our novel large-scale DNA barcode library generation framework for use in high-throughput screening applications.
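The four constraints named in this abstract can be illustrated with a minimal sketch of the kind of naïve approach the authors compare against: draw random candidates and reject any that violate a constraint. This is not the authors' framework. The thresholds (GC range, maximum homopolymer run, minimum Hamming distance) and all function names are assumptions chosen for illustration only.

```python
import itertools
import random


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer(seq):
    """Length of the longest run of identical consecutive bases."""
    return max(len(list(group)) for _, group in itertools.groupby(seq))


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def is_valid(seq, gc_range=(0.4, 0.6), max_run=2, blacklist=()):
    """Check the per-barcode constraints (assumed thresholds, not from the paper)."""
    low, high = gc_range
    return (
        low <= gc_fraction(seq) <= high
        and max_homopolymer(seq) <= max_run
        and not any(bad in seq for bad in blacklist)
    )


def naive_library(n, length, min_dist=3, seed=0, max_tries=100_000, **constraints):
    """Rejection sampling: keep a random candidate only if it passes every
    per-barcode constraint and is at least min_dist away from all accepted
    barcodes. The pairwise check makes the cost grow quadratically with n."""
    rng = random.Random(seed)
    library = []
    for _ in range(max_tries):
        if len(library) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if is_valid(cand, **constraints) and all(
            hamming(cand, kept) >= min_dist for kept in library
        ):
            library.append(cand)
    return library


if __name__ == "__main__":
    for barcode in naive_library(5, 10, blacklist=("GGATCC",)):
        print(barcode)
```

The pairwise Hamming check against every accepted barcode is what makes this approach slow for libraries of a million or more barcodes, which is the scale the abstract targets.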
Droplet Digital™ PCR Next-Generation Sequencing Library QC Assay.
Heredia, Nicholas J
2018-01-01
Digital PCR is a valuable tool to quantify next-generation sequencing (NGS) libraries precisely and accurately. Accurately quantifying NGS libraries enables accurate loading of the libraries onto the sequencer and thus improves sequencing performance by reducing under- and overloading errors. Accurate quantification also benefits users by enabling uniform loading of indexed/barcoded libraries, which in turn greatly improves sequencing uniformity of the indexed/barcoded samples. The advantages gained by employing the Droplet Digital PCR (ddPCR™) library QC assay include precise and accurate quantification in addition to size quality assessment, enabling users to QC their sequencing libraries with confidence.
Trebitz, Anett S; Hoffman, Joel C; Grant, George W; Billehus, Tyler M; Pilgrim, Erik M
2015-07-22
DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.
Rozenberg, Andrey; Leese, Florian; Weiss, Linda C; Tollrian, Ralph
2016-01-01
Tag-Seq is a high-throughput approach used for discovering SNPs and characterizing gene expression. In comparison to RNA-Seq, Tag-Seq eases data processing and allows detection of rare mRNA species using only one tag per transcript molecule. However, reduced library complexity raises the issue of PCR duplicates, which distort gene expression levels. Here we present a novel Tag-Seq protocol that uses the least biased methods for RNA library preparation combined with a novel approach for joint PCR template and sample labeling. In our protocol, input RNA is fragmented by hydrolysis, and poly(A)-bearing RNAs are selected and directly ligated to mixed DNA-RNA P5 adapters. The P5 adapters contain i5 barcodes composed of sample-specific (moderately) degenerate base regions (mDBRs), which later allow detection of PCR duplicates. The P7 adapter is attached via reverse transcription with individual i7 barcodes added during the amplification step. The resulting libraries can be sequenced on an Illumina sequencer. After sample demultiplexing and PCR duplicate removal with a free software tool we designed, the data are ready for downstream analysis. Our protocol was tested on RNA samples from predator-induced and control Daphnia microcrustaceans.
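The idea of a sample-specific, moderately degenerate barcode can be sketched as follows: each sample is assigned a pattern written in IUPAC ambiguity codes, and an observed barcode is attributed to a sample only if it matches exactly one pattern. The observed bases at the degenerate positions then vary between template molecules, which is what allows PCR duplicates to be recognized later. The patterns, sample names, and function names below are hypothetical, and this is not the software tool the authors describe.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(pattern, observed):
    """True if every observed base is allowed by the IUPAC code at that position."""
    return len(pattern) == len(observed) and all(
        base in IUPAC[code] for code, base in zip(pattern, observed)
    )


def assign_sample(observed, sample_patterns):
    """Return the single sample whose pattern matches, or None if the
    barcode matches no pattern or is ambiguous between several."""
    hits = [name for name, pattern in sample_patterns.items()
            if matches(pattern, observed)]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    # Hypothetical patterns: four degenerate positions plus a fixed suffix.
    patterns = {"control": "RYRYAC", "induced": "YRYRGT"}
    for barcode in ("ACGTAC", "GTACAC", "CATGGT", "TTTTTT"):
        print(barcode, assign_sample(barcode, patterns))
```

In this toy design, "ACGTAC" and "GTACAC" both belong to the same sample but carry different degenerate bases, so reads bearing them would be treated as originating from distinct template molecules.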
Kim, Sungmin; Song, Kyo-Hong; Ree, Han-Il; Kim, Won
2012-01-01
Non-biting midges (Diptera: Chironomidae) are a diverse population that commonly causes respiratory allergies in humans. Chironomid larvae can be used to indicate freshwater pollution, but accurate identification on the basis of morphological characteristics is difficult. In this study, we constructed a mitochondrial cytochrome c oxidase subunit I (COI)-based DNA barcode library for Korean chironomids. This library consists of 211 specimens from 49 species, including adults and unidentified larvae. The interspecies and intraspecies COI sequence variations were analyzed. Sophisticated indexes were developed in order to properly evaluate indistinct barcode gaps that are created by insufficient sampling on both the interspecies and intraspecies levels and by variable mutation rates across taxa. In a variety of insect datasets, these indexes were useful for re-evaluating large barcode datasets and for defining COI barcode gaps. The COI-based DNA barcode library will provide a rapid and reliable tool for the molecular identification of Korean chironomid species. Furthermore, this reverse-taxonomic approach will be improved by the continuous addition of other species' sequences to the library. PMID:22138764
Chen, Bo-Ruei; Hale, Devin C; Ciolek, Peter J; Runge, Kurt W
2012-05-03
Barcodes are unique DNA sequence tags that can be used to specifically label individual mutants. The barcode-tagged open reading frame (ORF) haploid deletion mutant collections in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe allow for high-throughput mutant phenotyping because the relative growth of mutants in a population can be determined by monitoring the proportions of their associated barcodes. While these mutant collections have greatly facilitated genome-wide studies, mutations in essential genes are not present, and the roles of these genes are not as easily studied. To further support genome-scale research in S. pombe, we generated a barcode-tagged fission yeast insertion mutant library that has the potential of generating viable mutations in both essential and non-essential genes and can be easily analyzed using standard molecular biological techniques. An insertion vector containing a selectable ura4+ marker and a random barcode was used to generate a collection of 10,000 fission yeast insertion mutants stored individually in 384-well plates and as six pools of mixed mutants. Individual barcodes are flanked by Sfi I recognition sites and can be oligomerized in a unique orientation to facilitate barcode sequencing. Independent genetic screens on a subset of mutants suggest that this library contains a diverse collection of single insertion mutations. We present several approaches to determine insertion sites. This collection of S. pombe barcode-tagged insertion mutants is well-suited for genome-wide studies. Because insertion mutations may eliminate, reduce or alter the function of essential and non-essential genes, this library will contain strains with a wide range of phenotypes that can be assayed by their associated barcodes. The design of the barcodes in this library allows for barcode sequencing using next generation or standard benchtop cloning approaches.
Zhou, X.; Robinson, J.L.; Geraci, C.J.; Parker, C.R.; Flint, O.S.; Etnier, D.A.; Ruiter, D.; DeWalt, R.E.; Jacobus, L.M.; Hebert, P.D.N.
2011-01-01
Deoxyribonucleic acid (DNA) barcoding is an effective tool for species identification and lifestage association in a wide range of animal taxa. We developed a strategy for rapid construction of a regional DNA-barcode reference library and used the caddisflies (Trichoptera) of the Great Smoky Mountains National Park (GSMNP) as a model. Nearly 1000 cytochrome c oxidase subunit I (COI) sequences, representing 209 caddisfly species previously recorded from GSMNP, were obtained from the global Trichoptera Barcode of Life campaign. Most of these sequences were collected from outside the GSMNP area. Another 645 COI sequences, representing 80 species, were obtained from specimens collected in a 3-d bioblitz (short-term, intense sampling program) in GSMNP. The joint collections provided barcode coverage for 212 species, 91% of the GSMNP fauna. Inclusion of samples from other localities greatly expedited construction of the regional DNA-barcode reference library. This strategy increased intraspecific divergence and decreased average distances to nearest neighboring species, but the DNA-barcode library was able to differentiate 93% of the GSMNP Trichoptera species examined. Global barcoding projects will aid construction of regional DNA-barcode libraries, but local surveys make crucial contributions to progress by contributing rare or endemic species and full-length barcodes generated from high-quality DNA. DNA taxonomy is not a goal of our present work, but the investigation of COI divergence patterns in caddisflies is providing new insights into broader biodiversity patterns in this group and has directed attention to various issues, ranging from the need to re-evaluate species taxonomy with integrated morphological and molecular evidence to the necessity of an appropriate interpretation of barcode analyses and its implications in understanding species diversity (in contrast to a simple claim for barcoding failure).
Davidsson, Marcus; Diaz-Fernandez, Paula; Schwich, Oliver D.; Torroba, Marcos; Wang, Gang; Björklund, Tomas
2016-01-01
Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time-consuming effort that only allows for hypothesis-driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing together with highly efficient parallel oligonucleotide synthesis and cloning techniques have, however, opened up entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode-labelled plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high-diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, a diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up a multitude of in vivo assessments, from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms, and we here validate the newly presented message passing clustering process named Starcode. PMID:27874090
2012-01-01
Background: Barcodes are unique DNA sequence tags that can be used to specifically label individual mutants. The barcode-tagged open reading frame (ORF) haploid deletion mutant collections in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe allow for high-throughput mutant phenotyping because the relative growth of mutants in a population can be determined by monitoring the proportions of their associated barcodes. While these mutant collections have greatly facilitated genome-wide studies, mutations in essential genes are not present, and the roles of these genes are not as easily studied. To further support genome-scale research in S. pombe, we generated a barcode-tagged fission yeast insertion mutant library that has the potential of generating viable mutations in both essential and non-essential genes and can be easily analyzed using standard molecular biological techniques. Results: An insertion vector containing a selectable ura4+ marker and a random barcode was used to generate a collection of 10,000 fission yeast insertion mutants stored individually in 384-well plates and as six pools of mixed mutants. Individual barcodes are flanked by Sfi I recognition sites and can be oligomerized in a unique orientation to facilitate barcode sequencing. Independent genetic screens on a subset of mutants suggest that this library contains a diverse collection of single insertion mutations. We present several approaches to determine insertion sites. Conclusions: This collection of S. pombe barcode-tagged insertion mutants is well-suited for genome-wide studies. Because insertion mutations may eliminate, reduce or alter the function of essential and non-essential genes, this library will contain strains with a wide range of phenotypes that can be assayed by their associated barcodes. The design of the barcodes in this library allows for barcode sequencing using next generation or standard benchtop cloning approaches. PMID:22554201
Kuzmina, Maria L; Braukmann, Thomas W A; Fazekas, Aron J; Graham, Sean W; Dewaard, Stephanie L; Rodrigues, Anuar; Bennett, Bruce A; Dickinson, Timothy A; Saarela, Jeffery M; Catling, Paul M; Newmaster, Steven G; Percy, Diana M; Fenneman, Erin; Lauron-Moreau, Aurélien; Ford, Bruce; Gillespie, Lynn; Subramanyam, Ragupathy; Whitton, Jeannette; Jennings, Linda; Metsger, Deborah; Warne, Connor P; Brown, Allison; Sears, Elizabeth; Dewaard, Jeremy R; Zakharov, Evgeny V; Hebert, Paul D N
2017-12-01
Constructing complete, accurate plant DNA barcode reference libraries can be logistically challenging for large-scale floras. Here we demonstrate the promise and challenges of using herbarium collections for building a DNA barcode reference library for the vascular plant flora of Canada. Our study examined 20,816 specimens representing 5076 of 5190 vascular plant species in Canada (98%). For 98% of the specimens, at least one of the DNA barcode regions was recovered from the plastid loci rbcL and matK and from the nuclear ITS2 region. We used beta regression to quantify the effects of age, type of preservation, and taxonomic affiliation (family) on DNA sequence recovery. Specimen age and method of preservation had significant effects on sequence recovery for all markers, but influenced some families more (e.g., Boraginaceae) than others (e.g., Asteraceae). Our DNA barcode library represents an unparalleled resource for metagenomic and ecological genetic research working on temperate and arctic biomes. An observed decline in sequence recovery with specimen age may be associated with poor primer matches, intragenomic variation (for ITS2), or inhibitory secondary compounds in some taxa.
Kuzmina, Maria L.; Braukmann, Thomas W. A.; Fazekas, Aron J.; Graham, Sean W.; Dewaard, Stephanie L.; Rodrigues, Anuar; Bennett, Bruce A.; Dickinson, Timothy A.; Saarela, Jeffery M.; Catling, Paul M.; Newmaster, Steven G.; Percy, Diana M.; Fenneman, Erin; Lauron-Moreau, Aurélien; Ford, Bruce; Gillespie, Lynn; Subramanyam, Ragupathy; Whitton, Jeannette; Jennings, Linda; Metsger, Deborah; Warne, Connor P.; Brown, Allison; Sears, Elizabeth; Dewaard, Jeremy R.; Zakharov, Evgeny V.; Hebert, Paul D. N.
2017-01-01
Premise of the study: Constructing complete, accurate plant DNA barcode reference libraries can be logistically challenging for large-scale floras. Here we demonstrate the promise and challenges of using herbarium collections for building a DNA barcode reference library for the vascular plant flora of Canada. Methods: Our study examined 20,816 specimens representing 5076 of 5190 vascular plant species in Canada (98%). For 98% of the specimens, at least one of the DNA barcode regions was recovered from the plastid loci rbcL and matK and from the nuclear ITS2 region. We used beta regression to quantify the effects of age, type of preservation, and taxonomic affiliation (family) on DNA sequence recovery. Results: Specimen age and method of preservation had significant effects on sequence recovery for all markers, but influenced some families more (e.g., Boraginaceae) than others (e.g., Asteraceae). Discussion: Our DNA barcode library represents an unparalleled resource for metagenomic and ecological genetic research working on temperate and arctic biomes. An observed decline in sequence recovery with specimen age may be associated with poor primer matches, intragenomic variation (for ITS2), or inhibitory secondary compounds in some taxa. PMID:29299394
Chen, He; Yao, Jiacheng; Fu, Yusi; Pang, Yuhong; Wang, Jianbin; Huang, Yanyi
2018-04-11
Next-generation sequencing (NGS) technologies have evolved rapidly and been applied to various research fields, but they often lose long-range information because of short library size and read length. Here, we develop a simple, cost-efficient, and versatile NGS library preparation method, called tagmentation on microbeads (TOM). This method is capable of recovering long-range information through tagmentation mediated by microbead-immobilized transposomes. Using transposomes with DNA barcodes to identically label adjacent sequences during tagmentation, we can restore the inter-read connections between fragments of the original DNA molecule through fragment-barcode linkage after sequencing. In our proof-of-principle experiment, more than 4.5% of the reads are linked with their adjacent reads, and the longest linkage is over 1112 bp. We demonstrate TOM with eight barcodes, but the number of barcodes can be scaled up by an ultrahigh complexity construction. We also show that this method has low amplification bias and is well suited to identifying copy number variations.
Xu, Chao; Dong, Wenpan; Shi, Shuo; Cheng, Tao; Li, Changhao; Liu, Yanlei; Wu, Ping; Wu, Hongkun; Gao, Peng; Zhou, Shiliang
2015-11-01
A well-covered reference library is crucial for successful identification of species by DNA barcoding. The biggest difficulty in building such a reference library is the lack of specimen material. Herbarium collections are potentially an enormous resource of materials. In this study, we demonstrate that it is feasible to build such reference libraries using the reconstructed (self-primed PCR amplified) DNA from herbarium specimens. We used 179 rosaceous specimens to test the effects of DNA reconstruction, 420 randomly sampled specimens to estimate the usable percentage and another 223 specimens of true cherries (Cerasus, Rosaceae) to test how well the usable specimens cover the species. The barcodes rbcLb (the central four-sevenths of the rbcL gene) and matK were each amplified in two halves and sequenced on Roche GS 454 FLX+. DNA from the herbarium specimens was typically shorter than 300 bp. DNA reconstruction enabled amplification of fragments of 400-500 bp without introducing any sequence errors. About one-third of specimens in the national herbarium of China (PE) were proven usable after DNA reconstruction. The specimens in PE cover all Chinese true cherry species and 91.5% of vascular species listed in Flora of China. It is therefore feasible to build well-covered reference libraries for DNA barcoding of vascular species in China. As exemplified in this study, DNA reconstruction and DNA-labelled next-generation sequencing can accelerate the construction of local reference libraries. By putting the local reference libraries together, a global library for DNA barcoding becomes closer to reality. © 2015 John Wiley & Sons Ltd.
Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E
2016-06-20
Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
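The consensus step described in this abstract can be sketched as grouping reads by molecular barcode and taking a per-position majority vote, so that errors present in only a minority of reads from one original molecule are voted out. The minimum family size and agreement threshold below are assumptions for illustration, not values from the paper, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def consensus_reads(reads, min_family_size=3, min_agreement=0.6):
    """Collapse reads sharing a molecular barcode into one consensus read.

    reads: iterable of (barcode, sequence) pairs.
    Families smaller than min_family_size are dropped. A position is
    reported as 'N' when the most common base does not reach
    min_agreement (both thresholds are assumed values).
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        length = min(len(s) for s in seqs)  # compare only the shared prefix
        bases = []
        for i in range(length):
            base, count = Counter(s[i] for s in seqs).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus


if __name__ == "__main__":
    example = [
        ("BC1", "ACGT"), ("BC1", "ACGT"), ("BC1", "ACTT"),  # one read with an error
        ("BC2", "GGGG"),                                    # family too small
    ]
    print(consensus_reads(example))
```

A variant that appears in the consensus of a family is supported by most reads from that molecule, which is why this kind of collapsing lowers background noise relative to calling variants on raw reads.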
Sonet, Gontran; Jordaens, Kurt; Braet, Yves; Bourguignon, Luc; Dupont, Eréna; Backeljau, Thierry; De Meyer, Marc; Desmyter, Stijn
2013-01-01
Fly larvae living on corpses can be used to estimate post-mortem intervals. The identification of these flies is decisive in forensic casework and can be facilitated by using DNA barcodes provided that a representative and comprehensive reference library of DNA barcodes is available. We constructed a local (Belgium and France) reference library of 85 sequences of the COI DNA barcode fragment (mitochondrial cytochrome c oxidase subunit I gene), from 16 fly species of forensic interest (Calliphoridae, Muscidae, Fanniidae). This library was then used to evaluate the ability of two public libraries (GenBank and the Barcode of Life Data Systems – BOLD) to identify specimens from Belgian and French forensic cases. The public libraries indeed allow a correct identification of most specimens. Yet, some of the identifications remain ambiguous and some forensically important fly species are not, or insufficiently, represented in the reference libraries. Several search options offered by GenBank and BOLD can be used to further improve the identifications obtained from both libraries using DNA barcodes. PMID:24453564
Using single nuclei for RNA-seq to capture the transcriptome of postmortem neurons
Krishnaswami, Suguna Rani; Grindberg, Rashel V; Novotny, Mark; Venepally, Pratap; Lacar, Benjamin; Bhutani, Kunal; Linker, Sara B; Pham, Son; Erwin, Jennifer A; Miller, Jeremy A; Hodge, Rebecca; McCarthy, James K; Kelder, Martin; McCorrison, Jamison; Aevermann, Brian D; Fuertes, Francisco Diez; Scheuermann, Richard H; Lee, Jun; Lein, Ed S; Schork, Nicholas; McConnell, Michael J; Gage, Fred H; Lasken, Roger S
2016-01-01
A protocol is described for sequencing the transcriptome of a cell nucleus. Nuclei are isolated from specimens and sorted by FACS, cDNA libraries are constructed and RNA-seq is performed, followed by data analysis. Some steps follow published methods (Smart-seq2 for cDNA synthesis and Nextera XT barcoded library preparation) and are not described in detail here. Previous single-cell approaches for RNA-seq from tissues include cell dissociation using protease treatment at 30 °C, which is known to alter the transcriptome. We isolate nuclei at 4 °C from tissue homogenates, which causes minimal damage. Nuclear transcriptomes can be obtained from postmortem human brain tissue stored at −80 °C, making brain archives accessible for RNA-seq from individual neurons. The method also allows investigation of biological features unique to nuclei, such as enrichment of certain transcripts and precursors of some noncoding RNAs. By following this procedure, it takes about 4 d to construct cDNA libraries that are ready for sequencing. PMID:26890679
A DNA Barcode Library for North American Pyraustinae (Lepidoptera: Pyraloidea: Crambidae).
Yang, Zhaofu; Landry, Jean-François; Hebert, Paul D N
2016-01-01
Although members of the crambid subfamily Pyraustinae are frequently important crop pests, their identification is often difficult because many species lack conspicuous diagnostic morphological characters. DNA barcoding employs sequence diversity in a short standardized gene region to facilitate specimen identifications and species discovery. This study provides a DNA barcode reference library for North American pyraustines based upon the analysis of 1589 sequences recovered from 137 nominal species, 87% of the fauna. Data from 125 species were barcode compliant (>500bp, <1% n), and 99 of these taxa formed a distinct cluster that was assigned to a single BIN. The other 26 species were assigned to 56 BINs, reflecting frequent cases of deep intraspecific sequence divergence and a few instances of barcode sharing, creating a total of 155 BINs. Two systems for OTU designation, ABGD and BIN, were examined to check the correspondence between current taxonomy and sequence clusters. The BIN system performed better than ABGD in delimiting closely related species, while OTU counts with ABGD were influenced by the value employed for relative gap width. Different species with low or no interspecific divergence may represent cases of unrecognized synonymy, whereas those with high intraspecific divergence require further taxonomic scrutiny as they may involve cryptic diversity. The barcode library developed in this study will also help to advance understanding of relationships among species of Pyraustinae.
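The compliance criterion quoted in this abstract (>500 bp, <1% n) is simple enough to express directly. The sketch below applies exactly those two thresholds to a single sequence; the function name is hypothetical, and treating lowercase and uppercase N identically is an assumption.

```python
def is_barcode_compliant(seq, min_length=500, max_n_fraction=0.01):
    """Apply the two thresholds quoted in the abstract:
    strictly longer than min_length bases and strictly less than
    max_n_fraction ambiguous (N) bases."""
    seq = seq.upper()
    if len(seq) <= min_length:
        return False
    return seq.count("N") / len(seq) < max_n_fraction


if __name__ == "__main__":
    clean = "ACGT" * 150                 # 600 bp, no ambiguous bases
    noisy = "N" * 10 + "ACGT" * 150      # 610 bp, about 1.6% N
    print(is_barcode_compliant(clean), is_barcode_compliant(noisy))
```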
Je, a versatile suite to handle multiplexed NGS libraries with unique molecular identifiers.
Girardot, Charles; Scholtalbers, Jelle; Sauer, Sajoscha; Su, Shu-Yi; Furlong, Eileen E M
2016-10-08
The yield obtained from next-generation sequencers has increased almost exponentially in recent years, making sample multiplexing common practice. While barcodes (known sequences of fixed length) primarily encode the sample identity of sequenced DNA fragments, barcodes made of random sequences (Unique Molecular Identifiers, or UMIs) are often used to distinguish between PCR duplicates and transcript abundance in, for example, single-cell RNA sequencing (scRNA-seq). In paired-end sequencing, different barcodes can be inserted at each fragment end to either increase the number of multiplexed samples in the library or to use one of the barcodes as a UMI. Alternatively, UMIs can be combined with the sample barcodes into composite barcodes, or with standard Illumina® indexing. Subsequent analysis must take read duplicates and sample identity into account, by identifying UMIs. Existing tools do not support these complex barcoding configurations, and custom code development is frequently required. Here, we present Je, a suite of tools that accommodates complex barcoding strategies, extracts UMIs and filters read duplicates taking UMIs into account. Using Je on publicly available scRNA-seq and iCLIP data containing UMIs, the number of unique reads increased by up to 36% compared with when UMIs are ignored. Je is implemented in Java and uses the Picard API. Code, executables and documentation are freely available at http://gbcs.embl.de/Je. Je can also be easily installed in Galaxy through the Galaxy toolshed.
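Why taking UMIs into account raises the number of unique reads can be shown with a small sketch: reads that align to the same position are collapsed as duplicates when UMIs are ignored, but are kept apart when their UMIs differ. This is generic illustrative Python, not Je's implementation or command-line interface. The composite barcode layout (sample barcode followed by UMI at the start of the read) and all names are assumptions.

```python
def split_composite(read_seq, sample_len, umi_len):
    """Split a read that starts with a composite barcode into
    (sample barcode, UMI, remaining insert). The layout is an assumption."""
    sample = read_seq[:sample_len]
    umi = read_seq[sample_len:sample_len + umi_len]
    insert = read_seq[sample_len + umi_len:]
    return sample, umi, insert


def count_unique(alignments, use_umi):
    """Count distinct reads among (chrom, pos, strand, umi) tuples.

    Without UMIs, reads sharing chrom/pos/strand are treated as one;
    with UMIs, they are merged only if the UMI is also identical."""
    keys = set()
    for chrom, pos, strand, umi in alignments:
        keys.add((chrom, pos, strand, umi) if use_umi else (chrom, pos, strand))
    return len(keys)


if __name__ == "__main__":
    alignments = [
        ("chr1", 100, "+", "AAC"),
        ("chr1", 100, "+", "AAC"),  # true PCR duplicate
        ("chr1", 100, "+", "GTT"),  # same position, different molecule
        ("chr2", 5, "-", "AAC"),
    ]
    print(count_unique(alignments, use_umi=False))
    print(count_unique(alignments, use_umi=True))
```

Real tools typically also tolerate sequencing errors within the UMI when comparing reads; that refinement is omitted here for brevity.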
Caruccio, Nicholas
2011-01-01
DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.
NASA Astrophysics Data System (ADS)
Leray, M.; Boehm, J. T.; Mills, S. C.; Meyer, C. P.
2012-06-01
Identifying species involved in consumer-resource interactions is one of the main limitations in the construction of food webs. DNA barcoding of prey items in predator guts provides a valuable tool for characterizing trophic interactions, but the method relies on the availability of reference sequences to which prey sequences can be matched. In this study, we demonstrate that the COI sequence library of the Moorea BIOCODE project, an ecosystem-level barcode initiative, enables the identification of a large proportion of semi-digested fish, crustaceans and mollusks found in the guts of three Hawkfish and two Squirrelfish species. While most prey remains lacked diagnostic morphological characters, 94% of the prey found in 67 fishes had >98% sequence similarity with BIOCODE reference sequences. Using this species-level prey identification, we demonstrate how DNA barcoding can provide insights into resource partitioning, predator feeding behaviors and the consequences of predation on ecosystem function.
BOKP: A DNA Barcode Reference Library for Monitoring Herbal Drugs in the Korean Pharmacopeia
Liu, Jinxin; Shi, Linchun; Song, Jingyuan; Sun, Wei; Han, Jianping; Liu, Xia; Hou, Dianyun; Yao, Hui; Li, Mingyue; Chen, Shilin
2017-01-01
Herbal drug authentication is an important task in traditional medicine; however, it is challenged by the limitations of traditional authentication methods and the lack of trained experts. DNA barcoding is conspicuous in almost all areas of the biological sciences and has already been added to the British pharmacopeia and Chinese pharmacopeia for routine herbal drug authentication. However, DNA barcoding for the Korean pharmacopeia still requires significant improvements. Here, we present a DNA barcode reference library for herbal drugs in the Korean pharmacopeia and develop a species identification engine named KP-IDE to facilitate the adoption of this DNA reference library for herbal drug authentication. Using taxonomy records, specimen records, sequence records, and reference records, KP-IDE can identify an unknown specimen. Currently, there are 6,777 taxonomy records, 1,054 specimen records, 30,744 sequence records (ITS2 and psbA-trnH) and 285 reference records. Moreover, 27 herbal drug materials were collected from the Seoul Yangnyeongsi herbal medicine market as an example of authenticating real herbal drugs. Our study demonstrates the prospects of the DNA barcode reference library for the Korean pharmacopeia and provides future directions for the use of DNA barcoding for authenticating herbal drugs listed in other modern pharmacopeias. PMID:29326593
Ekrem, Torbjørn; Stur, Elisabeth
2017-01-01
Chironomidae (Diptera) pupal exuviae samples are commonly used for biological monitoring of aquatic habitats. DNA barcoding has proved useful for species identification of chironomid life stages containing cellular tissue, but the barcoding success of chironomid pupal exuviae is unknown. We assessed whether standard DNA barcoding could be efficiently used for species identification of chironomid pupal exuviae when compared with morphological techniques and if there were differences in performance between temperate and tropical ecosystems, subfamilies, and tribes. PCR, sequence, and identification success differed significantly between geographic regions and taxonomic groups. For Norway, 27 out of 190 (14.2%) pupal exuviae resulted in high-quality chironomid sequences that matched species. For Costa Rica, 69 out of 190 (36.3%) pupal exuviae resulted in high-quality sequences, but none matched known species. Standard DNA barcoding of chironomid pupal exuviae had limited success in species identification of unknown specimens due to contamination and a lack of matching references in available barcode libraries, especially from Costa Rica. Therefore, we recommend that future biodiversity studies focusing their efforts on understudied regions simultaneously use morphological and molecular identification techniques to identify all life stages of chironomids and populate the barcode reference library with identified sequences.
A DNA Barcode Library for North American Ephemeroptera: Progress and Prospects
Webb, Jeffrey M.; Jacobus, Luke M.; Funk, David H.; Zhou, Xin; Kondratieff, Boris; Geraci, Christy J.; DeWalt, R. Edward; Baird, Donald J.; Richard, Barton; Phillips, Iain; Hebert, Paul D. N.
2012-01-01
DNA barcoding of aquatic macroinvertebrates holds much promise as a tool for taxonomic research and for providing the reliable identifications needed for water quality assessment programs. A prerequisite for identification using barcodes is a reliable reference library. We gathered 4165 sequences from the barcode region of the mitochondrial cytochrome c oxidase subunit I gene representing 264 nominal and 90 provisional species of mayflies (Insecta: Ephemeroptera) from Canada, Mexico, and the United States. No species shared barcode sequences and all can be identified with barcodes with the possible exception of some Caenis. Minimum interspecific distances ranged from 0.3–24.7% (mean: 12.5%), while the average intraspecific divergence was 1.97%. The latter value was inflated by the presence of very high divergences in some taxa. In fact, nearly 20% of the species included two or three haplotype clusters showing greater than 5.0% sequence divergence and some values are as high as 26.7%. Many of the species with high divergences are polyphyletic and likely represent species complexes. Indeed, many of these polyphyletic species have numerous synonyms and individuals in some barcode clusters show morphological attributes characteristic of the synonymized species. In light of our findings, it is imperative that type or topotype specimens be sequenced to correctly associate barcode clusters with morphological species concepts and to determine the status of currently synonymized species. PMID:22666447
Herbold, Craig W.; Pelikan, Claus; Kuzyk, Orest; Hausmann, Bela; Angel, Roey; Berry, David; Loy, Alexander
2015-01-01
High throughput sequencing of phylogenetic and functional gene amplicons provides tremendous insight into the structure and functional potential of complex microbial communities. Here, we introduce a highly adaptable and economical PCR approach to barcoding and pooling libraries of numerous target genes. In this approach, we replace gene- and sequencing platform-specific fusion primers with general, interchangeable barcoding primers, enabling nearly limitless customized barcode-primer combinations. Compared to barcoding with long fusion primers, our multiple-target gene approach is more economical because it requires fewer primers overall and is based on short primers with generally lower synthesis and purification costs. To highlight our approach, we pooled over 900 different small-subunit rRNA and functional gene amplicon libraries obtained from various environmental or host-associated microbial community samples into a single, paired-end Illumina MiSeq run. Although the amplicon regions ranged in size from approximately 290 to 720 bp, we found no significant systematic sequencing bias related to amplicon length or gene target. Our results indicate that this flexible multiplexing approach produces large, diverse, and high quality sets of amplicon sequence data for modern studies in microbial ecology. PMID:26236305
2009-01-01
Background This study reports progress in assembling a DNA barcode reference library for Ephemeroptera, Plecoptera, and Trichoptera ("EPTs") from a Canadian subarctic site, which is the focus of a comprehensive biodiversity inventory using DNA barcoding. These three groups of aquatic insects exhibit a moderate level of species diversity, making them ideal for testing the feasibility of DNA barcoding for routine biotic surveys. We explore the correlation between the morphological species delineations, DNA barcode-based haplotype clusters delimited by a sequence threshold (2%), and a threshold-free approach to biodiversity quantification--phylogenetic diversity. Results A DNA barcode reference library is built for 112 EPT species for the focal region, consisting of 2277 COI sequences. Close correspondence was found between EPT morphospecies and haplotype clusters as designated using a standard threshold value. Similarly, the shapes of taxon accumulation curves based upon haplotype clusters were very similar to those generated using phylogenetic diversity accumulation curves, but were much more computationally efficient. Conclusion The results of this study will facilitate other lines of research on northern EPTs and also bode well for rapidly conducting initial biodiversity assessments in unknown EPT faunas. PMID:20003245
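The haplotype clusters "delimited by a sequence threshold (2%)" mentioned above can be illustrated with a small sketch. The abstract does not say which distance measure or clustering rule was used, so the uncorrected p-distance, the single-linkage grouping, and the function names `p_distance` and `threshold_clusters` below are all assumptions made for illustration, not the authors' method.

```python
def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_clusters(seqs, threshold=0.02):
    """Group sequences by single linkage: any pair within the threshold is merged.

    `seqs` maps a specimen name to its aligned sequence (hypothetical input layout).
    """
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(clusters.values())


# Toy data: s1 and s2 differ at 1% of sites, s3 is far from both.
toy = {"s1": "A" * 100, "s2": "A" * 99 + "C", "s3": "C" * 50 + "A" * 50}
print(threshold_clusters(toy))
```

Single linkage is only one possible choice; it can chain distant sequences together through intermediates, which is one reason threshold-based counts may differ from morphospecies counts.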
Building a DNA barcode library of Alaska's non-marine arthropods.
Sikes, Derek S; Bowser, Matthew; Morton, John M; Bickford, Casey; Meierotto, Sarah; Hildebrandt, Kyndall
2017-03-01
Climate change may result in ecological futures with novel species assemblages, trophic mismatch, and mass extinction. Alaska has a limited taxonomic workforce to address these changes. We are building a DNA barcode library to facilitate a metabarcoding approach to monitoring non-marine arthropods. Working with the Canadian Centre for DNA Barcoding, we obtained DNA barcodes from recently collected and authoritatively identified specimens in the University of Alaska Museum (UAM) Insect Collection and the Kenai National Wildlife Refuge collection. We submitted tissues from 4776 specimens, of which 81% yielded DNA barcodes representing 1662 species and 1788 Barcode Index Numbers (BINs), of primarily terrestrial, large-bodied arthropods. This represents 84% of the species available for DNA barcoding in the UAM Insect Collection. There are now 4020 Alaskan arthropod species represented by DNA barcodes, after including all records in Barcode of Life Data Systems (BOLD) of species that occur in Alaska - i.e., 48.5% of the 8277 Alaskan, non-marine-arthropod, named species have associated DNA barcodes. An assessment of the identification power of the library in its current state yielded fewer species-level identifications than expected, but the results were not discouraging. We believe we are the first to deliberately begin development of a DNA barcode library of the entire arthropod fauna for a North American state or province. Although far from complete, this library will become increasingly valuable as more species are added and costs to obtain DNA sequences fall.
Scaling up the 454 Titanium Library Construction and Pooling of Barcoded Libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phung, Wilson; Hack, Christopher; Shapiro, Harris
2009-03-23
We have been developing a high throughput 454 library construction process at the Joint Genome Institute to meet the needs of de novo sequencing a large number of microbial and eukaryote genomes, EST, and metagenome projects. We have been focusing efforts in three areas: (1) modifying the current process to allow the construction of 454 standard libraries on a 96-well format; (2) developing a robotic platform to perform the 454 library construction; and (3) designing molecular barcodes to allow pooling and sorting of many different samples. In the development of a high throughput process to scale up the number of libraries by adapting the process to a 96-well plate format, the key process change involves the replacement of gel electrophoresis for size selection with Solid Phase Reversible Immobilization (SPRI) beads. Although the standard deviation of the insert sizes increases, the overall quality sequence and distribution of the reads in the genome has not changed. The manual process of constructing 454 shotgun libraries on 96-well plates is a time-consuming, labor-intensive, and ergonomically hazardous process; we have been experimenting to program a BioMek robot to perform the library construction. This will not only enable library construction to be completed in a single day, but will also minimize any ergonomic risk. In addition, we have implemented a set of molecular barcodes (AKA Multiple Identifiers or MID) and a pooling process that allows us to sequence many targets simultaneously. Here we will present the testing of pooling a set of selected fosmids derived from the endomycorrhizal fungus Glomus intraradices. By combining the robotic library construction process and the use of molecular barcodes, it is now possible to sequence hundreds of fosmids that represent a minimal tiling path of this genome. Here we present the progress and the challenges of developing these scaled-up processes.
Hendrich, Lars; Morinière, Jérôme; Haszprunar, Gerhard; Hebert, Paul D N; Hausmann, Axel; Köhler, Frank; Balke, Michael
2015-07-01
Beetles are the most diverse group of animals and are crucial for ecosystem functioning. In many countries, they are well established for environmental impact assessment, but even in the well-studied Central European fauna, species identification can be very difficult. A comprehensive and taxonomically well-curated DNA barcode library could remedy this deficit and could also link hundreds of years of traditional knowledge with next generation sequencing technology. However, such a beetle library is missing to date. This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well-identified species (53% of the German fauna) with representatives from 97 of 103 families (94%). This study is the first comprehensive regional test of the efficiency of DNA barcoding for beetles with a focus on Germany. Sequences ≥500 bp were recovered from 63% of the specimens analysed (15 948 of 25 294) with short sequences from another 997 specimens. Whereas most specimens (92.2%) could be unambiguously assigned to a single known species by sequence diversity at CO1, 1089 specimens (6.8%) were assigned to more than one Barcode Index Number (BIN), creating 395 BINs which need further study to ascertain if they represent cryptic species, mitochondrial introgression, or simply regional variation in widespread species. We found 409 specimens (2.6%) that shared a BIN assignment with another species, most involving a pair of closely allied species as 43 BINs were involved. Most of these taxa were separated by barcodes although sequence divergences were low. Only 155 specimens (0.97%) show identical or overlapping clusters. © 2014 John Wiley & Sons Ltd.
Chambers, E Anne; Hebert, Paul D N
2016-01-01
High rates of species discovery and loss have led to the urgent need for more rapid assessment of species diversity in the herpetofauna. DNA barcoding allows for the preliminary identification of species based on sequence divergence. Prior DNA barcoding work on reptiles and amphibians has revealed higher biodiversity counts than previously estimated due to cases of cryptic and undiscovered species. Past studies have provided DNA barcodes for just 14% of the North American herpetofauna, revealing the need for expanded coverage. This study extends the DNA barcode reference library for North American herpetofauna, assesses the utility of this approach in aiding species delimitation, and examines the correspondence between current species boundaries and sequence clusters designated by the BIN system. Sequences were obtained from 730 specimens, representing 274 species (43%) from the North American herpetofauna. Mean intraspecific divergences were 1% and 3%, while average congeneric sequence divergences were 16% and 14% in amphibians and reptiles, respectively. BIN assignments corresponded with current species boundaries in 79% of amphibians, 100% of turtles, and 60% of squamates. Deep divergences (>2%) were noted in 35% of squamate and 16% of amphibian species, and low divergences (<2%) occurred in 12% of reptiles and 23% of amphibians, patterns reflected in BIN assignments. Sequence recovery declined with specimen age, and variation in recovery success was noted among collections. Within collections, barcodes effectively flagged seven mislabeled tissues, and barcode fragments were recovered from five formalin-fixed specimens. This study demonstrates that DNA barcodes can effectively flag errors in museum collections, while BIN splits and merges reveal taxa belonging to deeply diverged or hybridizing lineages. This study is the first effort to compile a reference library of DNA barcodes for herpetofauna on a continental scale.
Chambers, E. Anne; Hebert, Paul D. N.
2016-01-01
Background High rates of species discovery and loss have led to the urgent need for more rapid assessment of species diversity in the herpetofauna. DNA barcoding allows for the preliminary identification of species based on sequence divergence. Prior DNA barcoding work on reptiles and amphibians has revealed higher biodiversity counts than previously estimated due to cases of cryptic and undiscovered species. Past studies have provided DNA barcodes for just 14% of the North American herpetofauna, revealing the need for expanded coverage. Methodology/Principal Findings This study extends the DNA barcode reference library for North American herpetofauna, assesses the utility of this approach in aiding species delimitation, and examines the correspondence between current species boundaries and sequence clusters designated by the BIN system. Sequences were obtained from 730 specimens, representing 274 species (43%) from the North American herpetofauna. Mean intraspecific divergences were 1% and 3%, while average congeneric sequence divergences were 16% and 14% in amphibians and reptiles, respectively. BIN assignments corresponded with current species boundaries in 79% of amphibians, 100% of turtles, and 60% of squamates. Deep divergences (>2%) were noted in 35% of squamate and 16% of amphibian species, and low divergences (<2%) occurred in 12% of reptiles and 23% of amphibians, patterns reflected in BIN assignments. Sequence recovery declined with specimen age, and variation in recovery success was noted among collections. Within collections, barcodes effectively flagged seven mislabeled tissues, and barcode fragments were recovered from five formalin-fixed specimens. Conclusions/Significance This study demonstrates that DNA barcodes can effectively flag errors in museum collections, while BIN splits and merges reveal taxa belonging to deeply diverged or hybridizing lineages. 
This study is the first effort to compile a reference library of DNA barcodes for herpetofauna on a continental scale. PMID:27116180
A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.
Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John
2013-01-01
DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis
2012-01-01
Background Multiplexing has become a major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated, and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. Conclusions By employing the PBS approach in NGS, large-scale multiplexed pooled samples can be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing throughput demand. PMID:22276739
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.
Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong
2012-01-25
Multiplexing has become a major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated, and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. By employing the PBS approach in NGS, large-scale multiplexed pooled samples can be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing throughput demand.
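The combinatorial idea in this abstract, 4 forward barcodes paired with 8 reverse barcodes to address 32 libraries, can be sketched in a few lines. The barcode labels (`F1`..`F4`, `R1`..`R8`) and the function names are placeholders invented for this illustration; the abstract does not give the actual barcode sequences or the assignment procedure.

```python
from itertools import product


def build_pair_table(forward, reverse):
    """Map every (forward, reverse) barcode pair to a library index."""
    return {pair: i for i, pair in enumerate(product(forward, reverse))}


def assign_read(fwd_tag, rev_tag, table):
    """Return the library index for a read, or None if either tag is unrecognized."""
    return table.get((fwd_tag, rev_tag))


# Placeholder labels standing in for real barcode sequences.
forward_barcodes = [f"F{i}" for i in range(1, 5)]   # 4 forward barcodes
reverse_barcodes = [f"R{i}" for i in range(1, 9)]   # 8 reverse barcodes

table = build_pair_table(forward_barcodes, reverse_barcodes)
print(len(table))                        # number of addressable libraries
print(assign_read("F2", "R3", table))    # library index for one pair
print(assign_read("F2", "R9", table))    # unknown reverse tag
```

The point of the design is that 4 + 8 = 12 barcode oligos address 4 × 8 = 32 libraries; a read counts toward a library only when both of its tags are recognized, which matches the abstract's figure of about 64% of reads being assigned to both barcodes.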
Raupach, Michael J.; Hannig, Karsten; Morinière, Jérome; Hendrich, Lars
2016-01-01
As a molecular identification method, DNA barcoding based on partial cytochrome c oxidase subunit 1 (COI) sequences has proven to be a useful tool for species determination in many insect taxa, including ground beetles. In this study we tested the effectiveness of DNA barcodes to discriminate species of the ground beetle genus Bembidion and some closely related taxa of Germany. DNA barcodes were obtained from 819 individuals and 78 species, including sequences from previous studies as well as more than 300 newly generated DNA barcodes. We found a 1:1 correspondence between BIN and traditionally recognized species for 69 species (89%). Low interspecific distances with maximum pairwise K2P values below 2.2% were found for three species pairs, including two species pairs with haplotype sharing (Bembidion atrocaeruleum/Bembidion varicolor and Bembidion guttula/Bembidion mannerheimii). In contrast, deep intraspecific sequence divergences with distinct lineages were revealed for two species (Bembidion geniculatum/Ocys harpaloides). Our study emphasizes the use of DNA barcodes for the identification of the analyzed ground beetle species and represents an important step in building up a comprehensive barcode library for the Carabidae in Germany and Central Europe as well. PMID:27408547
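The "pairwise K2P values" quoted above refer to the Kimura two-parameter distance, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences. A minimal sketch of that formula follows; the function name `k2p_distance` and the choice to skip gaps and ambiguous bases are assumptions for this example, and the abstract does not say which software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in 100 sites gives a distance slightly above 1%.
print(k2p_distance("A" * 100, "G" + "A" * 99))
```

For the small divergences typical of barcode comparisons, the K2P value stays close to the raw proportion of differing sites; the correction grows as sequences diverge.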
The campaign to DNA barcode all fishes, FISH-BOL.
Ward, R D; Hanner, R; Hebert, P D N
2009-02-01
FISH-BOL, the Fish Barcode of Life campaign, is an international research collaboration that is assembling a standardized reference DNA sequence library for all fishes. Analysis is targeting a 648 base pair region of the mitochondrial cytochrome c oxidase I (COI) gene. More than 5000 species have already been DNA barcoded, with an average of five specimens per species, typically vouchers with authoritative identifications. The barcode sequence from any fish, fillet, fin, egg or larva can be matched against these reference sequences using BOLD; the Barcode of Life Data System (http://www.barcodinglife.org). The benefits of barcoding fishes include facilitating species identification, highlighting cases of range expansion for known species, flagging previously overlooked species and enabling identifications where traditional methods cannot be applied. Results thus far indicate that barcodes separate c. 98 and 93% of already described marine and freshwater fish species, respectively. Several specimens with divergent barcode sequences have been confirmed by integrative taxonomic analysis as new species. Past concerns in relation to the use of fish barcoding for species discrimination are discussed. These include hybridization, recent radiations, regional differentiation in barcode sequences and nuclear copies of the barcode region. However, current results indicate these issues are of little concern for the great majority of specimens.
Googling DNA sequences on the World Wide Web.
Hajibabaei, Mehrdad; Singer, Gregory A C
2009-11-10
New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm based on dividing a sequence library (DNA barcodes) and a query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
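The abstract describes splitting both the barcode library and the query into words so that an ordinary text search engine can do the matching. A rough sketch of that general idea, using an in-memory inverted index in place of a search engine, is shown below. The word length, the overlapping-window split, the shared-word scoring, and the names `split_into_words`, `build_index` and `search` are all assumptions for illustration; the abstract does not specify these details.

```python
from collections import Counter, defaultdict


def split_into_words(seq, k=8):
    """Split a sequence into overlapping words of length k."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(library, k=8):
    """Inverted index: word -> set of library entries containing that word."""
    index = defaultdict(set)
    for name, seq in library.items():
        for word in split_into_words(seq, k):
            index[word].add(name)
    return index


def search(query, index, k=8):
    """Rank library entries by the number of distinct query words they contain."""
    hits = Counter()
    for word in set(split_into_words(query, k)):
        for name in index.get(word, ()):
            hits[name] += 1
    return hits.most_common()


# Toy library with made-up sequences and names.
library = {"sp_A": "ACGTACGTTGCAAGCT", "sp_B": "TTTTGGGGCCCCAAAA"}
index = build_index(library, k=4)
print(search("ACGTACGTTGCA", index, k=4))
```

No alignment is computed at any point: the query is matched purely by exact word lookups, which is what lets a general-purpose text indexer stand in for a sequence search tool.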
Liu, Shanlin; Yang, Chentao; Zhou, Chengran; Zhou, Xin
2017-12-01
Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)-based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn't show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single molecular sequencing platform Pacbio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. © The Authors 2017. Published by Oxford University Press.
R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring.
Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès
2016-01-01
Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which is time consuming and prone to identification uncertainties. The use of DNA-barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are identified as certain diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematic supported by INRA (French National Institute for Agricultural Research), see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA-barcodes to their taxonomical identifications, and is dedicated to identify barcodes from natural samples. The data come from two sources, a culture collection of freshwater algae maintained in INRA in which new strains are regularly deposited and barcoded and from the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database: 18S (18S ribosomal RNA) and rbcL (Ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (Blast, classical phylogenies) and up-to-date taxonomy (Catalogues and peer reviewed papers). Every 6 months R-Syst::diatom is updated. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). 
We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/. © The Author(s) 2016. Published by Oxford University Press.
R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring
Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès
2016-01-01
Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which is time consuming and prone to identification uncertainties. The use of DNA-barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are identified as certain diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematic supported by INRA (French National Institute for Agricultural Research), see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA-barcodes to their taxonomical identifications, and is dedicated to identify barcodes from natural samples. The data come from two sources, a culture collection of freshwater algae maintained in INRA in which new strains are regularly deposited and barcoded and from the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database: 18S (18S ribosomal RNA) and rbcL (Ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (Blast, classical phylogenies) and up-to-date taxonomy (Catalogues and peer reviewed papers). Every 6 months R-Syst::diatom is updated. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). 
We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/ PMID:26989149
Vassou, Sophie Lorraine; Nithaniyal, Stalin; Raju, Balaji; Parani, Madasamy
2016-07-18
Ayurveda is a system of traditional medicine that originated in ancient India, and it is still in practice. Medicinal plants are the backbone of Ayurveda, which heavily relies on the plant-derived therapeutics. While Ayurveda is becoming more popular in several countries throughout the World, lack of authenticated medicinal plant raw drugs is a growing concern. Our aim was to DNA barcode the medicinal plants that are listed in the Ayurvedic Pharmacopoeia of India (API) to create a reference DNA barcode library, and to use the same to authenticate the raw drugs that are sold in markets. We have DNA barcoded 347 medicinal plants using rbcL marker, and curated rbcL DNA barcodes for 27 medicinal plants from public databases. These sequences were used to create Ayurvedic Pharmacopoeia of India - Reference DNA Barcode Library (API-RDBL). This library was used to authenticate 100 medicinal plant raw drugs, which were in the form of powders (82) and seeds (18). Ayurvedic Pharmacopoeia of India - Reference DNA Barcode Library (API-RDBL) was created with high quality and authentic rbcL barcodes for 374 out of the 395 medicinal plants that are included in the API. The rbcL DNA barcode differentiated 319 species (85 %) with the pairwise divergence ranging between 0.2 and 29.9 %. PCR amplification and DNA sequencing success rate of rbcL marker was 100 % even for the poorly preserved medicinal plant raw drugs that were collected from local markets. DNA barcoding revealed that only 79 % raw drugs were authentic, and the remaining 21 % samples were adulterated. Further, adulteration was found to be much higher with powders (ca. 25 %) when compared to seeds (ca. 5 %). The present study demonstrated the utility of DNA barcoding in authenticating medicinal plant raw drugs, and found that approximately one fifth of the market samples were adulterated. 
Powdered raw drugs, which are very difficult to be identified by taxonomists as well as common people, seem to be the easy target for adulteration. Developing a quality control protocol for medicinal plant raw drugs by incorporating DNA barcoding as a component is essential to ensure safety to the consumers.
Knebelsberger, Thomas; Landi, Monica; Neumann, Hermann; Kloppmann, Matthias; Sell, Anne F; Campbell, Patrick D; Laakmann, Silke; Raupach, Michael J; Carvalho, Gary R; Costa, Filipe O
2014-09-01
Valid fish species identification is an essential step both for fundamental science and fisheries management. The traditional identification is mainly based on external morphological diagnostic characters, leading to inconsistent results in many cases. Here, we provide a sequence reference library based on mitochondrial cytochrome c oxidase subunit I (COI) for a valid identification of 93 North Atlantic fish species originating from the North Sea and adjacent waters, including many commercially exploited species. Neighbour-joining analysis based on K2P genetic distances formed nonoverlapping clusters for all species, each with ≥99% bootstrap support. Identification was successful for 100% of the species as the minimum genetic distance to the nearest neighbour always exceeded the maximum intraspecific distance. A barcoding gap was apparent for the whole data set. Within-species distances ranged from 0 to 2.35%, while interspecific distances varied between 3.15 and 28.09%. Distances between congeners were on average 51-fold higher than those within species. The validation of the sequence library by applying BOLD's barcode index number (BIN) analysis tool and a ranking system demonstrated high taxonomic reliability of the DNA barcodes for 85% of the investigated fish species. Thus, the sequence library presented here can be confidently used as a benchmark for identification of at least two-thirds of the typical fish species recorded for the North Sea. © 2014 John Wiley & Sons Ltd.
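The K2P (Kimura two-parameter) distance mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name k2p_distance and the choice to skip gaps and ambiguous bases are assumptions made for illustration. The sketch applies the standard formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of sites showing transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Hypothetical helper for illustration; gaps and ambiguous bases are
    skipped (an assumption, other tools may treat them differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two sequences that differ by a single transition in ten sites, the function returns about 0.112, slightly above the raw 10% difference, because the model corrects for unobserved multiple substitutions.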
DNA Barcoding the Geometrid Fauna of Bavaria (Lepidoptera): Successes, Surprises, and Questions
Hausmann, Axel; Haszprunar, Gerhard; Hebert, Paul D. N.
2011-01-01
Background The State of Bavaria is involved in a research program that will lead to the construction of a DNA barcode library for all animal species within its territorial boundaries. The present study provides a comprehensive DNA barcode library for the Geometridae, one of the most diverse of insect families. Methodology/Principal Findings This study reports DNA barcodes for 400 Bavarian geometrid species, 98 per cent of the known fauna, and approximately one per cent of all Bavarian animal species. Although 98.5% of these species possess diagnostic barcode sequences in Bavaria, records from neighbouring countries suggest that species-level resolution may be compromised in up to 3.5% of cases. All taxa which apparently share barcodes are discussed in detail. One case of modest divergence (1.4%) revealed a species overlooked by the current taxonomic system: Eupithecia goossensiata Mabille, 1869 stat.n. is raised from synonymy with Eupithecia absinthiata (Clerck, 1759) to species rank. Deep intraspecific sequence divergences (>2%) were detected in 20 traditionally recognized species. Conclusions/Significance The study emphasizes the effectiveness of DNA barcoding as a tool for monitoring biodiversity. Open access is provided to a data set that includes records for 1,395 geometrid specimens (331 species) from Bavaria, with 69 additional species from neighbouring regions. Taxa with deep intraspecific sequence divergences are undergoing more detailed analysis to ascertain if they represent cases of cryptic diversity. PMID:21423340
DNA barcoding reveal patterns of species diversity among northwestern Pacific molluscs
Sun, Shao’e; Li, Qi; Kong, Lingfeng; Yu, Hong; Zheng, Xiaodong; Yu, Ruihai; Dai, Lina; Sun, Yan; Chen, Jun; Liu, Jun; Ni, Lehai; Feng, Yanwei; Yu, Zhenzhen; Zou, Shanmei; Lin, Jiping
2016-01-01
This study represents the first comprehensive molecular assessment of northwestern Pacific molluscs. In total, 2801 DNA barcodes belonging to 569 species from China, Japan and Korea were analyzed. An overlap between intra- and interspecific genetic distances was present in 71 species. We tested the efficacy of this library by simulating a sequence-based specimen identification scenario using Best Match (BM), Best Close Match (BCM) and All Species Barcode (ASB) criteria with three threshold values. The BM approach returned 89.15% true identifications (95.27% when excluding singletons). The highest success rate of congruent identifications was obtained with BCM at the 0.053 threshold. The analysis of our barcode library together with public data resulted in 582 Barcode Index Numbers (BINs), 72.2% of which were found to be concordant with morphology-based identifications. The discrepancies were divided into two groups: sequences from different species clustered in a single BIN, and conspecific sequences split across more than one BIN. In the Neighbour-Joining phenogram, 2,320 (83.0%) queries formed 355 (62.4%) species-specific barcode clusters, allowing their successful identification. 33 species showed paraphyly and haplotype sharing. 62 cases are represented by deeply diverged lineages. This study suggests an increased species diversity in this region, highlighting the need for taxonomic revision and a conservation strategy for the cryptic complexes. PMID:27640675
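The Best Match and Best Close Match criteria named in this abstract can be sketched as follows. This is an illustrative reading of the two criteria, not the study's implementation: the function names, the use of an uncorrected p-distance, and the rule that a tie between species counts as ambiguous are all assumptions, and the threshold passed in is a free parameter rather than the study's value.

```python
from typing import List, Optional, Tuple

Reference = List[Tuple[str, str]]  # (species name, aligned barcode sequence)


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(query: str, reference: Reference) -> Tuple[Optional[str], float]:
    """Return the species of the closest reference barcode and its distance.

    The species is None when equally close barcodes belong to different
    species (treated here as an ambiguous identification).
    """
    scored = [(p_distance(query, seq), species) for species, seq in reference]
    best = min(d for d, _ in scored)
    species_at_best = {sp for d, sp in scored if d == best}
    label = species_at_best.pop() if len(species_at_best) == 1 else None
    return label, best


def best_close_match(query: str, reference: Reference,
                     threshold: float) -> Optional[str]:
    """Best Match, but only accepted if the closest barcode is within threshold."""
    label, best = best_match(query, reference)
    return label if best <= threshold else None


# Toy reference library with made-up sequences
library = [("sp_A", "ACGTACGTAC"), ("sp_B", "ACGTTTTTAC")]
print(best_match("ACGTACGTAA", library))
print(best_close_match("ACGTACGTAA", library, threshold=0.05))
```

The difference between the two criteria shows up for queries whose species is absent from the library: Best Match always returns the nearest species, while Best Close Match declines to identify when the nearest barcode is farther than the threshold.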
The Hemiptera (Insecta) of Canada: Constructing a Reference Library of DNA Barcodes
Gwiazdowski, Rodger A.; Foottit, Robert G.; Maw, H. Eric L.; Hebert, Paul D. N.
2015-01-01
DNA barcode reference libraries linked to voucher specimens create new opportunities for high-throughput identification and taxonomic re-evaluations. This study provides a DNA barcode library for about 45% of the recognized species of Canadian Hemiptera, and the publicly available R workflow used for its generation. The current library is based on the analysis of 20,851 specimens including 1849 species belonging to 628 genera and 64 families. These individuals were assigned to 1867 Barcode Index Numbers (BINs), sequence clusters that often coincide with species recognized through prior taxonomy. Museum collections were a key source for identified specimens, but we also employed high-throughput collection methods that generated large numbers of unidentified specimens. Many of these specimens represented novel BINs that were subsequently identified by taxonomists, adding barcode coverage for additional species. Our analyses based on both approaches include 94 species not listed in the most recent Canadian checklist, representing a potential 3% increase in the fauna. We discuss the development of our workflow in the context of prior DNA barcode library construction projects, emphasizing the importance of delineating a set of reference specimens to aid investigations in cases of nomenclatural and DNA barcode discordance. The identification for each specimen in the reference set can be annotated on the Barcode of Life Data System (BOLD), allowing experts to highlight questionable identifications; annotations can be added by any registered user of BOLD, and instructions for this are provided. PMID:25923328
Development of High Throughput Process for Constructing 454 Titanium and Illumina Libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deshpande, Shweta; Hack, Christopher; Tang, Eric
2010-05-28
We have developed two processes with the Biomek FX robot to construct 454 Titanium and Illumina libraries in order to meet increasing library demands. All modifications in the library construction steps were made to adapt the entire processes to the 96-well plate format. The key modifications include shearing of DNA with the Covaris E210, and enzymatic reaction cleanup and fragment size selection with SPRI beads and magnetic plate holders. The construction of 96 Titanium libraries takes about 8 hours from sheared DNA to ssDNA recovery. The processing of 96 Illumina libraries takes less time than that of the Titanium library process. Although both processes still require manual transfer of plates from the robot to other work stations such as thermocyclers, these robotic processes represent about a 12- to 24-fold increase in library capacity compared to the manual processes. To enable the sequencing of many libraries in parallel, we have also developed sets of molecular barcodes for both library types. The requirements for the 454 library barcodes include 10 bases, 40-60 percent GC, no consecutive identical bases, and no fewer than 3 bases of difference between barcodes. We have used 96 of the resulting 270 barcodes to construct and pool libraries to test the ability to accurately assign reads to the right samples. When allowing 1 base error in the 10-base barcodes, we could assign 99.6 percent of the total reads, and 100 percent of them were uniquely assigned. As for the Illumina barcodes, the requirements include 4 bases, balanced GC, and at least 2 bases of difference between barcodes. We have begun to assess the ability to assign reads after pooling different numbers of libraries. We will discuss the progress and the challenges of these scale-up processes.
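The 454 barcode requirements listed in this abstract (10 bases, 40-60 percent GC, no consecutive identical bases, at least 3 differing bases between barcodes) and the one-error read assignment can be sketched in a few lines. This is a hypothetical illustration, not the authors' procedure: the function names and the greedy selection order are assumptions, and the example sequences are made up.

```python
from typing import Iterable, List, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_valid(barcode: str, length: int = 10,
             gc_min: float = 0.4, gc_max: float = 0.6) -> bool:
    """Check length, GC fraction, and absence of consecutive identical bases."""
    if len(barcode) != length:
        return False
    gc = sum(base in "GC" for base in barcode) / length
    if not gc_min <= gc <= gc_max:
        return False
    return all(x != y for x, y in zip(barcode, barcode[1:]))


def select_barcodes(candidates: Iterable[str],
                    min_distance: int = 3) -> List[str]:
    """Greedily keep valid candidates that differ from every kept barcode
    in at least min_distance positions (greedy order is an assumption)."""
    chosen: List[str] = []
    for cand in candidates:
        if is_valid(cand) and all(hamming(cand, kept) >= min_distance
                                  for kept in chosen):
            chosen.append(cand)
    return chosen


def assign_read(read_prefix: str, barcodes: List[str],
                max_errors: int = 1) -> Optional[str]:
    """Return the single barcode within max_errors of the read prefix,
    or None when no barcode or more than one barcode qualifies."""
    hits = [b for b in barcodes
            if len(b) == len(read_prefix) and hamming(read_prefix, b) <= max_errors]
    return hits[0] if len(hits) == 1 else None


barcodes = select_barcodes(["ACGTACGTAC", "ACGTACGTAG", "TGCATGCATG"])
print(barcodes)
print(assign_read("ACGTACGTAA", barcodes))
```

With a minimum pairwise difference of 3 bases, a read carrying a single substitution remains at distance 1 from its true barcode and at distance 2 or more from every other barcode, which is why a one-error tolerance can still give a unique assignment.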
Multiplex single-molecule interaction profiling of DNA-barcoded proteins.
Gu, Liangcai; Li, Chao; Aach, John; Hill, David E; Vidal, Marc; Church, George M
2014-11-27
In contrast with advances in massively parallel DNA sequencing, high-throughput protein analyses are often limited by ensemble measurements, individual analyte purification and hence compromised quality and cost-effectiveness. Single-molecule protein detection using optical methods is limited by the number of spectrally non-overlapping chromophores. Here we introduce a single-molecular-interaction sequencing (SMI-seq) technology for parallel protein interaction profiling leveraging single-molecule advantages. DNA barcodes are attached to proteins collectively via ribosome display or individually via enzymatic conjugation. Barcoded proteins are assayed en masse in aqueous solution and subsequently immobilized in a polyacrylamide thin film to construct a random single-molecule array, where barcoding DNAs are amplified into in situ polymerase colonies (polonies) and analysed by DNA sequencing. This method allows precise quantification of various proteins with a theoretical maximum array density of over one million polonies per square millimetre. Furthermore, protein interactions can be measured on the basis of the statistics of colocalized polonies arising from barcoding DNAs of interacting proteins. Two demanding applications, G-protein coupled receptor and antibody-binding profiling, are demonstrated. SMI-seq enables 'library versus library' screening in a one-pot assay, simultaneously interrogating molecular binding affinity and specificity.
Zhou, Chengran
2017-01-01
Abstract Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)–based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn’t show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single molecular sequencing platform Pacbio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. PMID:29077841
Plans and progress for building a Great Lakes fauna DNA ...
DNA reference libraries provide researchers with an important tool for assessing regional biodiversity by allowing unknown genetic sequences to be assigned identities, while also providing a means for taxonomists to validate identifications. Expanding the representation of Great Lakes species in such reference libraries is an explicit component of research at EPA’s Mid-Continent Ecology Division. Our DNA reference library building efforts began in 2012 with the goal of providing barcodes for at least 5 specimens of each native and nonindigenous fish and aquatic invertebrate species currently present in the Great Lakes. The approach is to pull taxonomically validated specimen for sequencing from EPA led sampling efforts of adult/juvenile fish, larval fish, benthic macroinvertebrates, and zooplankton; while also soliciting aid from state and federal agencies for tissue from “shopping list” organisms. The barcodes we generate are made available through the publicly accessible BOLD (Barcode of Life) database, and help inform a planned Great Lakes biodiversity inventory. To date, our submissions to BOLD are limited to fishes; of the 88 fish species listed as being present within Lake Superior, roughly half were successfully barcoded, while only 22 species met the desired quota of 5 barcoded specimens per species. As we continue to generate genomic information from our collections and the taxonomic representations become more complete, we will continue to
DNA Barcoding Identifies Argentine Fishes from Marine and Brackish Waters
Mabragaña, Ezequiel; Díaz de Astarloa, Juan Martín; Hanner, Robert; Zhang, Junbin; González Castro, Mariano
2011-01-01
Background DNA barcoding has been advanced as a promising tool to aid species identification and discovery through the use of short, standardized gene targets. Despite extensive taxonomic studies, for a variety of reasons the identification of fishes can be problematic, even for experts. DNA barcoding is proving to be a useful tool in this context. However, its broad application is impeded by the need to construct a comprehensive reference sequence library for all fish species. Here, we make a regional contribution to this grand challenge by calibrating the species discrimination efficiency of barcoding among 125 Argentine fish species, representing nearly one third of the known fauna, and examine the utility of these data to address several key taxonomic uncertainties pertaining to species in this region. Methodology/Principal Findings Specimens were collected and morphologically identified during cruises conducted between 2005 and 2008. The standard BARCODE fragment of COI was amplified and bi-directionally sequenced from 577 specimens (mean of 5 specimens/species), and all specimens and sequence data were archived and interrogated using analytical tools available on the Barcode of Life Data System (BOLD; www.barcodinglife.org). Nearly all species exhibited discrete clusters of closely related haplogroups which permitted the discrimination of 95% of the species (i.e. 119/125) examined, while cases of shared haplotypes were detected among just three species-pairs. Notably, barcoding aided the identification of a new species of skate, Dipturus argentinensis, permitted the recognition of Genypterus brasiliensis as a valid species and questioned the generic assignment of Paralichthys isosceles. Conclusions/Significance This study constitutes a significant contribution to the global barcode reference sequence library for fishes and demonstrates the utility of barcoding for regional species identification.
As an independent assessment of alpha taxonomy, barcodes provide robust support for most morphologically based taxon concepts and also highlight key areas of taxonomic uncertainty worthy of reappraisal. PMID:22174860
Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee
2015-09-21
Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validate JigsawSeq on small sub-pools and observe high precision and recall under various experimental settings. With extensive simulations, we then apply JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.
Arculeo, Marco; Bonello, Juan J.; Bonnici, Leanne; Cannas, Rita; Carbonara, Pierluigi; Cau, Alessandro; Charilaou, Charis; El Ouamari, Najib; Fiorentino, Fabio; Follesa, Maria Cristina; Garofalo, Germana; Golani, Daniel; Guarniero, Ilaria; Hanner, Robert; Hemida, Farid; Kada, Omar; Lo Brutto, Sabrina; Mancusi, Cecilia; Morey, Gabriel; Schembri, Patrick J.; Serena, Fabrizio; Sion, Letizia; Stagioni, Marco; Tursi, Angelo; Vrgoc, Nedo; Steinke, Dirk; Tinti, Fausto
2017-01-01
Cartilaginous fish are particularly vulnerable to anthropogenic stressors and environmental change because of their K-selected reproductive strategy. Accurate data from scientific surveys and landings are essential to assess conservation status and to develop robust protection and management plans. Currently available data are often incomplete or incorrect as a result of inaccurate species identifications, due to a high level of morphological stasis, especially among closely related taxa. Moreover, several diagnostic characters clearly visible in adult specimens are less evident in juveniles. Here we present results generated by the ELASMOMED Consortium, a regional network aiming to sample and DNA-barcode the Mediterranean Chondrichthyans with the ultimate goal of providing a comprehensive DNA barcode reference library. This library will support and improve the molecular taxonomy of this group and the effectiveness of management and conservation measures. We successfully barcoded 882 individuals belonging to 42 species (17 sharks, 24 batoids and one chimaera), including four endemic and several threatened ones. Morphological misidentifications were found across most orders, further confirming the need for a comprehensive DNA barcoding library as a valuable tool for the reliable identification of specimens in support of taxonomists who are reviewing current identification keys. Despite low intraspecific variation among their barcode sequences and reduced sample size, five species showed preliminary evidence of phylogeographic structure. Overall, the ELASMOMED initiative further emphasizes the key role accurate DNA barcoding libraries play in establishing reliable diagnostic species-specific features in otherwise taxonomically problematic groups for biodiversity management and conservation actions. PMID:28107413
Wolbachia and DNA barcoding insects: patterns, potential, and problems.
Smith, M Alex; Bertrand, Claudia; Crosby, Kate; Eveleigh, Eldon S; Fernandez-Triana, Jose; Fisher, Brian L; Gibbs, Jason; Hajibabaei, Mehrdad; Hallwachs, Winnie; Hind, Katharine; Hrcek, Jan; Huang, Da-Wei; Janda, Milan; Janzen, Daniel H; Li, Yanwei; Miller, Scott E; Packer, Laurence; Quicke, Donald; Ratnasingham, Sujeevan; Rodriguez, Josephine; Rougerie, Rodolphe; Shaw, Mark R; Sheffield, Cory; Stahlhut, Julie K; Steinke, Dirk; Whitfield, James; Wood, Monty; Zhou, Xin
2012-01-01
Wolbachia is a genus of bacterial endosymbionts that impacts the breeding systems of their hosts. Wolbachia can confuse the patterns of mitochondrial variation, including DNA barcodes, because it influences the pathways through which mitochondria are inherited. We examined the extent to which these endosymbionts are detected in routine DNA barcoding, assessed their impact upon the insect sequence divergence and identification accuracy, and considered the variation present in Wolbachia COI. Using both standard PCR assays (Wolbachia surface coding protein, wsp) and bacterial COI fragments, we found evidence of Wolbachia in insect total genomic extracts created for DNA barcoding library construction. When >2 million insect COI trace files were examined on the Barcode of Life Data System (BOLD), Wolbachia COI was present in 0.16% of the cases. It is possible to generate Wolbachia COI using standard insect primers; however, that amplicon was never confused with the COI of the host. Wolbachia alleles recovered were predominantly Supergroup A and were broadly distributed geographically and phylogenetically. We conclude that the presence of the Wolbachia DNA in total genomic extracts made from insects is unlikely to compromise the accuracy of the DNA barcode library; in fact, the ability to query this DNA library (the database and the extracts) for endosymbionts is one of the ancillary benefits of such a large scale endeavor, of which we provide several examples. It is our conclusion that regular assays for Wolbachia presence and type can, and should, be adopted by large scale insect barcoding initiatives. While COI is one of the five multi-locus sequence typing (MLST) genes used for categorizing Wolbachia, there is limited overlap with the eukaryotic DNA barcode region.
Brancolini, Florencia; del Pazo, Felipe; Posner, Victoria Maria; Grimberg, Alexis; Arranz, Silvia Eda
2016-01-01
Valid fish species identification is essential for biodiversity conservation and fisheries management. Here, we provide a sequence reference library based on mitochondrial cytochrome c oxidase subunit I for a valid identification of 79 freshwater fish species from the Lower Paraná River. Neighbour-joining analysis based on K2P genetic distances formed non-overlapping clusters for almost all species, each with ≥99% bootstrap support. Identification was successful for 97.8% of species as the minimum genetic distance to the nearest neighbour exceeded the maximum intraspecific distance in all these cases. A barcoding gap of 2.5% was apparent for the whole data set with the exception of four cases. Within-species distances ranged from 0.00% to 7.59%, while interspecific distances varied between 4.06% and 19.98%, without considering Odontesthes species with a minimum genetic distance of 0%. Sequence library validation was performed by applying BOLD's BIN analysis tool, the Poisson Tree Processes model and Automatic Barcode Gap Discovery, along with a reliable taxonomic assignment by experts. Exhaustive revision of vouchers was performed when a conflicting assignment was detected after sequence analysis and BIN discordance evaluation. Thus, the sequence library presented here can be confidently used as a benchmark for identification of half of the fish species recorded for the Lower Paraná River. PMID:27442116
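The identification criterion used in this abstract (the minimum distance to the nearest neighbour of another species must exceed the maximum intraspecific distance) can be sketched per species as below. This is a simplified illustration under stated assumptions: it uses an uncorrected p-distance instead of the K2P distances of the study, the function names are hypothetical, and the sequences are made up.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def gap_statistics(records: List[Tuple[str, str]]) -> Dict[str, Tuple[float, float]]:
    """For each species return (max intraspecific distance,
    min distance to a sequence of any other species)."""
    max_intra: Dict[str, float] = {}
    min_inter: Dict[str, float] = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, float("inf")), d)
    species = {sp for sp, _ in records}
    return {sp: (max_intra.get(sp, 0.0), min_inter.get(sp, float("inf")))
            for sp in species}


def has_barcoding_gap(records: List[Tuple[str, str]]) -> Dict[str, bool]:
    """True for species whose nearest-neighbour distance exceeds
    their maximum intraspecific distance."""
    return {sp: inter > intra
            for sp, (intra, inter) in gap_statistics(records).items()}


records = [
    ("sp_A", "ACGTACGTAC"), ("sp_A", "ACGTACGTAA"),
    ("sp_B", "TGCATGCATG"), ("sp_B", "TGCATGCATG"),
]
print(has_barcoding_gap(records))
```

Species that share an identical haplotype with another species, as reported here for Odontesthes with a minimum genetic distance of 0%, fail the test because their nearest-neighbour distance cannot exceed any intraspecific distance.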
Schäffer, Sylvia; Zachos, Frank E.; Koblmüller, Stephan
2017-01-01
DNA-barcoding is a rapidly developing method for efficiently identifying samples to species level by means of short standard DNA sequences. However, reliable species assignment requires the availability of a comprehensive DNA barcode reference library, and hence numerous initiatives aim at generating such barcode databases for particular taxa or geographic regions. Historical museum collections represent a potentially invaluable source for the DNA-barcoding of many taxa. This is particularly true for birds and mammals, for which collecting fresh (voucher) material is often very difficult to (nearly) impossible due to the special animal welfare and conservation regulations that apply to vertebrates in general, and birds and mammals in particular. Moreover, even great efforts might not guarantee sufficiently complete sampling of fresh material in a short period of time. DNA extracted from historical samples is usually degraded, such that only short fragments can be amplified, rendering the recovery of the barcoding region as a single fragment impossible. Here, we present a new set of primers that allows the efficient amplification and sequencing of the entire barcoding region in most higher taxa of Central European birds and mammals in six overlapping fragments, thus greatly increasing the value of historical museum collections for generating DNA barcode reference libraries. Applying our new primer set in recently established NGS protocols promises to further increase the efficiency of barcoding old bird and mammal specimens. PMID:28358863
Iftikhar, Romana; Ashfaq, Muhammad; Rasool, Akhtar; Hebert, Paul D N
2016-01-01
Although thrips are globally important crop pests and vectors of viral disease, species identifications are difficult because of their small size and inconspicuous morphological differences. Sequence variation in the mitochondrial COI-5' (DNA barcode) region has proven effective for the identification of species in many groups of insect pests. We analyzed barcode sequence variation among 471 thrips from various plant hosts in north-central Pakistan. The Barcode Index Number (BIN) system assigned these sequences to 55 BINs, while the Automatic Barcode Gap Discovery detected 56 partitions, a count that coincided with the number of monophyletic lineages recognized by Neighbor-Joining analysis and Bayesian inference. Congeneric species showed an average of 19% sequence divergence (range = 5.6% - 27%) at COI, while intraspecific distances averaged 0.6% (range = 0.0% - 7.6%). BIN analysis suggested that all intraspecific divergence >3.0% actually involved a species complex. In fact, sequences for three major pest species (Haplothrips reuteri, Thrips palmi, Thrips tabaci), and one predatory thrips (Aeolothrips intermedius) showed deep intraspecific divergences, providing evidence that each is a cryptic species complex. The study compiles the first barcode reference library for the thrips of Pakistan, and examines global haplotype diversity in four important pest thrips.
Morinière, Jérôme; Hendrich, Lars; Balke, Michael; Beermann, Arne J; König, Tobias; Hess, Monika; Koch, Stefan; Müller, Reinhard; Leese, Florian; Hebert, Paul D N; Hausmann, Axel; Schubart, Christoph D; Haszprunar, Gerhard
2017-11-01
Mayflies, stoneflies and caddisflies (Ephemeroptera, Plecoptera and Trichoptera) are prominent representatives of aquatic macroinvertebrates, commonly used as indicator organisms for water quality and ecosystem assessments. However, unambiguous morphological identification of EPT species, especially their immature life stages, is a challenging, yet fundamental task. A comprehensive DNA barcode library based upon taxonomically well-curated specimens is needed to overcome the problematic identification. Once available, this library will support the implementation of fast, cost-efficient and reliable DNA-based identifications and assessments of ecological status. This study represents a major step towards a DNA barcode reference library as it covers two-thirds of Germany's EPT species, including 2,613 individuals belonging to 363 identified species. As such, it provides coverage for 38 of 44 families (86%) and practically all major bioindicator species. DNA barcode compliant sequences (≥500 bp) were recovered from 98.74% of the analysed specimens. Whereas most species (325, i.e., 89.53%) were unambiguously assigned to a single Barcode Index Number (BIN) by their COI sequences, 38 species (18 Ephemeroptera, nine Plecoptera and 11 Trichoptera) were assigned to a total of 89 BINs. Most of these additional BINs formed nearest neighbour clusters, reflecting the discrimination of geographical subclades of a currently recognized species. BIN sharing was uncommon, involving only two species pairs of Ephemeroptera. Interestingly, both maximum pairwise and nearest neighbour distances were substantially higher for Ephemeroptera compared to Plecoptera and Trichoptera, possibly indicating older speciation events, stronger positive selection or faster rate of molecular evolution. © 2017 John Wiley & Sons Ltd.
DNA barcodes for bio-surveillance: regulated and economically important arthropod plant pests.
Ashfaq, Muhammad; Hebert, Paul D N
2016-11-01
Many of the arthropod species that are important pests of agriculture and forestry are impossible to discriminate morphologically throughout all of their life stages. Some cannot be differentiated at any life stage. Over the past decade, DNA barcoding has gained increasing adoption as a tool to both identify known species and to reveal cryptic taxa. Although there has not been a focused effort to develop a barcode library for them, reference sequences are now available for 77% of the 409 species of arthropods documented on major pest databases. Aside from developing the reference library needed to guide specimen identifications, past barcode studies have revealed that a significant fraction of arthropod pests are a complex of allied taxa. Because of their importance as pests and disease vectors impacting global agriculture and forestry, DNA barcode results on these arthropods have significant implications for quarantine detection, regulation, and management. The current review discusses these implications in light of the presence of cryptic species in plant pests exposed by DNA barcoding.
Vargas, Sergio; Kelly, Michelle; Schnabel, Kareen; Mills, Sadie; Bowden, David; Wörheide, Gert
2015-01-01
The approximately 350 demosponge species that have been described from Antarctica represent a faunistic component distinct from that of neighboring regions. Sponges provide structure to the Antarctic benthos and refuge to other invertebrates, and can be dominant in some communities. Despite the importance of sponges in the Antarctic subtidal environment, sponge DNA barcodes are scarce but can provide insight into the evolutionary relationships of this unique biogeographic province. We sequenced the standard barcoding COI region for a comprehensive selection of sponges collected during expeditions to the Ross Sea region in 2004 and 2008, and produced DNA-barcodes for 53 demosponge species covering about 60% of the species collected. The Antarctic sponge communities are phylogenetically diverse, matching the diversity of well-sampled sponge communities in the Lusitanic and Mediterranean marine provinces in the Temperate Northern Atlantic for which molecular data are readily available. Additionally, DNA-barcoding revealed levels of in situ molecular evolution comparable to those present among Caribbean sponges. DNA-barcoding using the Segregating Sites Algorithm correctly assigned approximately 54% of the barcoded species to the morphologically determined species. A barcode library for Antarctic sponges was assembled and used to advance the systematic and evolutionary research of Antarctic sponges. We provide insights on the evolutionary forces shaping Antarctica's diverse sponge communities, and a barcode library against which future sequence data from other regions or depth strata of Antarctica can be compared. The opportunity for rapid taxonomic identification of sponge collections for ecological research is now on the horizon.
DNA Barcodes for Nearctic Auchenorrhyncha (Insecta: Hemiptera)
Foottit, Robert G.; Maw, Eric; Hebert, P. D. N.
2014-01-01
Background Many studies have shown the suitability of sequence variation in the 5′ region of the mitochondrial cytochrome c oxidase I (COI) gene as a DNA barcode for the identification of species in a wide range of animal groups. We examined 471 species in 147 genera of Hemiptera: Auchenorrhyncha drawn from specimens in the Canadian National Collection of Insects to assess the effectiveness of DNA barcoding in this group. Methodology/Principal Findings Analysis of the COI gene revealed less than 2% intra-specific divergence in 93% of the taxa examined, while minimum interspecific distances exceeded 2% in 70% of congeneric species pairs. Although most species are characterized by a distinct sequence cluster, sequences for members of many groups of closely related species either shared sequences or showed close similarity, with 25% of species separated from their nearest neighbor by less than 1%. Conclusions/Significance This study, although preliminary, provides DNA barcodes for about 8% of the species of this hemipteran suborder found in North America north of Mexico. Barcodes can enable the identification of many species of Auchenorrhyncha, but members of some species groups cannot be discriminated. Future use of DNA barcodes in regulatory, pest management, and environmental applications will be possible as the barcode library for Auchenorrhyncha expands to include more species and broader geographic coverage. PMID:25004106
Moser, Lindsey A.; Ramirez-Carvajal, Lisbeth; Puri, Vinita; Pauszek, Steven J.; Matthews, Krystal; Dilley, Kari A.; Mullan, Clancy; McGraw, Jennifer; Khayat, Michael; Beeri, Karen; Yee, Anthony; Dugan, Vivien; Heise, Mark T.; Frieman, Matthew B.; Rodriguez, Luis L.; Bernard, Kristen A.; Wentworth, David E.
2016-01-01
ABSTRACT Several biosafety level 3 and/or 4 (BSL-3/4) pathogens are high-consequence, single-stranded RNA viruses, and their genomes, when introduced into permissive cells, are infectious. Moreover, many of these viruses are select agents (SAs), and their genomes are also considered SAs. For this reason, cDNAs and/or their derivatives must be tested to ensure the absence of infectious virus and/or viral RNA before transfer out of the BSL-3/4 and/or SA laboratory. This tremendously limits the capacity to conduct viral genomic research, particularly the application of next-generation sequencing (NGS). Here, we present a sequence-independent method to rapidly amplify viral genomic RNA while simultaneously abolishing both viral and genomic RNA infectivity across multiple single-stranded positive-sense RNA (ssRNA+) virus families. The process generates barcoded DNA amplicons that range in length from 300 to 1,000 bp, which cannot be used to rescue a virus and are stable to transport at room temperature. Our barcoding approach allows for up to 288 barcoded samples to be pooled into a single library and run across various NGS platforms without potential reconstitution of the viral genome. Our data demonstrate that this approach provides full-length genomic sequence information not only from high-titer virion preparations but it can also recover specific viral sequence from samples with limited starting material in the background of cellular RNA, and it can be used to identify pathogens from unknown samples. In summary, we describe a rapid, universal standard operating procedure that generates high-quality NGS libraries free of infectious virus and infectious viral RNA. IMPORTANCE This report establishes and validates a standard operating procedure (SOP) for select agents (SAs) and other biosafety level 3 and/or 4 (BSL-3/4) RNA viruses to rapidly generate noninfectious, barcoded cDNA amenable for next-generation sequencing (NGS). 
This eliminates the burden of testing all processed samples derived from high-consequence pathogens prior to transfer from high-containment laboratories to lower-containment facilities for sequencing. Our established protocol can be scaled up for high-throughput sequencing of hundreds of samples simultaneously, which can dramatically reduce the cost and effort required for NGS library construction. NGS data from this SOP can provide complete genome coverage from viral stocks and can also detect virus-specific reads from limited starting material. Our data suggest that the procedure can be implemented and easily validated by institutional biosafety committees across research laboratories. PMID:27822536
Establishing a community-wide DNA barcode library as a new tool for arctic research.
Wirta, H; Várkonyi, G; Rasmussen, C; Kaartinen, R; Schmidt, N M; Hebert, P D N; Barták, M; Blagoev, G; Disney, H; Ertl, S; Gjelstrup, P; Gwiazdowicz, D J; Huldén, L; Ilmonen, J; Jakovlev, J; Jaschhof, M; Kahanpää, J; Kankaanpää, T; Krogh, P H; Labbee, R; Lettner, C; Michelsen, V; Nielsen, S A; Nielsen, T R; Paasivirta, L; Pedersen, S; Pohjoismäki, J; Salmela, J; Vilkamaa, P; Väre, H; von Tschirnhaus, M; Roslin, T
2016-05-01
DNA sequences offer powerful tools for describing the members and interactions of natural communities. In this study, we establish the to-date most comprehensive library of DNA barcodes for a terrestrial site, including all known macroscopic animals and vascular plants of an intensively studied area of the High Arctic, the Zackenberg Valley in Northeast Greenland. To demonstrate its utility, we apply the library to identify nearly 20 000 arthropod individuals from two Malaise traps, each operated for two summers. Drawing on this material, we estimate the coverage of previous morphology-based species inventories, derive a snapshot of faunal turnover in space and time and describe the abundance and phenology of species in the rapidly changing arctic environment. Overall, 403 terrestrial animal and 160 vascular plant species were recorded by morphology-based techniques. DNA barcodes (CO1) offered high resolution in discriminating among the local animal taxa, with 92% of morphologically distinguishable taxa assigned to unique Barcode Index Numbers (BINs) and 93% to monophyletic clusters. For vascular plants, resolution was lower, with 54% of species forming monophyletic clusters based on barcode regions rbcLa and ITS2. Malaise catches revealed 122 BINs not detected by previous sampling and DNA barcoding. The insect community was dominated by a few highly abundant taxa. Even closely related taxa differed in phenology, emphasizing the need for species-level resolution when describing ongoing shifts in arctic communities and ecosystems. The DNA barcode library now established for Zackenberg offers new scope for such explorations, and for the detailed dissection of interspecific interactions throughout the community. © 2015 John Wiley & Sons Ltd.
The Microbial Ferrous Wheel in a Neutral pH Groundwater Seep
Roden, Eric E.; McBeth, Joyce M.; Blöthe, Marco; Percak-Dennett, Elizabeth M.; Fleming, Emily J.; Holyoke, Rebecca R.; Luther, George W.; Emerson, David; Schieber, Juergen
2012-01-01
Evidence for microbial Fe redox cycling was documented in a circumneutral pH groundwater seep near Bloomington, Indiana. Geochemical and microbiological analyses were conducted at two sites, a semi-consolidated microbial mat and a floating puffball structure. In situ voltammetric microelectrode measurements revealed steep opposing gradients of O2 and Fe(II) at both sites, similar to other groundwater seep and sedimentary environments known to support microbial Fe redox cycling. The puffball structure showed an abrupt increase in dissolved Fe(II) just at its surface (∼5 cm depth), suggesting an internal Fe(II) source coupled to active Fe(III) reduction. Most probable number enumerations detected microaerophilic Fe(II)-oxidizing bacteria (FeOB) and dissimilatory Fe(III)-reducing bacteria (FeRB) at densities of 10^2 to 10^5 cells mL^-1 in samples from both sites. In vitro Fe(III) reduction experiments revealed the potential for immediate reduction (no lag period) of native Fe(III) oxides. Conventional full-length 16S rRNA gene clone libraries were compared with high throughput barcode sequencing of the V1, V4, or V6 variable regions of 16S rRNA genes in order to evaluate the extent to which new sequencing approaches could provide enhanced insight into the composition of Fe redox cycling microbial community structure. The composition of the clone libraries suggested a lithotroph-dominated microbial community centered around taxa related to known FeOB (e.g., Gallionella, Sideroxydans, Aquabacterium). Sequences related to recognized FeRB (e.g., Rhodoferax, Aeromonas, Geobacter, Desulfovibrio) were also well-represented. Overall, sequences related to known FeOB and FeRB accounted for 88 and 59% of total clone sequences in the mat and puffball libraries, respectively. Taxa identified in the barcode libraries showed partial overlap with the clone libraries, but were not always consistent across different variable regions and sequencing platforms. However, the barcode libraries provided confirmation of key clone library results (e.g., the predominance of Betaproteobacteria) and an expanded view of lithotrophic microbial community composition. PMID:22783228
Huemer, Peter; Mutanen, Marko; Sefc, Kristina M; Hebert, Paul D N
2014-01-01
This study examines the performance of DNA barcodes (mt cytochrome c oxidase 1 gene) in the identification of 1004 species of Lepidoptera shared by two localities (Finland, Austria) that are 1600 km apart. Maximum intraspecific distances for the pooled data were less than 2% for 880 species (87.6%), while deeper divergence was detected in 124 species. Despite such variation, the overall DNA barcode library possessed diagnostic COI sequences for 98.8% of the taxa. Because a reference library based on Finnish specimens was highly effective in identifying specimens from Austria, we conclude that barcode libraries based on regional sampling can often be effective for a much larger area. Moreover, dispersal ability (poor, good) and distribution patterns (disjunct, fragmented, continuous, migratory) had little impact on levels of intraspecific geographic divergence. Furthermore, the present study revealed that, despite the intensity of past taxonomic work on European Lepidoptera, nearly 20% of the species shared by Austria and Finland require further work to clarify their status. Particularly discordant BIN (Barcode Index Number) cases should be checked to ascertain possible explanatory factors such as incorrect taxonomy, hybridization, introgression, and Wolbachia infections.
Raupach, Michael J.; Hannig, Karsten; Moriniére, Jérôme; Hendrich, Lars
2018-01-01
Abstract The genus Amara Bonelli, 1810 is a very speciose and taxonomically difficult genus of the Carabidae. The identification of many of the species is accomplished with considerable difficulty, in particular for females and immature stages. In this study the effectiveness of DNA barcoding, the most popular method for molecular species identification, was examined to discriminate various species of this genus from Central Europe. DNA barcodes from 690 individuals and 47 species were analysed, including sequences from previous studies and more than 350 newly generated DNA barcodes. Our analysis revealed unique BINs for 38 species (81%). Interspecific K2P distances below 2.2% were found for three species pairs and one species trio, including haplotype sharing between Amara alpina/Amara torrida and Amara communis/Amara convexior/Amara makolskii. This study represents another step in generating an extensive reference library of DNA barcodes for carabids, highly valuable bioindicators for characterizing disturbances in various habitats. PMID:29853775
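The K2P (Kimura 2-parameter) distances quoted in the abstract above can be illustrated with a minimal sketch. It assumes two already aligned sequences of equal length and uses the standard K2P formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function name and the decision to skip gaps and ambiguous bases are illustrative assumptions, not the authors' pipeline.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence
    has a gap or an ambiguous base are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument not positive)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Under this sketch, a pair of barcodes would fall below the 2.2% threshold mentioned above when `k2p_distance(a, b) < 0.022`.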
Plans and progress for building a Great Lakes fauna DNA barcode reference library
DNA reference libraries provide researchers with an important tool for assessing regional biodiversity by allowing unknown genetic sequences to be assigned identities, while also providing a means for taxonomists to validate identifications. Expanding the representation of Great...
Highlights of DNA Barcoding in identification of salient microorganisms like fungi.
Dulla, E L; Kathera, C; Gurijala, H K; Mallakuntla, T R; Srinivasan, P; Prasad, V; Mopati, R D; Jasti, P K
2016-12-01
Fungi, the second largest kingdom of eukaryotic life, are diverse and widespread. Fungi play a distinctive role in the industrial-scale production of enzymes, antibiotics, fermented foods and other products that offer storage stability and improved health and help to meet major global challenges. To utilize fungi fully for human needs, and to pave the way for a healthy relationship with them, it is important to identify them quickly and robustly with a molecular-based identification system. DNA Barcoding is a technique that aims to provide a well-organized method for species-level identification and to contribute powerfully to taxonomic and biodiversity research. DNA Barcoding is generally achieved by retrieving a short DNA sequence - the 'barcode' - from a standard part of the genome; that barcode is then compared with a library of reference barcode sequences derived from individuals of known identity. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Defining the ABC of gene essentiality in streptococci.
Charbonneau, Amelia R L; Forman, Oliver P; Cain, Amy K; Newland, Graham; Robinson, Carl; Boursnell, Mike; Parkhill, Julian; Leigh, James A; Maskell, Duncan J; Waller, Andrew S
2017-05-31
Utilising next generation sequencing to interrogate saturated bacterial mutant libraries provides unprecedented information for the assignment of genome-wide gene essentiality. Exposure of saturated mutant libraries to specific conditions and subsequent sequencing can be exploited to uncover gene essentiality relevant to the condition. Here we present a barcoded transposon directed insertion-site sequencing (TraDIS) system to define an essential gene list for Streptococcus equi subsp. equi, the causative agent of strangles in horses, for the first time. The gene essentiality data for this group C Streptococcus was compared to that of group A and B streptococci. Six barcoded variants of pGh9:ISS1 were designed and used to generate mutant libraries containing between 33,000-66,000 unique mutants. TraDIS was performed on DNA extracted from each library and data were analysed separately and as a combined master pool. Gene essentiality determined that 19.5% of the S. equi genome was essential. Gene essentialities were compared to those of group A and group B streptococci, identifying concordances of 90.2% and 89.4%, respectively and an overall concordance of 83.7% between the three species. The use of barcoded pGh9:ISS1 to generate mutant libraries provides a highly useful tool for the assignment of gene function in S. equi and other streptococci. The shared essential gene set of group A, B and C streptococci provides further evidence of the close genetic relationships between these important pathogenic bacteria. Therefore, the ABC of gene essentiality reported here provides a solid foundation towards reporting the functional genome of streptococci.
2015-11-16
detailed discussion of barcode designs in Supplementary Note 1, Supplementary Fig. 1 and sequences in Supplementary Note 2). Whereas the nicking and...eight subpools, each as a one- or as a two-barcode version ( design details in Supplementary Note 1). All subpools amplified strands with the expected...for the c2ca designs . We used the same restriction enzymes (Nb.BsrDI and Nt.BspQI) that were encoded between the primers and the target sequences to
Functional Analysis With a Barcoder Yeast Gene Overexpression System
Douglas, Alison C.; Smith, Andrew M.; Sharifpoor, Sara; Yan, Zhun; Durbic, Tanja; Heisler, Lawrence E.; Lee, Anna Y.; Ryan, Owen; Göttert, Hendrikje; Surendra, Anu; van Dyk, Dewald; Giaever, Guri; Boone, Charles; Nislow, Corey; Andrews, Brenda J.
2012-01-01
Systematic analysis of gene overexpression phenotypes provides an insight into gene function, enzyme targets, and biological pathways. Here, we describe a novel functional genomics platform that enables a highly parallel and systematic assessment of overexpression phenotypes in pooled cultures. First, we constructed a genome-level collection of ~5100 yeast barcoder strains, each of which carries a unique barcode, enabling pooled fitness assays with a barcode microarray or sequencing readout. Second, we constructed a yeast open reading frame (ORF) galactose-induced overexpression array by generating a genome-wide set of yeast transformants, each of which carries an individual plasmid-borne and sequence-verified ORF derived from the Saccharomyces cerevisiae full-length EXpression-ready (FLEX) collection. We combined these collections genetically using synthetic genetic array methodology, generating ~5100 strains, each of which is barcoded and overexpresses a specific ORF, a set we termed “barFLEX.” Additional synthetic genetic array allows the barFLEX collection to be moved into different genetic backgrounds. As a proof-of-principle, we describe the properties of the barFLEX overexpression collection and its application in synthetic dosage lethality studies under different environmental conditions. PMID:23050238
Competitive Genomic Screens of Barcoded Yeast Libraries
Urbanus, Malene; Proctor, Michael; Heisler, Lawrence E.; Giaever, Guri; Nislow, Corey
2011-01-01
By virtue of advances in next generation sequencing technologies, we have access to new genome sequences almost daily. The tempo of these advances is accelerating, promising greater depth and breadth. In light of these extraordinary advances, the need for fast, parallel methods to define gene function becomes ever more important. Collections of genome-wide deletion mutants in yeasts and E. coli have served as workhorses for the characterization of gene function, but this approach is not scalable: current gene-deletion approaches require each of the thousands of genes that comprise a genome to be deleted and verified. Only after this work is complete can we pursue high-throughput phenotyping. Over the past decade, our laboratory has refined a portfolio of competitive, miniaturized, high-throughput genome-wide assays that can be performed in parallel. This parallelization is possible because of the inclusion of DNA 'tags', or 'barcodes,' into each mutant; the barcode serves as a proxy for the mutation, and its abundance can be measured to assess mutant fitness. In this study, we seek to fill the gap between DNA sequence and barcoded mutant collections. To accomplish this we introduce a combined transposon disruption-barcoding approach that opens up parallel barcode assays to newly sequenced, but poorly characterized microbes. To illustrate this approach we present a new Candida albicans barcoded disruption collection and describe how both microarray-based and next generation sequencing-based platforms can be used to collect 10,000 - 1,000,000 gene-gene and drug-gene interactions in a single experiment. PMID:21860376
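The idea of reading mutant fitness from barcode abundance can be sketched as a log2 ratio of normalized barcode frequencies between a treated pool and a control pool. This is a generic illustration under stated assumptions, not the scoring method of the study: the function name, the pseudocount and the dictionary-based input are all hypothetical.

```python
import math


def barcode_log2_ratios(control_counts, treated_counts, pseudocount=1.0):
    """Log2 ratio of barcode frequency (treated / control) per barcode.

    Both arguments map barcode -> read count. A pseudocount (an
    assumption of this sketch) avoids division by zero for barcodes
    that drop out of one pool. Negative values suggest reduced fitness
    of the tagged mutant under treatment.
    """
    barcodes = set(control_counts) | set(treated_counts)
    control_total = sum(control_counts.values()) + pseudocount * len(barcodes)
    treated_total = sum(treated_counts.values()) + pseudocount * len(barcodes)
    ratios = {}
    for barcode in barcodes:
        control_freq = (control_counts.get(barcode, 0) + pseudocount) / control_total
        treated_freq = (treated_counts.get(barcode, 0) + pseudocount) / treated_total
        ratios[barcode] = math.log2(treated_freq / control_freq)
    return ratios
```

Normalizing to frequencies before taking the ratio keeps the score independent of sequencing depth, which usually differs between pools.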
Lavinia, Pablo D; Núñez Bustos, Ezequiel O; Kopuchian, Cecilia; Lijtmaer, Darío A; García, Natalia C; Hebert, Paul D N; Tubaro, Pablo L
2017-01-01
Because the tropical regions of America harbor the highest concentration of butterfly species, its fauna has attracted considerable attention. Much less is known about the butterflies of southern South America, particularly Argentina, where over 1,200 species occur. To advance understanding of this fauna, we assembled a DNA barcode reference library for 417 butterfly species of Argentina, focusing on the Atlantic Forest, a biodiversity hotspot. We tested the efficacy of this library for specimen identification, used it to assess the frequency of cryptic species, and examined geographic patterns of genetic variation, making this study the first large-scale genetic assessment of the butterflies of southern South America. The average sequence divergence to the nearest neighbor (i.e. minimum interspecific distance) was 6.91%, ten times larger than the mean distance to the furthest conspecific (0.69%), with a clear barcode gap present in all but four of the species represented by two or more specimens. As a consequence, the DNA barcode library was extremely effective in the discrimination of these species, allowing a correct identification in more than 95% of the cases. Singletons (i.e. species represented by a single sequence) were also distinguishable in the gene trees since they all had unique DNA barcodes, divergent from those of the closest non-conspecific. The clustering algorithms implemented recognized from 416 to 444 barcode clusters, suggesting that the actual diversity of butterflies in Argentina is 3%–9% higher than currently recognized. Furthermore, our survey added three new records of butterflies for the country (Eurema agave, Mithras hannelore, Melanis hillapana). In summary, this study not only supported the utility of DNA barcoding for the identification of the butterfly species of Argentina, but also highlighted several cases of both deep intraspecific and shallow interspecific divergence that should be studied in more detail. PMID:29049373
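The barcode gap described above (nearest-neighbor distance exceeding the distance to the furthest conspecific) can be checked per species with a short sketch. It assumes aligned sequences of equal length and uses a simple uncorrected p-distance; the function names and the toy data layout are hypothetical, and the study itself may have used a different distance model.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gaps(records):
    """Return species -> (max_intraspecific, min_interspecific, has_gap).

    records maps species name -> list of aligned barcode sequences and
    must contain at least two species. Singletons get a maximum
    intraspecific distance of 0.0 (an assumption of this sketch).
    """
    result = {}
    for species, seqs in records.items():
        max_intra = max(
            (p_distance(a, b) for a, b in combinations(seqs, 2)),
            default=0.0,
        )
        others = [s for sp, group in records.items() if sp != species for s in group]
        min_inter = min(p_distance(a, b) for a in seqs for b in others)
        result[species] = (max_intra, min_inter, min_inter > max_intra)
    return result


# Toy data, invented purely for illustration
toy = {
    "species_1": ["AAAAAAAAAA", "AAAAAAAAAT"],
    "species_2": ["CCCCCAAAAA"],
}
gaps = barcode_gaps(toy)
```

A species whose third tuple element is `False` would be one of the cases lacking a clear barcode gap.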
DNA barcode analysis of butterfly species from Pakistan points towards regional endemism
Ashfaq, Muhammad; Akhtar, Saleem; Khan, Arif M; Adamowicz, Sarah J; Hebert, Paul D N
2013-01-01
DNA barcodes were obtained for 81 butterfly species belonging to 52 genera from sites in north-central Pakistan to test the utility of barcoding for their identification and to gain a better understanding of regional barcode variation. These species represent 25% of the butterfly fauna of Pakistan and belong to five families, although the Nymphalidae were dominant, comprising 38% of the total specimens. Barcode analysis showed that maximum conspecific divergence was 1.6%, while there was 1.7–14.3% divergence from the nearest neighbour species. Barcode records for 55 species showed <2% sequence divergence to records in the Barcode of Life Data Systems (BOLD), but only 26 of these cases involved specimens from neighbouring India and Central Asia. Analysis revealed that most species showed little incremental sequence variation when specimens from other regions were considered, but a threefold increase was noted in a few cases. There was a clear gap between maximum intraspecific and minimum nearest neighbour distance for all 81 species. Neighbour-joining cluster analysis showed that members of each species formed a monophyletic cluster with strong bootstrap support. The barcode results revealed two provisional species that could not be clearly linked to known taxa, while 24 other species gained their first coverage. Future work should extend the barcode reference library to include all butterfly species from Pakistan as well as neighbouring countries to gain a better understanding of regional variation in barcode sequences in this topographically and climatically complex region. PMID:23789612
Genome-wide mapping of autonomous promoter activity in human cells
van Arensbergen, Joris; FitzPatrick, Vincent D.; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J.; van Steensel, Bas
2017-01-01
Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of sequences that could be tested. Here we present Survey of Regulatory Elements (SuRE), a method to assay more than 10^8 DNA fragments, each 0.2–2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library is constructed of random genomic fragments upstream of a 20 bp barcode and decoded by paired-end sequencing. This library is then transfected into cells and transcribed barcodes are quantified in the RNA by high throughput sequencing. When applied to the human genome, we achieved a 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide. By computational modeling we delineated subregions within promoters that are relevant for their activity. For instance, we show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites. PMID:28024146
Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun
2017-01-03
Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
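How molecular barcodes help separate true variants from amplification and sequencing errors can be sketched with a simple majority-vote consensus per barcode family. This is deliberately not smCounter's Bayesian model, which the abstract does not specify in detail; it is a plain consensus illustration, and the function name, the `min_family_size` parameter and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=2):
    """Collapse reads sharing a molecular barcode into one consensus.

    reads is an iterable of (barcode, sequence) pairs; sequences within
    a barcode family are assumed to be aligned and of equal length.
    Families smaller than min_family_size are dropped because a single
    read cannot outvote its own errors. Ties are resolved in favor of
    the base seen first.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        if any(len(s) != len(seqs[0]) for s in seqs):
            raise ValueError(f"unequal read lengths in family {barcode}")
        # Majority vote at each position across the family
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy reads, invented for illustration: one read in family AAC carries an error
toy_reads = [
    ("AAC", "ACGT"),
    ("AAC", "ACGT"),
    ("AAC", "ACTT"),
    ("GGT", "TTTT"),
]
collapsed = consensus_by_barcode(toy_reads)
```

An error that appears in only one read of a family is outvoted, whereas a true variant present in the original molecule appears in every read of that family.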
Hajibabaei, Mehrdad; Shokralla, Shadi; Zhou, Xin; Singer, Gregory A. C.; Baird, Donald J.
2011-01-01
Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. 
Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs. PMID:21533287
Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Rao, A R
2016-11-05
DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short and standardized fragment of DNA. To this end, this study develops a computational approach for identifying a species by comparing its barcode with the barcode sequences of known species present in the reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies, and the Random Forest methodology was then applied to the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based, and diagnostic-based approaches and was found comparable with existing supervised learning-based approaches in terms of species identification success rate when compared using real and simulated datasets. Based on the proposed approach, an online web interface, SPIDBAR, has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by taxonomists. Copyright © 2016 Elsevier B.V. All rights reserved.
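The mapping from a barcode sequence to a k-mer frequency vector could look roughly like the sketch below. The choice of k, the lexicographic ordering of k-mers, and the handling of ambiguous bases are assumptions, not details taken from the abstract; the resulting vectors are the kind of input a random forest classifier would then be trained on.

```python
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return relative k-mer frequencies of `seq` in lexicographic k-mer order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # windows containing N or other ambiguity codes are skipped
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


# Toy sequence (hypothetical): 7 overlapping dinucleotides, 16 possible features.
vector = kmer_frequencies("ACGTACGT", k=2)
```

Every sequence maps to a vector of the same length (4^k), regardless of barcode length, which is what makes the representation usable by a standard classifier.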
DNA barcode analysis of butterfly species from Pakistan points towards regional endemism.
Ashfaq, Muhammad; Akhtar, Saleem; Khan, Arif M; Adamowicz, Sarah J; Hebert, Paul D N
2013-09-01
DNA barcodes were obtained for 81 butterfly species belonging to 52 genera from sites in north-central Pakistan to test the utility of barcoding for their identification and to gain a better understanding of regional barcode variation. These species represent 25% of the butterfly fauna of Pakistan and belong to five families, although the Nymphalidae were dominant, comprising 38% of the total specimens. Barcode analysis showed that maximum conspecific divergence was 1.6%, while there was 1.7-14.3% divergence from the nearest neighbour species. Barcode records for 55 species showed <2% sequence divergence to records in the Barcode of Life Data Systems (BOLD), but only 26 of these cases involved specimens from neighbouring India and Central Asia. Analysis revealed that most species showed little incremental sequence variation when specimens from other regions were considered, but a threefold increase was noted in a few cases. There was a clear gap between maximum intraspecific and minimum nearest neighbour distance for all 81 species. Neighbour-joining cluster analysis showed that members of each species formed a monophyletic cluster with strong bootstrap support. The barcode results revealed two provisional species that could not be clearly linked to known taxa, while 24 other species gained their first coverage. Future work should extend the barcode reference library to include all butterfly species from Pakistan as well as neighbouring countries to gain a better understanding of regional variation in barcode sequences in this topographically and climatically complex region. © 2013 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd.
Mulcahy, Daniel G.; Vanthomme, Hadrien; Tobi, Elie; Wynn, Addison H.; Zimkus, Breda M.; McDiarmid, Roy W.
2017-01-01
Development projects in west Central Africa are proceeding at an unprecedented rate, often with little concern for their effects on biodiversity. In an attempt to better understand potential impacts of a road development project on the anuran amphibian community, we conducted a biodiversity assessment employing multiple methodologies (visual encounter transects, auditory surveys, leaf litter plots and pitfall traps) to inventory species prior to construction of a new road within the buffer zone of Moukalaba-Doudou National Park, Gabon. Because of difficulties in morphological identification and taxonomic uncertainty of amphibian species observed in the area, we integrated a DNA barcoding analysis into the project to improve the overall quality and accuracy of the species inventory. Based on morphology alone, 48 species were recognized in the field and voucher specimens of each were collected. We used tissue samples from specimens collected at our field site, material available from amphibians collected in other parts of Gabon and the Republic of Congo to initiate a DNA barcode library for west Central African amphibians. We then compared our sequences with material in GenBank for the genera recorded at the study site to assist in identifications. The resulting COI and 16S barcode library allowed us to update the number of species documented at the study site to 28, thereby providing a more accurate assessment of diversity and distributions. We caution that because sequence data maintained in GenBank are often poorly curated by the original submitters and cannot be amended by third-parties, these data have limited utility for identification purposes. 
Nevertheless, the use of DNA barcoding is likely to benefit biodiversity inventories and long-term monitoring, particularly for taxa that can be difficult to identify based on morphology alone; likewise, inventory and monitoring programs can contribute invaluable data to the DNA barcode library and the taxonomy of complex groups. Our methods provide an example of how non-taxonomists and parataxonomists working in understudied parts of the world with limited geographic sampling and comparative morphological material can use DNA barcoding and publicly available sequence data (GenBank) to rapidly identify the number of species and assign tentative names to aid in urgent conservation management actions and contribute to taxonomic resolution. PMID:29131846
An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening
MacConnell, Andrew B; Price, Alexander K; Paegel, Brian M
2017-03-13
DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing. PMID:28199790
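The reported 24% false discovery rate follows from the bead counts given above, and the replicate filter ("observed on >2 beads") is simple to express. The sketch below reproduces the arithmetic and shows one possible form of the filter; the function names and the toy bead-to-sequence mapping are hypothetical.

```python
from collections import Counter


def false_discovery_rate(true_hits, false_hits):
    """Fraction of collected hits that are false positives."""
    return false_hits / (true_hits + false_hits)


def replicate_filtered(bead_to_sequence, min_beads=3):
    """Keep encoding sequences seen on at least `min_beads` distinct beads
    (">2 beads" in the abstract corresponds to min_beads=3)."""
    beads_per_sequence = Counter(bead_to_sequence.values())
    return {seq for seq, n in beads_per_sequence.items() if n >= min_beads}


# Counts from the abstract: 1166 pepstatin A beads, 366 negative control beads.
fdr = false_discovery_rate(1166, 366)  # 366 / 1532, about 0.239

# Hypothetical toy data: bead barcode -> compound-encoding sequence.
toy_beads = {"b1": "seqA", "b2": "seqA", "b3": "seqA", "b4": "seqB"}
kept = replicate_filtered(toy_beads)
```

Requiring replicate beads trades some sensitivity for specificity: a sequence that appears on a single sorted bead is more likely to be a sorting error than one that recurs on several independent beads.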
Wilson, John-James; Sing, Kong-Wah; Sofian-Azirun, Mohd
2013-01-01
The objective of this study was to build a DNA barcode reference library for the true butterflies of Peninsula Malaysia and assess the value of attaching subspecies names to DNA barcode records. A new DNA barcode library was constructed with butterflies from the Museum of Zoology, University of Malaya collection. The library was analysed in conjunction with publicly available DNA barcodes from other Asia-Pacific localities to test the ability of the DNA barcodes to discriminate species and subspecies. Analyses confirmed the capacity of the new DNA barcode reference library to distinguish the vast majority of species (92%) and revealed that most subspecies possessed unique DNA barcodes (84%). In some cases conspecific subspecies exhibited genetic distances between their DNA barcodes that are typically seen between species, and these were often taxa that have previously been regarded as full species. Subspecies designations as shorthand for geographically and morphologically differentiated groups provide a useful heuristic for assessing how such groups correlate with clustering patterns of DNA barcodes, especially as the number of DNA barcodes per species in reference libraries increases. Our study demonstrates the value in attaching subspecies names to DNA barcode records as they can reveal a history of taxonomic concepts and expose important units of biodiversity. PMID:24282514
Multiplex single-molecule interaction profiling of DNA barcoded proteins
Gu, Liangcai; Li, Chao; Aach, John; Hill, David E.; Vidal, Marc; Church, George M.
2014-01-01
In contrast with advances in massively parallel DNA sequencing [1], high-throughput protein analyses [2-4] are often limited by ensemble measurements, individual analyte purification and hence compromised quality and cost-effectiveness. Single-molecule (SM) protein detection achieved using optical methods [5] is limited by the number of spectrally nonoverlapping chromophores. Here, we introduce a single molecular interaction-sequencing (SMI-Seq) technology for parallel protein interaction profiling leveraging SM advantages. DNA barcodes are attached to proteins collectively via ribosome display [6] or individually via enzymatic conjugation. Barcoded proteins are assayed en masse in aqueous solution and subsequently immobilized in a polyacrylamide (PAA) thin film to construct a random SM array, where barcoding DNAs are amplified into in situ polymerase colonies (polonies) [7] and analyzed by DNA sequencing. This method allows precise quantification of various proteins with a theoretical maximum array density of over one million polonies per square millimeter. Furthermore, protein interactions can be measured based on the statistics of colocalized polonies arising from barcoding DNAs of interacting proteins. Two demanding applications, G-protein coupled receptor (GPCR) and antibody binding profiling, were demonstrated. SMI-Seq enables “library vs. library” screening in a one-pot assay, simultaneously interrogating molecular binding affinity and specificity. PMID:25252978
Lee, Shiou Yih; Ng, Wei Lun; Mahat, Mohd Noor; Nazre, Mohd; Mohamed, Rozi
2016-01-01
The identification of Aquilaria species from their resinous non-wood product, the agarwood, is challenging as conventional techniques alone are unable to ascertain the species origin. Aquilaria is a highly protected species due to the excessive exploitation of its precious agarwood. Here, we applied the DNA barcoding technique to generate barcode sequences for Aquilaria species and later applied the barcodes to identify the source species of agarwood found in the market. We developed a reference DNA barcode library using eight candidate barcode loci (matK, rbcL, rpoB, rpoC1, psbA-trnH, trnL-trnF, ITS, and ITS2) amplified from 24 leaf accessions of seven Aquilaria species obtained from living trees. Our results indicated that all single barcodes can be easily amplified and sequenced with the selected primers. The combination of trnL-trnF+ITS and trnL-trnF+ITS2 yielded the greatest species resolution using the least number of loci combination, while matK+trnL-trnF+ITS showed potential in detecting the geographical origins of Aquilaria species. We propose trnL-trnF+ITS2 as the best candidate barcode for Aquilaria as ITS2 has a shorter sequence length compared to ITS, which eases PCR amplification especially when using degraded DNA samples such as those extracted from processed agarwood products. A blind test conducted on eight agarwood samples in different forms using the proposed barcode combination proved successful in their identification up to the species level. Such potential of DNA barcoding in identifying the source species of agarwood will contribute to the international timber trade control, by providing an effective method for species identification and product authentication. PMID:27128309
Barcoding and Border Biosecurity: Identifying Cyprinid Fishes in the Aquarium Trade
Collins, Rupert A.; Armstrong, Karen F.; Meier, Rudolf; Yi, Youguang; Brown, Samuel D. J.; Cruickshank, Robert H.; Keeling, Suzanne; Johnston, Colin
2012-01-01
Background: Poorly regulated international trade in ornamental fishes poses risks to both biodiversity and economic activity via invasive alien species and exotic pathogens. Border security officials need robust tools to confirm identifications, often requiring hard-to-obtain taxonomic literature and expertise. DNA barcoding offers a potentially attractive tool for quarantine inspection, but has yet to be scrutinised for aquarium fishes. Here, we present a barcoding approach for ornamental cyprinid fishes by: (1) expanding current barcode reference libraries; (2) assessing barcode congruence with morphological identifications under numerous scenarios (e.g. inclusion of GenBank data, presence of singleton species, choice of analytical method); and (3) providing supplementary information to identify difficult species. Methodology/Principal Findings: We sampled 172 ornamental cyprinid fish species from the international trade, and provide data for 91 species currently unrepresented in reference libraries (GenBank/Bold). DNA barcodes were found to be highly congruent with our morphological assignments, achieving success rates of 90–99%, depending on the method used (neighbour-joining monophyly, bootstrap, nearest neighbour, GMYC, percent threshold). Inclusion of data from GenBank (additional 157 spp.) resulted in a more comprehensive library, but at a cost to success rate due to the increased number of singleton species. In addition to DNA barcodes, our study also provides supporting data in the form of specimen images, morphological characters, taxonomic bibliography, preserved vouchers, and nuclear rhodopsin sequences. Using this nuclear rhodopsin data we also uncovered evidence of interspecific hybridisation, and highlighted unrecognised diversity within popular aquarium species, including the endangered Indian barb Puntius denisonii.
Conclusions/Significance: We demonstrate that DNA barcoding provides a highly effective biosecurity tool for rapidly identifying ornamental fishes. In cases where DNA barcodes are unable to offer an identification, we improve on previous studies by consolidating supplementary information from multiple data sources, and empower biosecurity agencies to confidently identify high-risk fishes in the aquarium trade. PMID:22276096
DNA barcode data accurately assign higher spider taxa
Coddington, Jonathan A.; Agnarsson, Ingi; Cheng, Ren-Chung; Čandek, Klemen; Driskell, Amy; Frick, Holger; Gregorič, Matjaž; Kostanjšek, Rok; Kropf, Christian; Kweskin, Matthew; Lokovšek, Tjaša; Pipan, Miha; Vidergar, Nina
2016-01-01
The use of unique DNA sequences as a method for taxonomic identification is no longer fundamentally controversial, even though debate continues on the best markers, methods, and technology to use. Although both existing databanks such as GenBank and BOLD, as well as reference taxonomies, are imperfect, in best case scenarios “barcodes” (whether single or multiple, organelle or nuclear, loci) clearly are an increasingly fast and inexpensive method of identification, especially as compared to manual identification of unknowns by increasingly rare expert taxonomists. Because most species on Earth are undescribed, a complete reference database at the species level is impractical in the near term. The question therefore arises whether unidentified species can, using DNA barcodes, be accurately assigned to more inclusive groups such as genera and families—taxonomic ranks of putatively monophyletic groups for which the global inventory is more complete and stable. We used a carefully chosen test library of CO1 sequences from 49 families, 313 genera, and 816 species of spiders to assess the accuracy of genus and family-level assignment. We used BLAST queries of each sequence against the entire library and got the top ten hits. The percent sequence identity was reported from these hits (PIdent, range 75–100%). Accurate assignment of higher taxa (PIdent above which errors totaled less than 5%) occurred for genera at PIdent values >95 and families at PIdent values ≥ 91, suggesting these as heuristic thresholds for accurate generic and familial identifications in spiders. Accuracy of identification increases with numbers of species/genus and genera/family in the library; above five genera per family and fifteen species per genus all higher taxon assignments were correct. We propose that using percent sequence identity between conventional barcode sequences may be a feasible and reasonably accurate method to identify animals to family/genus. 
However, the quality of the underlying database impacts accuracy of results; many outliers in our dataset could be attributed to taxonomic and/or sequencing errors in BOLD and GenBank. It seems that an accurate and complete reference library of families and genera of life could provide accurate higher level taxonomic identifications cheaply and accessibly, within years rather than decades. PMID:27547527
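The heuristic thresholds reported above (PIdent >95 for genera, PIdent ≥ 91 for families) can be written as a small decision rule applied to the percent identity of a BLAST hit. This is a sketch of the thresholds only, derived for spiders; the function name is an assumption, and it does not reproduce the authors' full top-ten-hit procedure.

```python
def assignable_rank(pident):
    """Return the most specific rank a hit supports under the spider heuristics.

    pident is the percent sequence identity of a BLAST hit (0-100).
    """
    if pident > 95:
        return "genus"   # genus-level assignment reported accurate above 95
    if pident >= 91:
        return "family"  # family-level assignment reported accurate at 91 and above
    return None          # below both thresholds: no confident higher-taxon assignment


ranks = [assignable_rank(p) for p in (99.2, 95.0, 91.0, 88.5)]
```

Note the asymmetry in the reported cut-offs: a hit at exactly 95% identity supports only a family-level assignment, whereas a hit at exactly 91% still qualifies for family.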
Ten years of barcoding at the African Centre for DNA Barcoding.
Bezeng, B S; Davies, T J; Daru, B H; Kabongo, R M; Maurin, O; Yessoufou, K; van der Bank, H; van der Bank, M
2017-07-01
The African Centre for DNA Barcoding (ACDB) was established in 2005 as part of a global initiative to accurately and rapidly survey biodiversity using short DNA sequences. The mitochondrial cytochrome c oxidase 1 gene (CO1) was rapidly adopted as the de facto barcode for animals. Following the evaluation of several candidate loci for plants, the Plant Working Group of the Consortium for the Barcoding of Life in 2009 recommended that two plastid genes, rbcLa and matK, be adopted as core DNA barcodes for terrestrial plants. To date, numerous studies continue to test the discriminatory power of these markers across various plant lineages. Over the past decade, we at the ACDB have used these core DNA barcodes to generate a barcode library for southern Africa. To date, the ACDB has contributed more than 21 000 plant barcodes and over 3000 CO1 barcodes for animals to the Barcode of Life Database (BOLD). Building upon this effort, we at the ACDB have addressed questions related to community assembly, biogeography, phylogenetic diversification, and invasion biology. Collectively, our work demonstrates the diverse applications of DNA barcoding in ecology, systematics, evolutionary biology, and conservation.
Library preparation and data analysis packages for rapid genome sequencing.
Pomraning, Kyle R; Smith, Kristina M; Bredeweg, Erin L; Connolly, Lanelle R; Phatale, Pallavi A; Freitag, Michael
2012-01-01
High-throughput sequencing (HTS) has quickly become a valuable tool for comparative genetics and genomics and is now regularly carried out in laboratories that are not connected to large sequencing centers. Here we describe an updated version of our protocol for constructing single- and paired-end Illumina sequencing libraries, beginning with purified genomic DNA. The present protocol can also be used for "multiplexing," i.e. the analysis of several samples in a single flowcell lane by generating "barcoded" or "indexed" Illumina sequencing libraries in a way that is independent from Illumina-supported methods. To analyze sequencing results, we suggest several independent approaches but end users should be aware that this is a quickly evolving field and that currently many alignment (or "mapping") and counting algorithms are being developed and tested.
Genetic Control of Plant Root Colonization by the Biocontrol agent, Pseudomonas fluorescens
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cole, Benjamin J.; Fletcher, Meghan; Waters, Jordan
Plant growth promoting rhizobacteria (PGPR) are a critical component of plant root ecosystems. PGPR promote plant growth by solubilizing inaccessible minerals, suppressing pathogenic microorganisms in the soil, and directly stimulating growth through hormone synthesis. Pseudomonas fluorescens is a well-established PGPR isolated from wheat roots that can also colonize the root system of the model plant, Arabidopsis thaliana. We have created barcoded transposon insertion mutant libraries suitable for genome-wide transposon-mediated mutagenesis followed by sequencing (TnSeq). These libraries consist of over 10^5 independent insertions, collectively providing loss-of-function mutants for nearly all genes in the P. fluorescens genome. Each insertion mutant can be unambiguously identified by a randomized 20-nucleotide sequence (barcode) engineered into the transposon sequence. We used these libraries in a gnotobiotic assay to examine the colonization ability of P. fluorescens on A. thaliana roots. Taking advantage of the ability to distinguish individual colonization events using barcode sequences, we assessed the timing and microbial concentration dependence of colonization of the rhizoplane niche. These data provide direct insight into the dynamics of plant root colonization in an in vivo system and define baseline parameters for the systematic identification of the bacterial genes and molecular pathways using TnSeq assays. Having determined parameters that facilitate potential colonization of roots by thousands of independent insertion mutants in a single assay, we are currently establishing a genome-wide functional map of genes required for root colonization in P. fluorescens.
Importantly, the approach developed and optimized here for P. fluorescens-A. thaliana colonization will be applicable to a wide range of plant-microbe interactions, including biofuel feedstock plants and microbes known or hypothesized to impact on biofuel-relevant traits including biomass productivity and pathogen resistance.
Park, D-S; Suh, S-J; Hebert, P D N; Oh, H-W; Hong, K-J
2011-08-01
Although DNA barcode coverage has grown rapidly for many insect orders, there are some groups, such as scale insects, where sequence recovery has been difficult. However, using a recently developed primer set, we recovered barcode records from 373 specimens, providing coverage for 75 species from 31 genera in two families. Overall success was >90% for mealybugs and >80% for armored scale species. The G·C content was very low in most species, averaging just 16.3%. Sequence divergences (K2P) between congeneric species averaged 10.7%, while intra-specific divergences averaged 0.97%. However, the latter value was inflated by high intra-specific divergence in nine taxa, cases that may indicate species overlooked by current taxonomic treatments. Our study establishes the feasibility of developing a comprehensive barcode library for scale insects and indicates that its construction will both create an effective system for identifying scale insects and reveal taxonomic situations worthy of deeper analysis.
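The two statistics quoted above, G·C content and Kimura two-parameter (K2P) divergence, can be computed from aligned sequences as in the sketch below. The K2P formula used is the standard one, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), with P and Q the proportions of transitions and transversions; the function names and the treatment of gaps and ambiguous bases (skipped) are assumptions.

```python
import math

PURINES = {"A", "G"}


def gc_content(seq):
    """Fraction of unambiguous bases that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (hypothetical): one A->G transition in 10 sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")  # -0.5 * ln(0.8)
gc = gc_content("ATGC")
```

For small divergences K2P is close to the raw proportion of differing sites, but it grows faster as substitutions accumulate because it corrects for multiple hits at the same position.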
Probing planetary biodiversity with DNA barcodes: The Noctuoidea of North America
Lafontaine, J. Donald; Schmidt, B. Christian; deWaard, Jeremy R.; Zakharov, Evgeny V.; Hebert, Paul D. N.
2017-01-01
This study reports the assembly of a DNA barcode reference library for species in the lepidopteran superfamily Noctuoidea from Canada and the USA. Based on the analysis of 69,378 specimens, the library provides coverage for 97.3% of the noctuoid fauna (3565 of 3664 species). In addition to verifying the strong performance of DNA barcodes in the discrimination of these species, the results indicate close congruence between the number of species analyzed (3565) and the number of sequence clusters (3816) recognized by the Barcode Index Number (BIN) system. Distributional patterns across 12 North American ecoregions are examined for the 3251 species that have GPS data while BIN analysis is used to quantify overlap between the noctuoid faunas of North America and other zoogeographic regions. This analysis reveals that 90% of North American noctuoids are endemic and that just 7.5% and 1.8% of BINs are shared with the Neotropics and with the Palearctic, respectively. One third (29) of the latter species are recent introductions and, as expected, they possess low intraspecific divergences. PMID:28570635
Petukhov, Viktor; Guo, Jimin; Baryawno, Ninib; Severe, Nicolas; Scadden, David T; Samsonova, Maria G; Kharchenko, Peter V
2018-06-19
Recent single-cell RNA-seq protocols based on droplet microfluidics use massively multiplexed barcoding to enable simultaneous measurements of transcriptomes for thousands of individual cells. The increasing complexity of such data creates challenges for subsequent computational processing and troubleshooting of these experiments, with few software options currently available. Here, we describe a flexible pipeline for processing droplet-based transcriptome data that implements barcode corrections, classification of cell quality, and diagnostic information about the droplet libraries. We introduce advanced methods for correcting composition bias and sequencing errors affecting cellular and molecular barcodes to provide more accurate estimates of molecular counts in individual cells.
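Barcode correction in droplet data is often introduced with a simple Hamming-distance rule, sketched below: an observed barcode is reassigned to a known barcode only if exactly one known barcode lies within one mismatch. This is an illustrative baseline and not the advanced correction method the abstract describes; the function names and the toy whitelist are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, whitelist, max_dist=1):
    """Map `observed` to a whitelist barcode, or None if absent or ambiguous."""
    if observed in whitelist:
        return observed
    candidates = [
        w for w in whitelist
        if len(w) == len(observed) and hamming(w, observed) <= max_dist
    ]
    # Correct only when the nearest known barcode is unique; otherwise discard.
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical toy whitelist of known cell barcodes.
known = {"AAAA", "CCCC", "AATT"}
results = [correct_barcode(b, known) for b in ("CCCC", "CCCA", "AAAT", "GGGG")]
```

In the toy run, `AAAT` is one mismatch from both `AAAA` and `AATT`, so it is left uncorrected rather than assigned arbitrarily, which is the kind of ambiguity that motivates probabilistic approaches.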
Morgan, Benjamin S. T.; Egerton-Warburton, Louise M.
2017-01-01
Premise of the study: Arbuscular mycorrhizal fungi (AMF) are globally important root symbioses that enhance plant growth and nutrition and influence ecosystem structure and function. To better characterize levels of AMF diversity relevant to ecosystem function, deeper sequencing depth in environmental samples is needed. In this study, Illumina barcoded primers and a bioinformatics pipeline were developed and applied to study AMF diversity and community structure in environmental samples. Methods: Libraries of small subunit ribosomal RNA fragment amplicons were amplified from environmental DNA using a single-step PCR reaction with barcoded NS31/AML2 primers. Amplicons were sequenced on an Illumina MiSeq sequencer using version 2, 2 × 250-bp paired-end chemistry, and analyzed using QIIME and RDP Classifier. Results: Sequencing captured 196 to 6416 operational taxonomic units (OTUs; depending on clustering parameters) representing nine AMF genera. Regardless of clustering parameters, ∼20 OTUs dominated AMF communities (78–87% reads) with the remaining reads distributed among other OTUs. Analyses also showed significant biogeographic differences in AMF communities and that community composition could be linked to specific edaphic factors. Discussion: Barcoded NS31/AML2 primers and Illumina MiSeq sequencing provide a powerful approach to address AMF diversity and variations in fungal assemblages across host plants, ecosystems, and responses to environmental drivers including global change. PMID:28924511
Untangling taxonomy: a DNA barcode reference library for Canadian spiders.
Blagoev, Gergin A; deWaard, Jeremy R; Ratnasingham, Sujeevan; deWaard, Stephanie L; Lu, Liuqiong; Robertson, James; Telfer, Angela C; Hebert, Paul D N
2016-01-01
Approximately 1460 species of spiders have been reported from Canada, 3% of the global fauna. This study provides a DNA barcode reference library for 1018 of these species based upon the analysis of more than 30,000 specimens. The sequence results show a clear barcode gap in most cases with a mean intraspecific divergence of 0.78% vs. a minimum nearest-neighbour (NN) distance averaging 7.85%. The sequences were assigned to 1359 Barcode index numbers (BINs) with 1344 of these BINs composed of specimens belonging to a single currently recognized species. There was a perfect correspondence between BIN membership and a known species in 795 cases, while another 197 species were assigned to two or more BINs (556 in total). A few other species (26) were involved in BIN merges or in a combination of merges and splits. There was only a weak relationship between the number of specimens analysed for a species and its BIN count. However, three species were clear outliers with their specimens being placed in 11-22 BINs. Although all BIN splits need further study to clarify the taxonomic status of the entities involved, DNA barcodes discriminated 98% of the 1018 species. The present survey conservatively revealed 16 species new to science, 52 species new to Canada and major range extensions for 426 species. However, if most BIN splits detected in this study reflect cryptic taxa, the true species count for Canadian spiders could be 30-50% higher than currently recognized. © 2015 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
deWaard, Jeremy R; Mitchell, Andrew; Keena, Melody A; Gopurenko, David; Boykin, Laura M; Armstrong, Karen F; Pogue, Michael G; Lima, Joao; Floyd, Robin; Hanner, Robert H; Humble, Leland M
2010-12-09
Detecting and controlling the movements of invasive species, such as insect pests, relies upon rapid and accurate species identification in order to initiate containment procedures by the appropriate authorities. Many species in the tussock moth genus Lymantria are significant forestry pests, including the gypsy moth Lymantria dispar L., and consequently have been a focus for the development of molecular diagnostic tools to assist in identifying species and source populations. In this study we expand the taxonomic and geographic coverage of the DNA barcode reference library, and further test the utility of this diagnostic method, both for species/subspecies assignment and for determination of geographic provenance of populations. Cytochrome oxidase I (COI) barcodes were obtained from 518 individuals and 36 species of Lymantria, including sequences assembled and generated from previous studies, vouchered material in public collections, and intercepted specimens obtained from surveillance programs in Canada. A maximum likelihood tree was constructed, revealing high bootstrap support for 90% of species clusters. Bayesian species assignment was also tested, and resulted in correct assignment to species and subspecies in all instances. The performance of barcoding was also compared against the commonly employed NB restriction digest system (also based on COI); while the latter is informative for discriminating gypsy moth subspecies, COI barcode sequences provide greater resolution and generality by encompassing a greater number of haplotypes across all Lymantria species, none shared between species. This study demonstrates the efficacy of DNA barcodes for diagnosing species of Lymantria and reinforces the view that the approach is an under-utilized resource with substantial potential for biosecurity and surveillance. 
Biomonitoring agencies currently employing the NB restriction digest system would gather more information by transitioning to the use of DNA barcoding, a change which could be made relatively seamlessly as the same gene region underlies both protocols.
2011-01-01
Background When a specimen belongs to a species not yet represented in DNA barcode reference libraries there is disagreement over the effectiveness of using sequence comparisons to assign the query accurately to a higher taxon. Library completeness and the assignment criteria used have been proposed as critical factors affecting the accuracy of such assignments but have not been thoroughly investigated. We explored the accuracy of assignments to genus, tribe and subfamily in the Sphingidae, using the almost complete global DNA barcode reference library (1095 species) available for this family. Costa Rican sphingids (118 species), a well-documented, diverse subset of the family, with each of the tribes and subfamilies represented were used as queries. We simulated libraries with different levels of completeness (10-100% of the available species), and recorded assignments (positive or ambiguous) and their accuracy (true or false) under six criteria. Results A liberal tree-based criterion assigned 83% of queries accurately to genus, 74% to tribe and 90% to subfamily, compared to a strict tree-based criterion, which assigned 75% of queries accurately to genus, 66% to tribe and 84% to subfamily, with a library containing 100% of available species (but excluding the species of the query). The greater number of true positives delivered by more relaxed criteria was negatively balanced by the occurrence of more false positives. This effect was most sharply observed with libraries of the lowest completeness where, for example at the genus level, 32% of assignments were false positives with the liberal criterion versus < 1% when using the strict. We observed little difference (< 8% using the liberal criterion) however, in the overall accuracy of the assignments between the lowest and highest levels of library completeness at the tribe and subfamily level. 
Conclusions Our results suggest that when using a strict tree-based criterion for higher taxon assignment with DNA barcodes, the likelihood of assigning a query a genus name incorrectly is very low, if a genus name is provided it has a high likelihood of being accurate, and if no genus match is available the query can nevertheless be assigned to a subfamily with high accuracy regardless of library completeness. DNA barcoding often correctly assigned sphingid moths to higher taxa when species matches were unavailable, suggesting that barcode reference libraries can be useful for higher taxon assignments long before they achieve complete species coverage. PMID:21806794
Raupach, Michael J; Barco, Andrea; Steinke, Dirk; Beermann, Jan; Laakmann, Silke; Mohrbeck, Inga; Neumann, Hermann; Kihara, Terue C; Pointner, Karin; Radulovici, Adriana; Segelken-Voigt, Alexandra; Wesse, Christina; Knebelsberger, Thomas
2015-01-01
In recent years, DNA barcoding has become a popular method of choice for molecular specimen identification. Here we present a comprehensive DNA barcode library of various crustacean taxa found in the North Sea, one of the most extensively studied marine regions of the world. Our data set includes 1,332 barcodes covering 205 species, including taxa of the Amphipoda, Copepoda, Decapoda, Isopoda, Thecostraca, and others. This dataset represents the most extensive DNA barcode library of the Crustacea in terms of species number to date. By using the Barcode of Life Data Systems (BOLD), unique BINs were identified for 198 (96.6%) of the analyzed species. Six species were characterized by two BINs (2.9%), and three BINs were found for the amphipod species Gammarus salinus Spooner, 1947 (0.4%). Intraspecific distances with values higher than 2.2% were revealed for 13 species (6.3%). Exceptionally high distances of up to 14.87% between two distinct but monophyletic clusters were found for the parasitic copepod Caligus elongatus Nordmann, 1832, supporting the results of previous studies that indicated the existence of an overlooked sea louse species. In contrast to these high distances, haplotype-sharing was observed for two decapod spider crab species, Macropodia parva Van Noort & Adema, 1985 and Macropodia rostrata (Linnaeus, 1761), underlining the need for a taxonomic revision of both species. Summarizing the results, our study confirms the application of DNA barcodes as a highly effective identification system for the analyzed marine crustaceans of the North Sea and represents an important milestone for modern biodiversity assessment studies using barcode sequences.
2011-01-01
Genome targeting methods enable cost-effective capture of specific subsets of the genome for sequencing. We present here an automated, highly scalable method for carrying out the Solution Hybrid Selection capture approach that provides a dramatic increase in scale and throughput of sequence-ready libraries produced. Significant process improvements and a series of in-process quality control checkpoints are also added. These process improvements can also be used in a manual version of the protocol. PMID:21205303
Elías-Gutiérrez, Manuel; Valdez-Moreno, Martha; Topan, Janet; Young, Monica R; Cohuo-Colli, José Angel
2018-03-01
Freshwater zooplankton sampling and identification methodologies have remained virtually unchanged since they were first established at the beginning of the 20th century. One major contributing factor to this slow progress is the limited success of modern genetic methodologies, such as DNA barcoding, in several of the main groups. This study demonstrates improved protocols which enable the rapid assessment of most animal taxa inhabiting any freshwater system by combining the use of light traps, careful fixation at low temperatures using ethanol, and zooplankton-specific primers. We DNA-barcoded 2,136 specimens from a diverse array of taxonomic assemblages (rotifers, mollusks, mites, crustaceans, insects, and fishes) from several Canadian and Mexican lakes with an average sequence success rate of 85.3%. In total, 325 Barcode Index Numbers (BINs) were detected with only three BINs (two cladocerans and one copepod) shared between Canada and Mexico, suggesting a much narrower distribution range of freshwater zooplankton than previously thought. This study is the first to broadly explore the metazoan biodiversity of freshwater systems with DNA barcodes to construct a reference library that represents the first step for future programs which aim to monitor ecosystem health, track invasive species, or improve knowledge of the ecology and distribution of freshwater zooplankton.
Nwani, Christopher D; Becker, Sven; Braid, Heather E; Ude, Emmanuel F; Okogwu, Okechukwu I; Hanner, Robert
2011-10-01
Fishes are the main animal protein source for human beings and play a vital role in aquatic ecosystems and food webs. Fish identification can be challenging, especially in the tropics (due to high diversity), and this is particularly true for larval forms or fragmentary remains. DNA barcoding, which uses the 5' region of the mitochondrial cytochrome c oxidase subunit I (COI) as a target gene, is an efficient method for standardized species-level identification for biodiversity assessment and conservation, pending the establishment of reference sequence libraries. In this study, fishes were collected from three rivers in southeastern Nigeria, identified morphologically, and imaged digitally. DNA was extracted, PCR-amplified, and the standard barcode region was bidirectionally sequenced for 363 individuals belonging to 70 species in 38 genera. All specimen provenance data and associated sequence information were recorded in the barcode of life data systems (BOLD; www.barcodinglife.org ). Analytical tools on BOLD were used to assess the performance of barcoding to identify species. Using neighbor-joining distance comparison, the average genetic distance was 60-fold higher between species than within species, as pairwise genetic distance estimates averaged 10.29% among congeners and only 0.17% among conspecifics. Despite low levels of divergence within species, we observed river system-specific haplotype partitioning within eight species (11.4% of all species). Our preliminary results suggest that DNA barcoding is very effective for species identification of Nigerian freshwater fishes.
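The summary figures in this abstract can be checked with simple arithmetic. The sketch below only re-derives the reported ratio and percentage from the numbers stated above; the helper name `fold_difference` is hypothetical.

```python
def fold_difference(between_pct: float, within_pct: float) -> float:
    """Ratio of mean between-species to mean within-species distance."""
    return between_pct / within_pct


# Values reported in the abstract.
mean_congeneric = 10.29    # % average distance among congeners
mean_conspecific = 0.17    # % average distance among conspecifics

fold = fold_difference(mean_congeneric, mean_conspecific)
share_partitioned = 8 / 70 * 100  # species with river-specific haplotypes

print(f"{fold:.1f}-fold")          # consistent with the reported "60-fold"
print(f"{share_partitioned:.1f}%")  # consistent with the reported 11.4%
```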
NASA Astrophysics Data System (ADS)
von Beeren, Christoph; Stoeckle, Mark Y.; Xia, Joyce; Burke, Griffin; Kronauer, Daniel J. C.
2015-02-01
DNA barcoding promises to be a useful tool to identify pest species assuming adequate representation of genetic variants in a reference library. Here we examined mitochondrial DNA barcodes in a global urban pest, the American cockroach (Periplaneta americana). Our sampling effort generated 284 cockroach specimens, most from New York City, plus 15 additional U.S. states and six other countries, enabling the first large-scale survey of P. americana barcode variation. Periplaneta americana barcode sequences (n = 247, including 24 GenBank records) formed a monophyletic lineage separate from other Periplaneta species. We found three distinct P. americana haplogroups with relatively small differences within (≤0.6%) and larger differences among groups (2.4%-4.7%). This could be interpreted as indicative of multiple cryptic species. However, nuclear DNA sequences (n = 77 specimens) revealed extensive gene flow among mitochondrial haplogroups, confirming a single species. This unusual genetic pattern likely reflects multiple introductions from genetically divergent source populations, followed by interbreeding in the invasive range. Our findings highlight the need for comprehensive reference databases in DNA barcoding studies, especially when dealing with invasive populations that might be derived from multiple genetically distinct source populations.
Identification of the vascular plants of Churchill, Manitoba, using a DNA barcode library
2012-01-01
Background Because arctic plant communities are highly vulnerable to climate change, shifts in their composition require rapid, accurate identifications, often for specimens that lack diagnostic floral characters. The present study examines the role that DNA barcoding can play in aiding floristic evaluations in the arctic by testing the effectiveness of the core plant barcode regions (rbcL, matK) and a supplemental ribosomal DNA (ITS2) marker for a well-studied flora near Churchill, Manitoba. Results This investigation examined 900 specimens representing 312 of the 354 species of vascular plants known from Churchill. Sequencing success was high for rbcL: 95% for fresh specimens and 85% for herbarium samples (mean age 20 years). ITS2 worked equally well for the fresh and herbarium material (89% and 88%). However, sequencing success was lower for matK, despite two rounds of PCR amplification, which reflected less effective primer binding and sensitivity to the DNA degradation (76% of fresh, 45% of herbaria samples). A species was considered as taxonomically resolved if its members showed at least one diagnostic difference from any other taxon in the study and formed a monophyletic clade. The highest species resolution (69%) was obtained by combining information from all three genes. The joint sequence information for rbcL and matK distinguished 54% of 286 species, while rbcL and ITS2 distinguished 63% of 285 species. Discrimination of species within Salix, which constituted 8% of the flora, was particularly problematic. Despite incomplete resolution, the barcode results revealed 22 misidentified herbarium specimens, and enabled the identification of field specimens which were otherwise too immature to identify. Although seven cases of ITS2 paralogy were noted in the families Cyperaceae, Juncaceae and Juncaginaceae, this intergenic spacer played an important role in resolving congeneric plant species at Churchill. 
Conclusions Our results provide a fast and cost-effective solution for creating a comprehensive, effective DNA barcode reference library for a local flora. PMID:23190419
Identification of the vascular plants of Churchill, Manitoba, using a DNA barcode library.
Kuzmina, Maria L; Johnson, Karen L; Barron, Hannah R; Hebert, Paul Dn
2012-11-28
Because arctic plant communities are highly vulnerable to climate change, shifts in their composition require rapid, accurate identifications, often for specimens that lack diagnostic floral characters. The present study examines the role that DNA barcoding can play in aiding floristic evaluations in the arctic by testing the effectiveness of the core plant barcode regions (rbcL, matK) and a supplemental ribosomal DNA (ITS2) marker for a well-studied flora near Churchill, Manitoba. This investigation examined 900 specimens representing 312 of the 354 species of vascular plants known from Churchill. Sequencing success was high for rbcL: 95% for fresh specimens and 85% for herbarium samples (mean age 20 years). ITS2 worked equally well for the fresh and herbarium material (89% and 88%). However, sequencing success was lower for matK, despite two rounds of PCR amplification, which reflected less effective primer binding and sensitivity to the DNA degradation (76% of fresh, 45% of herbaria samples). A species was considered as taxonomically resolved if its members showed at least one diagnostic difference from any other taxon in the study and formed a monophyletic clade. The highest species resolution (69%) was obtained by combining information from all three genes. The joint sequence information for rbcL and matK distinguished 54% of 286 species, while rbcL and ITS2 distinguished 63% of 285 species. Discrimination of species within Salix, which constituted 8% of the flora, was particularly problematic. Despite incomplete resolution, the barcode results revealed 22 misidentified herbarium specimens, and enabled the identification of field specimens which were otherwise too immature to identify. Although seven cases of ITS2 paralogy were noted in the families Cyperaceae, Juncaceae and Juncaginaceae, this intergenic spacer played an important role in resolving congeneric plant species at Churchill. 
Our results provide a fast and cost-effective solution for creating a comprehensive, effective DNA barcode reference library for a local flora.
QR Codes in the Library: "It's Not Your Mother's Barcode!"
ERIC Educational Resources Information Center
Dobbs, Cheri
2011-01-01
Barcode scanning has become more than just fun. Now libraries and businesses are leveraging barcode technology as an innovative tool to market their products and ideas. Developed and popularized in Japan, these Quick Response (QR) or two-dimensional barcodes allow marketers to provide interactive content in an otherwise static environment. In this…
Single-cell barcoding and sequencing using droplet microfluidics.
Zilionis, Rapolas; Nainys, Juozas; Veres, Adrian; Savova, Virginia; Zemmour, David; Klein, Allon M; Mazutis, Linas
2017-01-01
Single-cell RNA sequencing has recently emerged as a powerful tool for mapping cellular heterogeneity in diseased and healthy tissues, yet high-throughput methods are needed for capturing the unbiased diversity of cells. Droplet microfluidics is among the most promising candidates for capturing and processing thousands of individual cells for whole-transcriptome or genomic analysis in a massively parallel manner with minimal reagent use. We recently established a method called inDrops, which has the capability to index >15,000 cells in an hour. A suspension of cells is first encapsulated into nanoliter droplets with hydrogel beads (HBs) bearing barcoding DNA primers. Cells are then lysed and mRNA is barcoded (indexed) by a reverse transcription (RT) reaction. Here we provide details for (i) establishing an inDrops platform (1 d); (ii) performing hydrogel bead synthesis (4 d); (iii) encapsulating and barcoding cells (1 d); and (iv) RNA-seq library preparation (2 d). inDrops is a robust and scalable platform, and it is unique in its ability to capture and profile >75% of cells in even very small samples, on a scale of thousands or tens of thousands of cells.
Raupach, Michael J.; Barco, Andrea; Steinke, Dirk; Beermann, Jan; Laakmann, Silke; Mohrbeck, Inga; Neumann, Hermann; Kihara, Terue C.; Pointner, Karin; Radulovici, Adriana; Segelken-Voigt, Alexandra; Wesse, Christina; Knebelsberger, Thomas
2015-01-01
In recent years, DNA barcoding has become a popular method of choice for molecular specimen identification. Here we present a comprehensive DNA barcode library of various crustacean taxa found in the North Sea, one of the most extensively studied marine regions of the world. Our data set includes 1,332 barcodes covering 205 species, including taxa of the Amphipoda, Copepoda, Decapoda, Isopoda, Thecostraca, and others. This dataset represents the most extensive DNA barcode library of the Crustacea in terms of species number to date. By using the Barcode of Life Data Systems (BOLD), unique BINs were identified for 198 (96.6%) of the analyzed species. Six species were characterized by two BINs (2.9%), and three BINs were found for the amphipod species Gammarus salinus Spooner, 1947 (0.4%). Intraspecific distances with values higher than 2.2% were revealed for 13 species (6.3%). Exceptionally high distances of up to 14.87% between two distinct but monophyletic clusters were found for the parasitic copepod Caligus elongatus Nordmann, 1832, supporting the results of previous studies that indicated the existence of an overlooked sea louse species. In contrast to these high distances, haplotype-sharing was observed for two decapod spider crab species, Macropodia parva Van Noort & Adema, 1985 and Macropodia rostrata (Linnaeus, 1761), underlining the need for a taxonomic revision of both species. Summarizing the results, our study confirms the application of DNA barcodes as a highly effective identification system for the analyzed marine crustaceans of the North Sea and represents an important milestone for modern biodiversity assessment studies using barcode sequences. PMID:26417993
Loudig, Olivier; Liu, Christina; Rohan, Thomas; Ben-Dov, Iddo Z
2018-05-05
Archived, clinically classified formalin-fixed paraffin-embedded (FFPE) tissues can provide nucleic acids for retrospective molecular studies of cancer development. By using non-invasive or pre-malignant lesions from patients who later develop invasive disease, gene expression analyses may help identify early molecular alterations that predispose to cancer risk. It has been well described that nucleic acids recovered from FFPE tissues have undergone severe physical damage and chemical modifications, which makes their analysis difficult and generally requires adapted assays. MicroRNAs (miRNAs), however, which represent a small class of RNA molecules spanning only ~18-24 nucleotides, have been shown to withstand long-term storage and have been successfully analyzed in FFPE samples. Here we present a 3' barcoded complementary DNA (cDNA) library preparation protocol specifically optimized for the analysis of small RNAs extracted from archived tissues, which was recently demonstrated to be robust and highly reproducible when using archived clinical specimens stored for up to 35 years. This library preparation is well adapted to the multiplex analysis of compromised/degraded material where RNA samples (up to 18) are ligated with individual 3' barcoded adapters and then pooled together for subsequent enzymatic and biochemical preparations prior to analysis. All purifications are performed by polyacrylamide gel electrophoresis (PAGE), which allows size-specific selections and enrichments of barcoded small RNA species. This cDNA library preparation is well adapted to minute RNA inputs, as a pilot polymerase chain reaction (PCR) allows determination of a specific amplification cycle to produce optimal amounts of material for next-generation sequencing (NGS). This approach was optimized for the use of degraded FFPE RNA from specimens archived for up to 35 years and provides highly reproducible NGS data.
Highly multiplexed targeted DNA sequencing from single nuclei.
Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E
2016-02-01
Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
Raupach, Michael J; Hendrich, Lars; Küchler, Stefan M; Deister, Fabian; Morinière, Jérome; Gossner, Martin M
2014-01-01
During the last few years, DNA barcoding has become an efficient method for the identification of species. In the case of insects, most published DNA barcoding studies focus on species of the Ephemeroptera, Trichoptera, Hymenoptera and especially Lepidoptera. In this study we test the efficiency of DNA barcoding for true bugs (Hemiptera: Heteroptera), an ecologically and economically highly important as well as morphologically diverse insect taxon. As part of our study we analyzed DNA barcodes for 1742 specimens of 457 species, comprising 39 families of the Heteroptera. We found low nucleotide distances with a minimum pairwise K2P distance <2.2% within 21 species pairs (39 species). For ten of these species pairs (18 species), minimum pairwise distances were zero. In contrast to this, deep intraspecific sequence divergences with maximum pairwise distances >2.2% were detected for 16 traditionally recognized and valid species. With a successful identification rate of 91.5% (418 species) our study emphasizes the use of DNA barcodes for the identification of true bugs and represents an important step in building up a comprehensive barcode library for true bugs in Germany and Central Europe as well. Our study also highlights the urgent necessity of taxonomic revisions for various taxa of the Heteroptera, with a special focus on various species of the Miridae. In this context we found evidence for on-going hybridization events within various taxonomically challenging genera (e.g. Nabis Latreille, 1802 (Nabidae), Lygus Hahn, 1833 (Miridae), Phytocoris Fallén, 1814 (Miridae)) as well as the putative existence of cryptic species (e.g. Aneurus avenius (Duffour, 1833) (Aradidae) or Orius niger (Wolff, 1811) (Anthocoridae)).
Raupach, Michael J.; Hendrich, Lars; Küchler, Stefan M.; Deister, Fabian; Morinière, Jérome; Gossner, Martin M.
2014-01-01
During the last few years, DNA barcoding has become an efficient method for the identification of species. In the case of insects, most published DNA barcoding studies focus on species of the Ephemeroptera, Trichoptera, Hymenoptera and especially Lepidoptera. In this study we test the efficiency of DNA barcoding for true bugs (Hemiptera: Heteroptera), an ecologically and economically highly important as well as morphologically diverse insect taxon. As part of our study we analyzed DNA barcodes for 1742 specimens of 457 species, comprising 39 families of the Heteroptera. We found low nucleotide distances with a minimum pairwise K2P distance <2.2% within 21 species pairs (39 species). For ten of these species pairs (18 species), minimum pairwise distances were zero. In contrast to this, deep intraspecific sequence divergences with maximum pairwise distances >2.2% were detected for 16 traditionally recognized and valid species. With a successful identification rate of 91.5% (418 species) our study emphasizes the use of DNA barcodes for the identification of true bugs and represents an important step in building up a comprehensive barcode library for true bugs in Germany and Central Europe as well. Our study also highlights the urgent necessity of taxonomic revisions for various taxa of the Heteroptera, with a special focus on various species of the Miridae. In this context we found evidence for on-going hybridization events within various taxonomically challenging genera (e.g. Nabis Latreille, 1802 (Nabidae), Lygus Hahn, 1833 (Miridae), Phytocoris Fallén, 1814 (Miridae)) as well as the putative existence of cryptic species (e.g. Aneurus avenius (Duffour, 1833) (Aradidae) or Orius niger (Wolff, 1811) (Anthocoridae)). PMID:25203616
PCR cycles above routine numbers do not compromise high-throughput DNA barcoding results.
Vierna, J; Doña, J; Vizcaíno, A; Serrano, D; Jovani, R
2017-10-01
High-throughput DNA barcoding has become essential in ecology and evolution, but some technical questions still remain. Increasing the number of PCR cycles above the routine 20-30 cycles is a common practice when working with old-type specimens, which provide small amounts of DNA, or when facing annealing issues with the primers. However, increasing the number of cycles can raise the number of artificial mutations due to polymerase errors. In this work, we sequenced 20 COI libraries on the Illumina MiSeq platform. Libraries were prepared with 40, 45, 50, 55, and 60 PCR cycles from four individuals belonging to four species of four genera of cephalopods. We found no relationship between the number of PCR cycles and the number of mutations despite using a nonproofreading polymerase. Moreover, even when using a high number of PCR cycles, the resulting number of mutations was low enough not to be an issue in the context of high-throughput DNA barcoding (but may still remain an issue in DNA metabarcoding due to chimera formation). We conclude that the common practice of increasing the number of PCR cycles should not negatively impact the outcome of a high-throughput DNA barcoding study in terms of the occurrence of point mutations.
Decru, Eva; Moelants, Tuur; De Gelas, Koen; Vreven, Emmanuel; Verheyen, Erik; Snoeks, Jos
2016-01-01
This study evaluates the utility of DNA barcoding relative to traditional morphology-based species identifications for the fish fauna of the north-eastern Congo basin. We compared DNA sequences (COI) of 821 samples from 206 morphologically identified species. Best match, best close match and all species barcoding analyses resulted in a rather low identification success of 87.5%, 84.5% and 64.1%, respectively. The ratio 'nearest-neighbour distance/maximum intraspecific divergence' was lower than 1 for 26.1% of the samples, indicating possible taxonomic problems. In ten genera, belonging to six families, the number of species inferred from mtDNA data exceeded the number of species identified using morphological features; and in four cases indications of possible synonymy were detected. Finally, the DNA barcodes confirmed previously known identification problems within certain genera of the Clariidae, Cyprinidae and Mormyridae. Our results underscore the large number of taxonomic problems lingering in the taxonomy of the fish fauna of the Congo basin and illustrate why DNA barcodes will contribute to future efforts to compile a reliable taxonomic inventory of the Congo basin fish fauna. Therefore, the obtained barcodes were deposited in the reference barcode library of the Barcode of Life Initiative. © 2015 John Wiley & Sons Ltd.
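The ratio "nearest-neighbour distance / maximum intraspecific divergence" can be illustrated with a short sketch. It assumes a precomputed, symmetric pairwise distance matrix and one species label per sample, and it reflects one plausible per-sample reading of the ratio rather than the study's exact procedure. The function name and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence


def nn_to_intra_ratios(dist: Sequence[Sequence[float]],
                       species: List[str]) -> Dict[int, float]:
    """Per-sample ratio of nearest-neighbour to max intraspecific distance.

    Samples whose species has no other member (no intraspecific distance),
    or with no heterospecific sample to compare against, are skipped.
    """
    n = len(species)
    ratios: Dict[int, float] = {}
    for i in range(n):
        intra = [dist[i][j] for j in range(n)
                 if j != i and species[j] == species[i]]
        inter = [dist[i][j] for j in range(n) if species[j] != species[i]]
        if not intra or not inter:
            continue
        max_intra = max(intra)
        nearest = min(inter)
        # A zero intraspecific divergence gives an unbounded ratio.
        ratios[i] = float("inf") if max_intra == 0 else nearest / max_intra
    return ratios


# Toy data: species B is more divergent internally than it is from species A.
species = ["A", "A", "B", "B", "C"]
dist = [
    [0.00, 0.01, 0.10, 0.12, 0.20],
    [0.01, 0.00, 0.11, 0.10, 0.21],
    [0.10, 0.11, 0.00, 0.15, 0.30],
    [0.12, 0.10, 0.15, 0.00, 0.30],
    [0.20, 0.21, 0.30, 0.30, 0.00],
]
ratios = nn_to_intra_ratios(dist, species)
flagged = sorted(i for i, r in ratios.items() if r < 1)
print(flagged)  # samples whose nearest heterospecific is closer than a conspecific
```

A ratio below 1 means a sample sits closer to another species than to its most distant conspecific, which is the pattern the abstract treats as a sign of possible taxonomic problems.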
Using DNA barcodes to identify a bird involved in a birdstrike at a Chinese airport.
Yang, Rong; Wu, Xiaobing; Yan, Peng; Li, Xiaoqiang
2010-10-01
One day at dusk in August, 200X, an airplane was struck by a bird at a Chinese airport (M Airport). After a careful check, some blades of the plane's engine were found to be deformed, and a few feathers and some bloodstains were found in the air intake of the engine. To determine which bird species was involved in the birdstrike, we first extracted DNA from the bloodstains; second, the DNA barcode (a portion of the COI gene) of the unknown species was amplified by PCR; third, sequence divergences (K2P differences) of the DNA barcode between the unknown species and a library of 59 common bird species distributed in the airport area were analyzed. Furthermore, a neighbor-joining (NJ) tree based on COI barcodes was created to provide a graphic representation of sequence divergences among the species and to confirm the identification. The results showed that a red-rumped swallow (Hirundo daurica) was involved in the birdstrike incident. Some suggestions for avoiding birdstrikes caused by red-rumped swallows were given to the administrative department of M Airport to ensure flight safety.
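The K2P (Kimura two-parameter) divergence referred to above has a standard closed form, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The sketch below is a minimal illustration of that formula, not the authors' pipeline; the function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A query barcode would then be compared against each reference sequence in the library, with the smallest distance pointing to the candidate species.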
Serrao, Natasha R; Steinke, Dirk; Hanner, Robert H
2014-01-01
Detecting and documenting the occurrence of invasive species outside their native range requires tools to support their identification. This can be challenging for taxa with diverse life stages and/or problematic or unresolved morphological taxonomies. DNA barcoding provides a potent method for identifying invasive species, as it allows for species identification at all life stages, including fragmentary remains. It also provides an efficient interim taxonomic framework for quantifying cryptic genetic diversity by parsing barcode sequences into discontinuous haplogroup clusters (typical of reproductively isolated species) and labelling them with unique alphanumeric identifiers. Snakehead fishes are a diverse group of opportunistic predators endemic to Asia and Africa that may potentially pose significant threats as aquatic invasive species. At least three snakehead species (Channa argus, C. maculata, and C. marulius) are thought to have entered North America through the aquarium and live-food fish markets, and have established populations, yet their origins remain unclear. The objectives of this study were to assemble a library of DNA barcode sequences derived from expert identified reference specimens in order to determine the identity and aid invasion pathway analysis of the non-indigenous species found in North America using DNA barcodes. Sequences were obtained from 121 tissue samples representing 25 species and combined with public records from GenBank for a total of 36 putative species, which then partitioned into 49 discrete haplogroups. Multiple divergent clusters were observed within C. gachua, C. marulius, C. punctata and C. striata suggesting the potential presence of cryptic species diversity within these lineages. Our findings demonstrate that DNA barcoding is a valuable tool for species identification in challenging and under-studied taxonomic groups such as snakeheads, and provides a useful framework for inferring invasion pathway analysis.
Linking adults and immatures of South African marine fishes.
Steinke, Dirk; Connell, Allan D; Hebert, Paul D N
2016-11-01
The early life-history stages of fishes are poorly known, impeding acquisition of the identifications needed to monitor larval recruitment and year-class strength. A comprehensive database of COI sequences, linked to authoritatively identified voucher specimens, promises to change this situation, representing a significant advance for fisheries science. Barcode records were obtained from 2526 early larvae and pelagic eggs of fishes collected on the inshore shelf within 5 km of the KwaZulu-Natal coast, about 50 km south of Durban, South Africa. Barcodes were also obtained from 3215 adults, representing 946 South African fish species. Using the COI reference library on BOLD based on adults, 89% of the immature fishes could be identified to a species level; they represented 450 species. Most of the uncertain sequences could be assigned to a genus, family, or order; only 92 specimens (4%) were unassigned. Accumulation curves based on inference of phylogenetic diversity indicate near-completeness of the collecting effort. The entire set of adult and larval fishes included 1006 species, representing 43% of all fish species known from South African waters. However, this total included 189 species not previously recorded from this region. The fact that almost 90% of the immatures gained a species identification demonstrates the power and completeness of the DNA barcode reference library for fishes generated during the 10 years of FishBOL.
Jiao, Lichao; Yu, Min; Wiedenhoeft, Alex C; He, Tuo; Li, Jianing; Liu, Bo; Jiang, Xiaomei; Yin, Yafang
2018-01-31
DNA barcoding has been proposed as a useful tool for forensic wood identification and development of a reliable DNA reference library is an essential first step. Xylaria (wood collections) are potentially enormous data repositories if DNA information could be extracted from wood specimens. In this study, 31 xylarium wood specimens and 8 leaf specimens of six important commercial species of Pterocarpus were selected to investigate the reliability of DNA barcodes for authentication at the species level and to determine the feasibility of building wood DNA barcode reference libraries from xylarium specimens. Four DNA barcodes (ITS2, matK, ndhF-rpl32 and rbcL) and their combination were tested to evaluate their discrimination ability for Pterocarpus species with both TaxonDNA and tree-based analytical methods. The results indicated that the combination barcode of matK + ndhF-rpl32 + ITS2 yielded the best discrimination for the Pterocarpus species studied. The mini-barcode ndhF-rpl32 (167-173 bps) performed well distinguishing P. santalinus from its wood anatomically inseparable species P. tinctorius. Results from this study verified not only the feasibility of building DNA barcode libraries using xylarium wood specimens, but the importance of using wood rather than leaves as the source tissue, when wood is the botanical material to be identified.
Wilson, J-J; Sing, K-W; Halim, M R A; Ramli, R; Hashim, R; Sofian-Azirun, M
2014-02-19
Bats are important flagship species for biodiversity research; however, diversity in Southeast Asia is considerably underestimated in the current checklists and field guides. Incorporation of DNA barcoding into surveys has revealed numerous species-level taxa overlooked by conventional methods. Inclusion of these taxa in inventories provides a more informative record of diversity, but is problematic as these species lack formal description. We investigated how frequently documented, but undescribed, bat taxa are encountered in Peninsular Malaysia. We discuss whether a barcode library provides a means of recognizing and recording these taxa across biodiversity inventories. Tissue was sampled from bats trapped at Pasir Raja, Dungun Terengganu, Peninsular Malaysia. The DNA was extracted and the COI barcode region amplified and sequenced. We identified 9 species-level taxa within our samples, based on analysis of the DNA barcodes. Six specimens matched to four previously documented taxa considered candidate species but currently lacking formal taxonomic status. This study confirms the high diversity of bats within Peninsular Malaysia (9 species in 13 samples) and demonstrates how DNA barcoding allows for inventory and documentation of known taxa lacking formal taxonomic status.
Magic Pools: Parallel Assessment of Transposon Delivery Vectors in Bacteria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Hualan; Price, Morgan N.; Waters, Robert Jordan; ...
2018-01-16
Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach for discovering the functions of bacterial genes. However, the development of a suitable TnSeq strategy for a given bacterium can be costly and time-consuming. To meet this challenge, we describe a part-based strategy for constructing libraries of hundreds of transposon delivery vectors, which we term “magic pools.” Within a magic pool, each transposon vector has a different combination of upstream sequences (promoters and ribosome binding sites) and antibiotic resistance markers as well as a random DNA barcode sequence, which allows the tracking of each vector during mutagenesis experiments. To identify an efficient vector for a given bacterium, we mutagenize it with a magic pool and sequence the resulting insertions; we then use this efficient vector to generate a large mutant library. We used the magic pool strategy to construct transposon mutant libraries in five genera of bacteria, including three genera of the phylum Bacteroidetes. IMPORTANCE: Molecular genetics is indispensable for interrogating the physiology of bacteria. However, the development of a functional genetic system for any given bacterium can be time-consuming. Here, we present a streamlined approach for identifying an effective transposon mutagenesis system for a new bacterium. Our strategy first involves the construction of hundreds of different transposon vector variants, which we term a “magic pool.” The efficacy of each vector in a magic pool is monitored in parallel using a unique DNA barcode that is introduced into each vector design. Using archived DNA “parts,” we next reassemble an effective vector for making a whole-genome transposon mutant library that is suitable for large-scale interrogation of gene function using competitive growth assays. Here, we demonstrate the utility of the magic pool system to make mutant libraries in five genera of bacteria.
Design of 240,000 orthogonal 25mer DNA barcode probes.
Xu, Qikai; Schlabach, Michael R; Hannon, Gregory J; Elledge, Stephen J
2009-02-17
DNA barcodes linked to genetic features greatly facilitate screening these features in pooled formats using microarray hybridization, and new tools are needed to design large sets of barcodes to allow construction of large barcoded mammalian libraries such as shRNA libraries. Here we report a framework for designing large sets of orthogonal barcode probes. We demonstrate the utility of this framework by designing 240,000 barcode probes and testing their performance by hybridization. From the test hybridizations, we also discovered new probe design rules that significantly reduce cross-hybridization after their introduction into the framework of the algorithm. These rules should improve the performance of DNA microarray probe designs for many applications. PMID:19171886
Multiplexed microsatellite recovery using massively parallel sequencing
Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.
2011-01-01
Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
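Screening reads for di- or trinucleotide microsatellites, as described above, can be sketched with a regular expression that looks for a 2- or 3-base motif repeated in tandem. This is an illustrative assumption, not the authors' software: the function name `find_microsatellites`, the default threshold of five repeats, and the exclusion of single-base runs are all choices made for this example.

```python
import re


def find_microsatellites(read, min_repeats=5):
    """Return (start, motif, repeat_count) for di-/trinucleotide tandem repeats."""
    # A 2- or 3-base motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(read.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a run of one base is a homopolymer, not a microsatellite
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Hypothetical read containing six copies of the dinucleotide motif AC.
example_hits = find_microsatellites("TTG" + "AC" * 6 + "GGT")
```

In practice the flanking sequence on both sides of each hit also matters, because primers for a marker have to be placed there; that step is outside this sketch.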
USDA-ARS's Scientific Manuscript database
Several biosafety level (BSL)-3/4 pathogens are high consequence, single-stranded RNA viruses and their genomes, when introduced into permissive cells, are infectious. Moreover many of these viruses are Select Agents (SAs), and their genomes are also considered SAs. For this reason cDNAs and/or th...
Linking Project Procedure Manual for Using Dumb-Barcode Linking on GEAC.
ERIC Educational Resources Information Center
Condron, Lyn
This procedure manual is designed to assist cataloging staff members at a university library through the 10-step process of barcoding and linking books classified by the Library of Congress system to the library's GEAC online computer system. A brief introduction provides background information on the project. The procedures involved in each…
Olivar, Jay Edneil C; Alaba, Joanner Paulus Erik P; Atienza, Jose Francisco M; Tan, Jerick Jeffrey S; Umali, Maximo T; Alejandro, Grecebio Jonathan D
2016-05-01
The majority of the population in the Philippines relies on herbal products as its primary source of healthcare. After the recognition of Vitex negundo L. (lagundi) as an important and effective alternative medicine for cough, sore throat, asthma and fever by the Philippine Department of Health (DOH), there was an increase in the production of lagundi-based herbal products in the form of teas, capsules and syrups. The efficacy of these products relies heavily on the use of authentic plant material, and to this day no standard protocol has been established to authenticate plant materials. DNA barcoding offers a quick and reliable species authentication tool, but its application to plant material has been less successful due to (1) lack of standard DNA barcoding loci in plants and (2) poor DNA yield from powderised plant products. This study reports the successful application of DNA barcoding in the authentication of five V. negundo herbal products sold in the Philippines. Also, the first standard reference material (SRM) herbal library for the recognition of authentic V. negundo samples was established using 42 gene accessions of ITS, psbA-trnH and matK barcoding loci. Authentication of the herbal products utilised the SRM following the BLASTn and maximum-likelihood (ML) tree construction criteria. Barcode sequences were retrieved for ITS and psbA-trnH of all products tested and the results of the study revealed that only one out of five herbal products satisfied both the BLASTn and ML criteria and was considered to contain authentic V. negundo. The results point to an urgent need to utilise DNA barcoding in authenticating herbal products available in the Philippine market. Authentication of these products will secure consumer health by preventing the negative effects of adulteration, substitution and contamination.
2013-01-01
Background: Next-generation-sequencing (NGS) technologies combined with a classic DNA barcoding approach have enabled fast and credible measurement for biodiversity of mixed environmental samples. However, the PCR amplification involved in nearly all existing NGS protocols inevitably introduces taxonomic biases. In the present study, we developed new Illumina pipelines without PCR amplifications to analyze terrestrial arthropod communities. Results: Mitochondrial enrichment directly followed by Illumina shotgun sequencing, at an ultra-high sequence volume, enabled the recovery of Cytochrome c Oxidase subunit 1 (COI) barcode sequences, which allowed for the estimation of species composition at high fidelity for a terrestrial insect community. With 15.5 Gbp Illumina data, approximately 97% and 92% were detected out of the 37 input Operational Taxonomic Units (OTUs), whether the reference barcode library was used or not, respectively, while only 1 novel OTU was found for the latter. Additionally, relatively strong correlation between the sequencing volume and the total biomass was observed for species from the bulk sample, suggesting a potential solution to reveal relative abundance. Conclusions: The ability of the new Illumina PCR-free pipeline for DNA metabarcoding to detect small arthropod specimens and its tendency to avoid most, if not all, false positives suggest its great potential in biodiversity-related surveillance, such as in biomonitoring programs. However, further improvement for mitochondrial enrichment is likely needed for the application of the new pipeline in analyzing arthropod communities at higher diversity. PMID:23587339
Conte-Grand, Cecilia; Britz, Ralf; Dahanukar, Neelesh; Raghavan, Rajeev; Pethiyagoda, Rohan; Tan, Heok Hui; Hadiaty, Renny K.; Yaakob, Norsham S.
2017-01-01
Snakehead fishes of the family Channidae are predatory freshwater teleosts from Africa and Asia comprising 38 valid species. Snakeheads are important food fishes (aquaculture, live food trade) and have been introduced widely with several species becoming highly invasive. A channid barcode library was recently assembled by Serrao and co-workers to better detect and identify potential and established invasive snakehead species outside their native range. Comparing our own recent phylogenetic results of this taxonomically confusing group with those previously reported revealed several inconsistencies that prompted us to expand and improve on previous studies. By generating 343 novel snakehead coxI sequences and combining them with an additional 434 coxI sequences from GenBank we highlight several problems with previous efforts towards the assembly of a snakehead reference barcode library. We found that 16.3% of the channid coxI sequences deposited in GenBank are based on misidentifications. With the inclusion of our own data we were, however, able to solve these cases of perpetuated taxonomic confusion. Different species delimitation approaches we employed (BIN, GMYC, and PTP) were congruent in suggesting a potentially much higher species diversity within snakeheads than currently recognized. In total, 90 BINs were recovered and within a total of 15 currently recognized species multiple BINs were identified. This higher species diversity is mostly due to either the incorporation of undescribed, narrow range, endemics from the Eastern Himalaya biodiversity hotspot or the incorporation of several widespread species characterized by deep genetic splits between geographically well-defined lineages. In the latter case, over-lumping in the past has deflated the actual species numbers. Further integrative approaches are clearly needed for providing a better taxonomic understanding of snakehead diversity, new species descriptions and taxonomic revisions of the group. 
PMID:28931084
Loo, Jacky F C; Lau, P M; Ho, H P; Kong, S K
2013-10-15
Based on a recently reported ultra-sensitive bio-barcode (BBC) assay, we have developed an aptamer-based bio-barcode (ABC) alternative to detect the cell death marker cytochrome-c (Cyto-c) and applied it to screen anti-cancer drugs. An aptamer is a short single-stranded DNA selected from a synthetic DNA library by virtue of its high binding affinity and specificity to its target, based on the unique 3D structure that its nucleotide sequence adopts after folding. In the BBC assay, an antigen (Ag) in analytes is captured by a micro-magnetic particle (MMP) coated with capturing antibodies (Abs). Gold nanoparticles (NPs) with another recognition Ab against the same target and hundreds of identical DNA molecules of known sequence are subsequently added to allow the formation of sandwich structures ([MMP-Ab1]-Ag-[Ab2-NP-DNA]). After isolating the sandwiches by a magnetic field, the DNAs hybridized to their complementary DNAs covalently bound on the NPs are released from the sandwiches after heating. Acting as an Ag identification tag, these bio-barcode DNAs with known DNA sequence are then amplified by polymerase chain reaction (PCR) and detected by fluorescence. In our ABC assay, we employed a Cyto-c-specific aptamer to substitute for both the recognition Ab and barcode DNAs on the NPs in the BBC assay, and a novel isothermal recombinase polymerase amplification to replace the time-consuming PCR. The detection limit of our ABC assay for Cyto-c was found to be 10 ng/mL and this new assay can be completed within 3 h. Several potential anti-cancer drugs have been tested in vitro for their efficacy in killing liver cancer cells with or without multi-drug resistance. © 2013 Elsevier B.V. All rights reserved.
Critical factors for assembling a high volume of DNA barcodes
Hajibabaei, Mehrdad; deWaard, Jeremy R; Ivanova, Natalia V; Ratnasingham, Sujeevan; Dooh, Robert T; Kirk, Stephanie L; Mackie, Paula M; Hebert, Paul D.N
2005-01-01
Large-scale DNA barcoding projects are now moving toward activation while the creation of a comprehensive barcode library for eukaryotes will ultimately require the acquisition of some 100 million barcodes. To satisfy this need, analytical facilities must adopt protocols that can support the rapid, cost-effective assembly of barcodes. In this paper we discuss the prospects for establishing high volume DNA barcoding facilities by evaluating key steps in the analytical chain from specimens to barcodes. Alliances with members of the taxonomic community represent the most effective strategy for provisioning the analytical chain with specimens. The optimal protocols for DNA extraction and subsequent PCR amplification of the barcode region depend strongly on their condition, but production targets of 100K barcode records per year are now feasible for facilities working with compliant specimens. The analysis of museum collections is currently challenging, but PCR cocktails that combine polymerases with repair enzyme(s) promise future success. Barcode analysis is already a cost-effective option for species identification in some situations and this will increasingly be the case as reference libraries are assembled and analytical protocols are simplified. PMID:16214753
DNA barcoding of human-biting black flies (Diptera: Simuliidae) in Thailand.
Pramual, Pairot; Thaijarern, Jiraporn; Wongpakam, Komgrit
2016-12-01
Black flies (Diptera: Simuliidae) are important insect vectors and pests of humans and animals. Accurate identification, therefore, is important for control and management. In this study, we used mitochondrial cytochrome oxidase I (COI) barcoding sequences to test the efficiency of species identification for the human-biting black flies in Thailand. We used human-biting specimens because they enabled us to link information with previous studies involving the immature stages. Three black fly taxa, Simulium nodosum, S. nigrogilvum and S. doipuiense complex, were collected. The S. doipuiense complex was confirmed for the first time as having human-biting habits. The COI sequences revealed considerable genetic diversity in all three species. Comparisons to a COI sequence library of black flies in Thailand and in a public database indicated a high efficiency for specimen identification for S. nodosum and S. nigrogilvum, but this method was not successful for the S. doipuiense complex. Phylogenetic analyses revealed two divergent lineages in the S. doipuiense complex. Human-biting specimens formed a separate clade from other members of this complex. The results are consistent with the Barcoding Index Number System (BINs) analysis that found six BINs in the S. doipuiense complex. Further taxonomic work is needed to clarify the species status of these human-biting specimens. Copyright © 2016 Elsevier B.V. All rights reserved.
The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.
Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A
2016-10-11
Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.
Analyzing Mosquito (Diptera: Culicidae) Diversity in Pakistan by DNA Barcoding
Ashfaq, Muhammad; Hebert, Paul D. N.; Mirza, Jawwad H.; Khan, Arif M.; Zafar, Yusuf; Mirza, M. Sajjad
2014-01-01
Background: Although mosquitoes are important disease vectors, their biodiversity in Pakistan is poorly known. Recent epidemics of dengue fever have revealed the need for more detailed understanding of the diversity and distributions of mosquito species in this region. DNA barcoding improves the accuracy of mosquito inventories because morphological differences between many species are subtle, leading to misidentifications. Methodology/Principal Findings: Sequence variation in the barcode region of the mitochondrial COI gene was used to identify mosquito species, reveal genetic diversity, and map the distribution of the dengue-vector species in Pakistan. Analysis of 1684 mosquitoes from 491 sites in Punjab and Khyber Pakhtunkhwa during 2010–2013 revealed 32 species with the assemblage dominated by Culex quinquefasciatus (61% of the collection). The genus Aedes (Stegomyia) comprised 15% of the specimens, and was represented by six taxa with the two dengue vector species, Ae. albopictus and Ae. aegypti, dominant and broadly distributed. Anopheles made up another 6% of the catch with An. subpictus dominating. Barcode sequence divergence in conspecific specimens ranged from 0 to 2.4%, while congeneric species showed 2.3–17.8% divergence. A global haplotype analysis of disease-vectors showed the presence of multiple haplotypes, although a single haplotype of each dengue-vector species was dominant in most countries. Geographic distribution of Ae. aegypti and Ae. albopictus showed the latter species was dominant and found in both rural and urban environments. Conclusions: As the first DNA-based analysis of mosquitoes in Pakistan, this study has begun the construction of a barcode reference library for the mosquitoes of this region. Levels of genetic diversity varied among species. Because of its capacity to differentiate species, even those with subtle morphological differences, DNA barcoding aids accurate tracking of vector populations. PMID:24827460
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
Yang, Fang; Lei, Yingying; Zhou, Meiling; Yao, Qili; Han, Yichao; Wu, Xiang; Zhong, Wanshun; Zhu, Chenghang; Xu, Weize; Tao, Ran; Chen, Xi; Lin, Da; Rahman, Khaista; Tyagi, Rohit; Habib, Zeshan; Xiao, Shaobo; Wang, Dang; Yu, Yang; Chen, Huanchun; Fu, Zhenfang; Cao, Gang
2018-02-16
Protein-protein interaction (PPI) networks maintain the proper function of all organisms. Simple high-throughput technologies are desperately needed to delineate the landscape of PPI networks. While recent state-of-the-art yeast two-hybrid (Y2H) systems have improved screening efficiency, individual colony isolation, library preparation arrays, gene barcoding or massive sequencing is still required. Here, we developed a recombination-based 'library vs library' Y2H system (RLL-Y2H), by which multi-library screening can be accomplished in a single pool without any individual treatment. This system is based on the phiC31 integrase-mediated integration between bait and prey plasmids. The integrated fragments were digested by MmeI and subjected to deep sequencing to decode the interaction matrix. We applied this system to decipher the trans-kingdom interactome between Mycobacterium tuberculosis and host cells and further identified Rv2427c as interfering with phagosome-lysosome fusion. This concept can also be applied to other systems to screen protein-RNA and protein-DNA interactions and delineate the signaling landscape in cells.
Managing Archival Collections in an Automated Environment: The Joys of Barcoding
ERIC Educational Resources Information Center
Hamburger, Susan; Charles, Jane Veronica
2006-01-01
In a desire for automated collection control, archival repositories are adopting barcoding from their library and records center colleagues. This article discusses the planning, design, and implementation phases of barcoding. The authors focus on reasons for barcoding, security benefits, in-room circulation tracking, potential for gathering…
Long-range barcode labeling-sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Feng; Zhang, Tao; Singh, Kanwar K.
Methods for sequencing single large DNA molecules by clonal multiple displacement amplification using barcoded primers. Sequences are binned based on barcode sequences and sequenced using a microdroplet-based method for sequencing large polynucleotide templates to enable assembly of haplotype-resolved complex genomes and metagenomes.
Virgilio, Massimiliano; Jordaens, Kurt; Breman, Floris C.; Backeljau, Thierry; De Meyer, Marc
2012-01-01
We propose a general working strategy to deal with incomplete reference libraries in the DNA barcoding identification of species. Considering that (1) queries with a large genetic distance with their best DNA barcode match are more likely to be misidentified and (2) imposing a distance threshold profitably reduces identification errors, we modelled relationships between identification performances and distance thresholds in four DNA barcode libraries of Diptera (n = 4270), Lepidoptera (n = 7577), Hymenoptera (n = 2067) and Tephritidae (n = 602 DNA barcodes). In all cases, more restrictive distance thresholds produced a gradual increase in the proportion of true negatives, a gradual decrease of false positives and more abrupt variations in the proportions of true positives and false negatives. More restrictive distance thresholds improved precision, yet negatively affected accuracy due to the higher proportions of queries discarded (viz. having a distance query-best match above the threshold). Using a simple linear regression we calculated an ad hoc distance threshold for the tephritid library producing an estimated relative identification error <0.05. According to the expectations, when we used this threshold for the identification of 188 independently collected tephritids, less than 5% of queries with a distance query-best match below the threshold were misidentified. Ad hoc thresholds can be calculated for each particular reference library of DNA barcodes and should be used as cut-off mark defining whether we can proceed identifying the query with a known estimated error probability (e.g. 5%) or whether we should discard the query and consider alternative/complementary identification methods. PMID:22359600
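The cut-off rule described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `identify`, the dictionary of per-species distances, and the threshold value in the usage example are hypothetical and are not taken from the study, which derived its ad hoc threshold by linear regression.

```python
from typing import Dict, Optional


def identify(distances: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the best-matching species, or None if the query is discarded.

    `distances` maps each reference species to the genetic distance between
    the query and its closest barcode for that species (hypothetical input).
    """
    if not distances:
        return None
    # Best match = reference species with the smallest distance to the query.
    best_species = min(distances, key=distances.get)
    # Discard the query when the query-to-best-match distance is above the
    # threshold, as in the abstract's description of discarded queries.
    if distances[best_species] > threshold:
        return None
    return best_species
```

For example, `identify({"sp. A": 0.01, "sp. B": 0.05}, threshold=0.02)` returns `"sp. A"`, while a query whose closest match lies at distance 0.03 is discarded and would need an alternative identification method.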
TagDust2: a generic method to extract reads from sequencing data.
Lassmann, Timo
2015-01-28
Arguably the most basic step in the analysis of next generation sequencing data (NGS) involves the extraction of mappable reads from the raw reads produced by sequencing instruments. The presence of barcodes, adaptors and artifacts subject to sequencing errors makes this step non-trivial. Here I present TagDust2, a generic approach utilizing a library of hidden Markov models (HMM) to accurately extract reads from a wide array of possible read architectures. TagDust2 extracts more reads of higher quality compared to other approaches. Processing of multiplexed single, paired end and libraries containing unique molecular identifiers is fully supported. Two additional post processing steps are included to exclude known contaminants and filter out low complexity sequences. Finally, TagDust2 can automatically detect the library type of sequenced data from a predefined selection. Taken together TagDust2 is a feature rich, flexible and adaptive solution to go from raw to mappable NGS reads in a single step. The ability to recognize and record the contents of raw reads will help to automate and demystify the initial, and often poorly documented, steps in NGS data analysis pipelines. TagDust2 is freely available at: http://tagdust.sourceforge.net .
DNA barcodes from four loci provide poor resolution of taxonomic groups in the genus Crataegus.
Zarrei, Mehdi; Talent, Nadia; Kuzmina, Maria; Lee, Jeanette; Lund, Jensen; Shipley, Paul R; Stefanović, Saša; Dickinson, Timothy A
2015-04-28
DNA barcodes can facilitate identification of organisms especially when morphological characters are limited or unobservable. To what extent this potential is realized in specific groups of plants remains to be determined. Libraries of barcode sequences from well-studied authoritatively identified plants represented by herbarium voucher specimens are needed in order for DNA barcodes to serve their intended purpose, where this is possible, and to understand the reasons behind their failure to do so, when this occurs. We evaluated four loci, widely regarded as universal DNA barcodes for plants, for their utility in hawthorn species identification. Three plastid regions, matK, rbcLa and psbA-trnH, and the internal transcribed spacer 2 (ITS2) of nuclear ribosomal DNA discriminate only some of the species of Crataegus that can be recognized on the basis of their morphology etc. This is, in part, because in Rosaceae tribe Maleae most individual plastid loci yield relatively little taxonomic resolution and, in part, because the effects of allopolyploidization have not been eliminated by concerted evolution of the ITS regions. Although individual plastid markers provided generally poor resolution of taxonomic groups in Crataegus, a few species were notable exceptions. In contrast, analyses of concatenated sequences of the 3 plastid barcode loci plus 11 additional plastid loci gave a well-resolved maternal phylogeny. In the ITS2 tree, different individuals of some species formed groups with taxonomically unrelated species. This is a sign of lineage sorting due to incomplete concerted evolution in ITS2. Incongruence between the ITS2 and plastid trees is best explained by hybridization between different lineages within the genus. In aggregate, limited between-species variation in plastid loci, hybridization and a lack of concerted evolution in ITS2 all combine to limit the utility of standard barcoding markers in Crataegus. 
These results have implications for authentication of hawthorn materials in natural health products. Published by Oxford University Press on behalf of the Annals of Botany Company. PMID:25926325
Geiger, M F; Herder, F; Monaghan, M T; Almada, V; Barbieri, R; Bariche, M; Berrebi, P; Bohlen, J; Casal-Lopez, M; Delmastro, G B; Denys, G P J; Dettai, A; Doadrio, I; Kalogianni, E; Kärst, H; Kottelat, M; Kovačić, M; Laporte, M; Lorenzoni, M; Marčić, Z; Özuluğ, M; Perdices, A; Perea, S; Persat, H; Porcelotti, S; Puzzi, C; Robalo, J; Šanda, R; Schneider, M; Šlechtová, V; Stoumboudi, M; Walter, S; Freyhof, J
2014-11-01
Incomplete knowledge of biodiversity remains a stumbling block for conservation planning and even occurs within globally important Biodiversity Hotspots (BH). Although technical advances have boosted the power of molecular biodiversity assessments, the link between DNA sequences and species and the analytics to discriminate entities remain crucial. Here, we present an analysis of the first DNA barcode library for the freshwater fish fauna of the Mediterranean BH (526 spp.), with virtually complete species coverage (498 spp., 98% extant species). In order to build an identification system supporting conservation, we compared species determination by taxonomists to multiple clustering analyses of DNA barcodes for 3165 specimens. The congruence of barcode clusters with morphological determination was strongly dependent on the method of cluster delineation, but was highest with the general mixed Yule-coalescent (GMYC) model-based approach (83% of all species recovered as GMYC entity). Overall, genetic morphological discontinuities suggest the existence of up to 64 previously unrecognized candidate species. We found reduced identification accuracy when using the entire DNA-barcode database, compared with analyses on databases for individual river catchments. This scale effect has important implications for barcoding assessments and suggests that fairly simple identification pipelines provide sufficient resolution in local applications. We calculated Evolutionarily Distinct and Globally Endangered scores in order to identify candidate species for conservation priority and argue that the evolutionary content of barcode data can be used to detect priority species for future IUCN assessments. We show that large-scale barcoding inventories of complex biotas are feasible and contribute directly to the evaluation of conservation priorities. © 2014 John Wiley & Sons Ltd.
DNA barcoding of vouchered xylarium wood specimens of nine endangered Dalbergia species.
Yu, Min; Jiao, Lichao; Guo, Juan; Wiedenhoeft, Alex C; He, Tuo; Jiang, Xiaomei; Yin, Yafang
2017-12-01
ITS2+trnH-psbA was the best combination of DNA barcodes to resolve the Dalbergia wood species studied. We demonstrate the feasibility of building a DNA barcode reference database using xylarium wood specimens. The increase in illegal logging and timber trade of CITES-listed tropical species necessitates the development of unambiguous identification methods at the species level. For these methods to be fully functional and deployable for law enforcement, they must work using wood or wood products. DNA barcoding of wood has been promoted as a promising tool for species identification; however, the main barrier to extensive application of DNA barcoding to wood is the lack of a comprehensive and reliable DNA reference library of barcodes from wood. In this study, xylarium wood specimens of nine Dalbergia species were selected from the Wood Collection of the Chinese Academy of Forestry and DNA was then extracted from them for further PCR amplification of eight potential DNA barcode sequences (ITS2, matK, trnL, trnH-psbA, trnV-trnM1, trnV-trnM2, trnC-petN, and trnS-trnG). The barcodes were tested singly and in combination for species-level discrimination ability by tree-based [neighbor-joining (NJ)] and distance-based (TaxonDNA) methods. We found that the discrimination ability of DNA barcodes in combination was higher than that of any single DNA marker among the Dalbergia species studied, with the best two-marker combination of ITS2+trnH-psbA analyzed with NJ trees performing the best (100% accuracy). These barcodes are relatively short regions (<350 bp) and amplification reactions were performed with high success (≥90%) using wood as the source material, a necessary factor to apply DNA barcoding to timber trade. The present results demonstrate the feasibility of using vouchered xylarium specimens to build DNA barcoding reference databases.
DNA barcode goes two-dimensions: DNA QR code web server.
Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
Trujillano, Daniel; Weiss, Maximilian E R; Schneider, Juliane; Köster, Julia; Papachristos, Efstathios B; Saviouk, Viatcheslav; Zakharkina, Tetyana; Nahavandi, Nahid; Kovacevic, Lejla; Rolfs, Arndt
2015-03-01
Genetic testing for hereditary breast and/or ovarian cancer mostly relies on laborious molecular tools that use Sanger sequencing to scan for mutations in the BRCA1 and BRCA2 genes. We explored a more efficient genetic screening strategy based on next-generation sequencing of the BRCA1 and BRCA2 genes in 210 hereditary breast and/or ovarian cancer patients. We first validated this approach in a cohort of 115 samples with previously known BRCA1 and BRCA2 mutations and polymorphisms. Genomic DNA was amplified using the Ion AmpliSeq BRCA1 and BRCA2 panel. The DNA Libraries were pooled, barcoded, and sequenced using an Ion Torrent Personal Genome Machine sequencer. The combination of different robust bioinformatics tools allowed detection of all previously known pathogenic mutations and polymorphisms in the 115 samples, without detecting spurious pathogenic calls. We then used the same assay in a discovery cohort of 95 uncharacterized hereditary breast and/or ovarian cancer patients for BRCA1 and BRCA2. In addition, we describe the allelic frequencies across 210 hereditary breast and/or ovarian cancer patients of 74 unique definitely and likely pathogenic and uncertain BRCA1 and BRCA2 variants, some of which have not been previously annotated in the public databases. Targeted next-generation sequencing is ready to substitute classic molecular methods to perform genetic testing on the BRCA1 and BRCA2 genes and provides a greater opportunity for more comprehensive testing of at-risk patients. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Havemann, Nadine; Gossner, Martin M.; Hendrich, Lars; Morinière, Jèrôme; Niedringhaus, Rolf; Schäfer, Peter
2018-01-01
With about 5,000 species worldwide, the Heteroptera or true bugs are the most diverse taxon among the hemimetabolous insects in aquatic and semi-aquatic ecosystems. Species may be found in almost every freshwater environment and have very specific habitat requirements, making them excellent bioindicator organisms for water quality. However, a correct determination by morphology is challenging in many species groups due to high morphological variability and polymorphisms within, but low variability between species. Furthermore, it is very difficult or even impossible to identify the immature life stages or females of some species, e.g., of the corixid genus Sigara. In this study we tested the effectiveness of a DNA barcode library to discriminate species of the Gerromorpha and Nepomorpha of Germany. We analyzed about 700 specimens of 67 species, with 63 species sampled in Germany, covering more than 90% of all recorded species. Our library included various morphologically similar taxa, e.g., species within the genera Sigara and Notonecta as well as water striders of the genus Gerris. Fifty-five species (82%) were unambiguously assigned to a single Barcode Index Number (BIN) by their barcode sequences, whereas BIN sharing was observed for 10 species. Furthermore, we found monophyletic lineages for 52 analyzed species. Our data revealed interspecific K2P distances below 2.2% for 18 species. Intraspecific distances above 2.2% were shown for 11 species. We found evidence for hybridization between various corixid species (Sigara, Callicorixa), but our molecular data also revealed exceptionally high intraspecific distances as a consequence of distinct mitochondrial lineages for Cymatia coleoptrata and the pygmy backswimmer Plea minutissima. Our study clearly demonstrates the usefulness of DNA barcodes for the identification of the aquatic Heteroptera of Germany and adjacent regions. In this context, our data set represents an essential baseline for a reference library for bioassessment studies of freshwater habitats using modern high-throughput technologies in the near future. The existing data also open new questions regarding the causes of observed low inter- and high intraspecific genetic variation and furthermore highlight the necessity of taxonomic revisions for various taxa, combining both molecular and morphological data. PMID:29736329
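For readers unfamiliar with the K2P distances quoted in this abstract, the standard Kimura 2-parameter formula is d = -(1/2) ln[(1 - 2P - Q) * sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The Python sketch below computes it for two aligned sequences. The function name and the choice to skip gaps and ambiguous bases are assumptions made for illustration; the abstract does not state which software or settings the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: positions with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError for saturated sequence pairs, where the
    # distance is undefined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A returned value of 0.022 corresponds to the 2.2% level that the abstract uses when reporting low interspecific and high intraspecific distances.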
Looking back on a decade of barcoding crustaceans
Raupach, Michael J.; Radulovici, Adriana E.
2015-01-01
Abstract Species identification represents a pivotal component for large-scale biodiversity studies and conservation planning but represents a challenge for many taxa when using morphological traits only. Consequently, alternative identification methods based on molecular markers have been proposed. In this context, DNA barcoding has become a popular and accepted method for the identification of unknown animals across all life stages by comparison to a reference library. In this review we examine the progress of barcoding studies for the Crustacea using the Web of Science data base from 2003 to 2014. All references were classified in terms of taxonomy covered, subject area (identification/library, genetic variability, species descriptions, phylogenetics, methods, pseudogenes/numts), habitat, geographical area, authors, journals, citations, and the use of the Barcode of Life Data Systems (BOLD). Our analysis revealed a total number of 164 barcoding studies for crustaceans with a preference for malacostracan crustaceans, in particular Decapoda, and for building reference libraries in order to identify organisms. So far, BOLD did not establish itself as a popular informatics platform among carcinologists although it offers many advantages for standardized data storage, analyses and publication. PMID:26798245
Liu, Jie; Milne, Richard I; Möller, Michael; Zhu, Guang-Fu; Ye, Lin-Jiang; Luo, Ya-Huang; Yang, Jun-Bo; Wambulwa, Moses C; Wang, Chun-Neng; Li, De-Zhu; Gao, Lian-Ming
2018-05-22
Rapid and accurate identification of endangered species is a critical component of biosurveillance and conservation management, and potentially policing illegal trades. However, this is often not possible using traditional taxonomy, especially where only small or preprocessed parts of plants are available. Reliable identification can be achieved via a comprehensive DNA barcode reference library, accompanied by precise distribution data. However, these require extensive sampling at spatial and taxonomic scales, which has rarely been achieved for cosmopolitan taxa. Here, we construct a comprehensive DNA barcode reference library and generate distribution maps using species distribution modelling (SDM), for all 15 Taxus species worldwide. We find that trnL-trnF is the ideal barcode for Taxus: It can distinguish all Taxus species and in combination with ITS identify hybrids. Among five analysis methods tested, NJ was the most effective. Among 4,151 individuals screened for trnL-trnF, 73 haplotypes were detected, all species-specific and some population private. Taxonomical, geographical and genetic dimensions of sampling strategy were all found to affect the comprehensiveness of the resulting DNA barcode library. Maps from SDM showed that most species had allopatric distributions, except T. mairei in the Sino-Himalayan region. Using the barcode library and distribution map data, two unknown forensic samples were identified to species (and in one case, population) level and another was determined as a putative interspecific hybrid. This integrated species identification system for Taxus can be used for biosurveillance, conservation management and to monitor and prosecute illegal trade. Similar identification systems are recommended for other IUCN- and CITES-listed taxa. © 2018 John Wiley & Sons Ltd.
Stiffler, Michael A; Subramanian, Subu K; Salinas, Victor H; Ranganathan, Rama
2016-07-03
Site-directed mutagenesis has long been used as a method to interrogate protein structure, function and evolution. Recent advances in massively-parallel sequencing technology have opened up the possibility of assessing the functional or fitness effects of large numbers of mutations simultaneously. Here, we present a protocol for experimentally determining the effects of all possible single amino acid mutations in a protein of interest utilizing high-throughput sequencing technology, using the 263 amino acid antibiotic resistance enzyme TEM-1 β-lactamase as an example. In this approach, a whole-protein saturation mutagenesis library is constructed by site-directed mutagenic PCR, randomizing each position individually to all possible amino acids. The library is then transformed into bacteria, and selected for the ability to confer resistance to β-lactam antibiotics. The fitness effect of each mutation is then determined by deep sequencing of the library before and after selection. Importantly, this protocol introduces methods which maximize sequencing read depth and permit the simultaneous selection of the entire mutation library, by mixing adjacent positions into groups of length accommodated by high-throughput sequencing read length and utilizing orthogonal primers to barcode each group. Representative results using this protocol are provided by assessing the fitness effects of all single amino acid mutations in TEM-1 at a clinically relevant dosage of ampicillin. The method should be easily extendable to other proteins for which a high-throughput selection assay is in place.
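The abstract states only that fitness effects are determined by deep sequencing before and after selection. One common way to turn such counts into a score is the log ratio of a mutant's enrichment to the wild-type enrichment; the Python sketch below shows that calculation. The function name, the pseudocount, and the log10 scale are assumptions for illustration and are not claimed to be the scoring used in the protocol.

```python
import math


def relative_fitness(mut_before: int, mut_after: int,
                     wt_before: int, wt_after: int,
                     pseudocount: float = 0.5) -> float:
    """Log10 enrichment of a mutant relative to wild type.

    Counts are sequencing read counts before and after selection.
    The pseudocount (an assumption) avoids division by zero and log(0).
    """
    mut_ratio = (mut_after + pseudocount) / (mut_before + pseudocount)
    wt_ratio = (wt_after + pseudocount) / (wt_before + pseudocount)
    return math.log10(mut_ratio / wt_ratio)
```

Under this convention a score of 0 means the mutant behaves like wild type under selection, and negative scores indicate depletion, for example a mutation that reduces resistance at the tested ampicillin dosage.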
Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney
2012-01-01
RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676
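The counting principle in this abstract (abundance equals the number of distinct barcode pairs seen for a cDNA, not the number of reads) can be illustrated with a short Python sketch. The input layout, a tuple of cDNA identifier and the two end barcodes per read, is a hypothetical simplification; it assumes barcodes have already been decoded and error-corrected.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (cDNA identifier, 5' barcode, 3' barcode)


def digital_counts(reads: Iterable[Read]) -> Dict[str, int]:
    """Count unique barcode pairs per cDNA instead of counting reads."""
    seen = defaultdict(set)
    for cdna, barcode_5p, barcode_3p in reads:
        # PCR duplicates carry the same barcode pair, so they collapse
        # into a single set entry.
        seen[cdna].add((barcode_5p, barcode_3p))
    return {cdna: len(pairs) for cdna, pairs in seen.items()}
```

With this scheme, three reads of the same gene that share one barcode pair count as a single molecule, which is how amplification bias is kept out of the abundance estimate.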
Trujillano, D; Ramos, M D; González, J; Tornador, C; Sotillo, F; Escaramis, G; Ossowski, S; Armengol, L; Casals, T; Estivill, X
2013-07-01
Here we have developed a novel and much more efficient strategy for the complete molecular characterisation of the cystic fibrosis (CF) transmembrane regulator (CFTR) gene, based on multiplexed targeted resequencing. We have tested this approach in a cohort of 92 samples with previously characterised CFTR mutations and polymorphisms. After enrichment of the pooled barcoded DNA libraries with a custom NimbleGen SeqCap EZ Choice array (Roche) and sequencing with a HiSeq2000 (Illumina) sequencer, we applied several bioinformatics tools to call mutations and polymorphisms in CFTR. The combination of several bioinformatics tools allowed us to detect all known pathogenic variants (point mutations, short insertions/deletions, and large genomic rearrangements) and polymorphisms (including the poly-T and poly-thymidine-guanine polymorphic tracts) in the 92 samples. In addition, we report the precise characterisation of the breakpoints of seven genomic rearrangements in CFTR, including those of a novel deletion of exon 22 and a complex 85 kb inversion which includes two large deletions affecting exons 4-8 and 12-21, respectively. This work is a proof-of-principle that targeted resequencing is an accurate and cost-effective approach for the genetic testing of CF and CFTR-related disorders (ie, male infertility) amenable to the routine clinical practice, and ready to substitute classical molecular methods in medical genetics.
Buschmann, Tilo; Zhang, Rong; Brash, Douglas E; Bystrykh, Leonid V
2014-08-07
DNA barcodes are short unique sequences used to label DNA or RNA-derived samples in multiplexed deep sequencing experiments. During the demultiplexing step, barcodes must be detected and their position identified. In some cases (e.g., with PacBio SMRT), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives.For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads as well as suggest possible improvements. In our analysis, barcode sequences showed high rates of coincidental similarities with the Mus musculus reference DNA. This problem became more acute when the length of the barcode sequence decreased and the number of barcodes in the set increased. The method presented in this paper controls the tail area-based false discovery rate to distinguish between barcoded and unbarcoded reads. This method helps to establish the highest acceptable minimal distance between reads and barcode sequences. In a proof of concept experiment we correctly detected barcodes in 83% of the reads with a precision of 89%. Sensitivity improved to 99% at 99% precision when the adjacent primer sequence was incorporated in the analysis. The analysis was further improved using a paired end strategy. Following an analysis of the data for sequence variants induced in the Atp1a1 gene of C57BL/6 murine melanocytes by ultraviolet light and conferring resistance to ouabain, we found no evidence of cross-contamination of DNA material between samples. 
Our method offers a proper quantitative treatment of the problem of detecting barcoded reads in a noisy sequencing environment. It is based on false discovery rate statistics, which allow a proper trade-off between sensitivity and precision to be chosen.
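The idea of a "highest acceptable minimal distance between reads and barcode sequences" can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example barcodes, and the use of plain edit distance against any substring of the read (Sellers' approximate matching) are assumptions made for illustration, and the FDR-based choice of the threshold itself is not shown.

```python
def min_edit_distance(barcode: str, read: str) -> int:
    """Smallest edit distance between `barcode` and any substring of `read`."""
    # Row 0 is all zeros: the barcode may start at any position in the read.
    prev = [0] * (len(read) + 1)
    for i, b in enumerate(barcode, start=1):
        curr = [i] + [0] * len(read)
        for j, r in enumerate(read, start=1):
            curr[j] = min(
                prev[j - 1] + (b != r),  # match or substitution
                prev[j] + 1,             # barcode base missing from the read
                curr[j - 1] + 1,         # extra base in the read
            )
        prev = curr
    # The barcode may end at any position in the read.
    return min(prev)


def assign_barcode(read: str, barcodes: dict, max_distance: int):
    """Return (barcode name, distance), or (None, distance) above the threshold."""
    name, dist = min(
        ((n, min_edit_distance(seq, read)) for n, seq in barcodes.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_distance else (None, dist)


# Hypothetical barcodes and reads, for illustration only.
BARCODES = {"bc1": "ACGTAC", "bc2": "TTGGCC"}
print(assign_barcode("GGGACGAACGGG", BARCODES, max_distance=1))
```

Here `max_distance` plays the role of the acceptable minimal distance: reads whose best match exceeds it are treated as unbarcoded. In the abstract this cut-off is set by controlling the false discovery rate rather than chosen by hand.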
Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga
2015-01-01
Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations are causing substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. 
Alternatively, we discuss how analysis can be repeated from saved sequencing images using the Long Template Protocol to increase accuracy. PMID:25860802
Designing robust watermark barcodes for multiplex long-read sequencing.
Ezpeleta, Joaquín; Krsticevic, Flavia J; Bulacio, Pilar; Tapia, Elizabeth
2017-03-15
To attain acceptable sample misassignment rates, current approaches to multiplex single-molecule real-time sequencing require upstream quality improvement, which is obtained from multiple passes over the sequenced insert and significantly reduces the effective read length. In order to fully exploit the raw read length on multiplex applications, robust barcodes capable of dealing with the full single-pass error rates are needed. We present a method for designing sequencing barcodes that can withstand a large number of insertion, deletion and substitution errors and are suitable for use in multiplex single-molecule real-time sequencing. The manuscript focuses on the design of barcodes for full-length single-pass reads, impaired by challenging error rates in the order of 11%. The proposed barcodes can multiplex hundreds or thousands of samples while achieving sample misassignment probabilities as low as 10⁻⁷ under the above conditions, and are designed to be compatible with chemical constraints imposed by the sequencing process. Software tools for constructing watermark barcode sets and demultiplexing barcoded reads, together with example sets of barcodes and synthetic barcoded reads, are freely available at www.cifasis-conicet.gov.ar/ezpeleta/NS-watermark. Contact: ezpeleta@cifasis-conicet.gov.ar. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Quantitative phenotyping via deep barcode sequencing.
Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey
2009-10-01
Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.
Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei
2018-01-01
DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence larger than 1000 base pairs and generates a computational burden. Although DNA barcodes were originally envisioned as straightforward species tags, the use of barcode sequences for identification is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies suggest that SNPs may be ideal targets of feature selection for discriminating between species. We hypothesize that SNP-based barcodes may be more effective than full-length DNA barcode sequences for species discrimination. To address this issue, we tested a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were used to set up decision rules that assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using the RSB method. Taken together, this study provides the rationale that the SNP aspect of the rbcL DNA barcode is a useful and effective sequence for tagging 38 Brassicaceae species.
Colony-PCR Is a Rapid Method for DNA Amplification of Hyphomycetes
Walch, Georg; Knapp, Maria; Rainer, Georg; Peintner, Ursula
2016-01-01
Fungal pure cultures identified with both classical morphological methods and through barcoding sequences are a basic requirement for reliable reference sequences in public databases. Improved techniques for an accelerated DNA barcode reference library construction will result in considerably improved sequence databases covering a wider taxonomic range. Fast, cheap, and reliable methods for obtaining DNA sequences from fungal isolates are, therefore, a valuable tool for the scientific community. Direct colony PCR was already successfully established for yeasts, but has not been evaluated for a wide range of anamorphic soil fungi up to now, and a direct amplification protocol for hyphomycetes without tissue pre-treatment has not been published so far. Here, we present a colony PCR technique directly from fungal hyphae without previous DNA extraction or other prior manipulation. Seven hundred eighty-eight fungal strains from 48 genera were tested with a success rate of 86%. PCR success varied considerably: DNA of fungi belonging to the genera Cladosporium, Geomyces, Fusarium, and Mortierella could be amplified with high success. DNA of soil-borne yeasts was always successfully amplified. Absidia, Mucor, Trichoderma, and Penicillium isolates had noticeably lower PCR success. PMID:29376929
A DNA mini-barcode for land plants.
Little, Damon P
2014-05-01
Small portions of the barcode region - mini-barcodes - may be used in place of full-length barcodes to overcome DNA degradation for samples with poor DNA preservation. 591,491,286 rbcL mini-barcode primer combinations were electronically evaluated for PCR universality, and two novel highly universal sets of priming sites were identified. Novel and published rbcL mini-barcode primers were evaluated for PCR amplification [determined with a validated electronic simulation (n = 2765) and empirically (n = 188)], Sanger sequence quality [determined empirically (n = 188)], and taxonomic discrimination [determined empirically (n = 30,472)]. PCR amplification for all mini-barcodes, as estimated by validated electronic simulation, was successful for 90.2-99.8% of species. Overall Sanger sequence quality for mini-barcodes was very low - the best mini-barcode tested produced sequences of adequate quality (B20 ≥ 0.5) for 74.5% of samples. The majority of mini-barcodes provide correct identifications of families in excess of 70.1% of the time. Discriminatory power noticeably decreased at lower taxonomic levels. At the species level, the discriminatory power of the best mini-barcode was less than 38.2%. For samples believed to contain DNA from only one species, an investigator should attempt to sequence, in decreasing order of utility and probability of success, mini-barcodes F (rbcL1/rbcLB), D (F52/R193) and K (F517/R604). For samples believed to contain DNA from more than one species, an investigator should amplify and sequence mini-barcode D (F52/R193). © 2013 John Wiley & Sons Ltd.
Mishra, Priyanka; Kumar, Amit; Nagireddy, Akshitha; Shukla, Ashutosh K.
2017-01-01
DNA barcoding is used as a universal tool for delimiting species boundaries in taxonomically challenging groups, with different plastid and nuclear regions (rbcL, matK, ITS and psbA-trnH) being recommended as primary DNA barcodes for plants. We evaluated the feasibility of using these regions in the species-rich genus Terminalia, which exhibits various overlapping morphotypes with pantropical distribution, owing to its complex taxonomy. Terminalia bellerica and T. chebula are ingredients of the famous Ayurvedic Rasayana formulation Triphala, used for detoxification and rejuvenation. High demand for extracted phytochemicals as well as the high trade value of several species renders mandatory the need for the correct identification of traded plant material. Three different analytical methods with single and multilocus barcoding regions were tested to develop a DNA barcode reference library from 222 individuals representing 41 Terminalia species. All the single barcodes tested had a lower discriminatory power than the multilocus regions, and the combination of matK+ITS had the highest resolution rate (94.44%). The average intra-specific variations (0.0188±0.0019) were less than the distance to the nearest neighbour (0.106±0.009) with matK and ITS. Distance-based Neighbour Joining analysis outperformed the character-based Maximum Parsimony method in the identification of traded species such as T. arjuna, T. chebula and T. tomentosa, which are prone to adulteration. rbcL was shown to be a highly conservative region with only 3.45% variability between all of the sequences. The recommended barcode combination, rbcL+matK, failed to perform in the genus Terminalia. Considering the complexity of resolution observed with single regions, the present study proposes the combination of matK+ITS as the most successful barcode in Terminalia. PMID:28829803
Efficient alignment-free DNA barcode analytics.
Kuksa, Pavel; Pavlovic, Vladimir
2009-11-10
In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
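A spectrum (k-mer count) representation of the kind described above can be sketched in a few lines of Python. This is an illustrative assumption rather than the authors' method: the abstract does not specify the similarity measure, so cosine similarity is used here as a stand-in, and the function names, the value of k, and the reference sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in `seq` (the sequence's k-spectrum)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two k-mer count vectors."""
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(query: str, references: dict, k: int = 3) -> str:
    """Assign `query` to the reference whose spectrum is most similar."""
    q = spectrum(query, k)
    return max(references, key=lambda name: cosine_similarity(q, spectrum(references[name], k)))


# Hypothetical reference barcodes, for illustration only.
REFERENCES = {"species_a": "ACGTACGTACGT", "species_b": "TTTTGGGGCCCC"}
print(classify("ACGTACGA", REFERENCES))
```

No alignment is computed: each sequence is reduced to a fixed-length count vector, which is what makes the comparison cheap and lets individual k-mers be inspected as discriminating features.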
The practical evaluation of DNA barcode efficacy.
Spouge, John L; Mariño-Ramírez, Leonardo
2012-01-01
This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
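The workflow above (nearest-neighbor assignment, then averaging the species PCIs) can be sketched as follows. Several details are assumptions for illustration, not taken from the chapter: the species PCI is taken to be the fraction of a species' specimens whose nearest neighbor, excluding the specimen itself, belongs to the same species; the distance is a simple proportion of mismatches on pre-aligned, equal-length sequences; and ties are broken by list order.

```python
from collections import defaultdict


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def overall_pci(specimens: list) -> tuple:
    """specimens: list of (species, aligned_sequence) pairs.

    Returns (overall PCI, per-species PCI dict) from leave-one-out
    nearest-neighbor assignment.
    """
    outcomes = defaultdict(list)
    for i, (species, seq) in enumerate(specimens):
        others = [s for j, s in enumerate(specimens) if j != i]
        nearest_species, _ = min(others, key=lambda s: p_distance(seq, s[1]))
        outcomes[species].append(nearest_species == species)
    species_pci = {sp: sum(hits) / len(hits) for sp, hits in outcomes.items()}
    # The overall PCI is the unweighted mean of the species PCIs.
    return sum(species_pci.values()) / len(species_pci), species_pci


# Hypothetical toy data: two well-sampled species and one singleton.
SPECIMENS = [
    ("A", "AAAAAAAA"), ("A", "AAAAAAAT"),
    ("B", "CCCCCCCC"), ("B", "CCCCCCCG"),
    ("C", "GGGGGGGG"),
]
print(overall_pci(SPECIMENS))
```

Averaging over species rather than over specimens keeps heavily sampled species from dominating the score; in this toy data the singleton species can never be identified correctly and pulls the overall PCI down to two thirds.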
Organic Phase Change Nanoparticles for in-Product Labeling of Agrochemicals.
Wang, Miao; Duong, Binh; Su, Ming
2015-10-28
There is an urgent need to develop in-product covert barcodes for anti-counterfeiting of agrochemicals. This paper reports a new organic nanoparticle-based in-product barcode system, in which a panel of organic phase change nanoparticles is added as a barcode into a variety of chemicals (herein agrochemicals). The barcode is read out by detecting melting peaks of organic nanoparticles using differential scanning calorimetry. This method has high labeling capacity due to small sizes of nanoparticles, sharp melting peaks, and large scan range of thermal analysis. The in-product barcode can be effectively used to protect agrochemical products from being counterfeited due to its large coding capacity, technical readiness, covertness, and robustness.
Kang, Homan; Jeong, Sinyoung; Koh, Yul; Geun Cha, Myeong; Yang, Jin-Kyoung; Kyeong, San; Kim, Jaehi; Kwak, Seon-Yeong; Chang, Hye-Jin; Lee, Hyunmi; Jeong, Cheolhwan; Kim, Jong-Ho; Jun, Bong-Hyun; Kim, Yong-Kweon; Hong Jeong, Dae; Lee, Yoon-Sik
2015-01-01
The preparation and screening of compound libraries remain among the most challenging tasks in drug discovery, biomarker detection, and biomolecular profiling processes. So far, several distinct encoding/decoding methods such as chemical encoding, graphical encoding, and optical encoding have been reported to identify those libraries. In this paper, a simple and efficient surface-enhanced Raman spectroscopic (SERS) barcoding method using highly sensitive SERS nanoparticles (SERS ID) is presented. The 44 kinds of SERS IDs were able to generate simple codes and could possibly generate more than one million kinds of codes by incorporating combinations of different SERS IDs. The barcoding method exhibited high stability and reliability under bioassay conditions. The SERS ID encoding based screening platform can identify the peptide ligand on the bead and also quantify its binding affinity for a specific protein. We believe that our SERS barcoding technology is a promising method in the screening of one-bead-one-compound (OBOC) libraries for drug discovery. PMID:26017924
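The "more than one million kinds of codes" figure can be checked with a short calculation. The abstract does not say how SERS IDs are combined, so this sketch assumes that a code is an unordered set of distinct SERS IDs placed on one bead; under that assumption, sets of five IDs drawn from the 44 kinds already exceed one million.

```python
from math import comb

N_SERS_IDS = 44       # number of distinct SERS IDs reported in the abstract
TARGET = 1_000_000    # "more than one million kinds of codes"

# Smallest number of IDs per code whose combinations alone reach the target,
# assuming a code is an unordered set of distinct IDs (an assumption).
ids_per_code = next(k for k in range(1, N_SERS_IDS + 1) if comb(N_SERS_IDS, k) >= TARGET)

for k in range(1, ids_per_code + 1):
    print(f"{k} IDs per code: {comb(N_SERS_IDS, k):,} codes")
```

If the order or the relative intensity of the IDs also carried information, the capacity would be larger still; the unordered-set count is the conservative case.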
Kang, Homan; Jeong, Sinyoung; Koh, Yul; Geun Cha, Myeong; Yang, Jin-Kyoung; Kyeong, San; Kim, Jaehi; Kwak, Seon-Yeong; Chang, Hye-Jin; Lee, Hyunmi; Jeong, Cheolhwan; Kim, Jong-Ho; Jun, Bong-Hyun; Kim, Yong-Kweon; Hong Jeong, Dae; Lee, Yoon-Sik
2015-05-28
The preparation and screening of compound libraries remain among the most challenging tasks in drug discovery, biomarker detection, and biomolecular profiling processes. So far, several distinct encoding/decoding methods such as chemical encoding, graphical encoding, and optical encoding have been reported to identify those libraries. In this paper, a simple and efficient surface-enhanced Raman spectroscopic (SERS) barcoding method using highly sensitive SERS nanoparticles (SERS ID) is presented. The 44 kinds of SERS IDs were able to generate simple codes and could possibly generate more than one million kinds of codes by incorporating combinations of different SERS IDs. The barcoding method exhibited high stability and reliability under bioassay conditions. The SERS ID encoding based screening platform can identify the peptide ligand on the bead and also quantify its binding affinity for a specific protein. We believe that our SERS barcoding technology is a promising method in the screening of one-bead-one-compound (OBOC) libraries for drug discovery.
Blagoev, Gergin A; Nikolova, Nadya I; Sobel, Crystal N; Hebert, Paul D N; Adamowicz, Sarah J
2013-11-26
Arctic ecosystems, especially those near transition zones, are expected to be strongly impacted by climate change. Because it is positioned on the ecotone between tundra and boreal forest, the Churchill area is a strategic locality for the analysis of shifts in faunal composition. This fact has motivated the effort to develop a comprehensive biodiversity inventory for the Churchill region by coupling DNA barcoding with morphological studies. The present study represents one element of this effort; it focuses on analysis of the spider fauna at Churchill. 198 species were detected among 2704 spiders analyzed, tripling the count for the Churchill region. Estimates of overall diversity suggest that another 10-20 species await detection. Most species displayed little intraspecific sequence variation (maximum <1%) in the barcode region of the cytochrome c oxidase subunit I (COI) gene, but four species showed considerably higher values (maximum = 4.1-6.2%), suggesting cryptic species. All recognized species possessed a distinct haplotype array at COI with nearest-neighbour interspecific distances averaging 8.57%. Three species new to Canada were detected: Robertus lyrifer (Theridiidae), Baryphyma trifrons (Linyphiidae), and Satilatlas monticola (Linyphiidae). The first two species may represent human-mediated introductions linked to the port in Churchill, but the other species represents a range extension from the USA. The first description of the female of S. monticola was also presented. As well, one probable new species of Alopecosa (Lycosidae) was recognized. This study provides the first comprehensive DNA barcode reference library for the spider fauna of any region. Few cryptic species of spiders were detected, a result contrasting with the prevalence of undescribed species in several other terrestrial arthropod groups at Churchill. 
Because most (97.5%) sequence clusters at COI corresponded with a named taxon, DNA barcoding reliably identifies spiders in the Churchill fauna. The capacity of DNA barcoding to enable the identification of otherwise taxonomically ambiguous specimens (juveniles, females) also represents a major advance for future monitoring efforts on this group.
Identifying Canadian Freshwater Fishes through DNA Barcodes
Hubert, Nicolas; Hanner, Robert; Holm, Erling; Mandrak, Nicholas E.; Taylor, Eric; Burridge, Mary; Watkinson, Douglas; Dumont, Pierre; Curry, Allen; Bentzen, Paul; Zhang, Junbin; April, Julien; Bernatchez, Louis
2008-01-01
Background: DNA barcoding aims to provide an efficient method for species-level identifications using an array of species-specific molecular tags derived from the 5′ region of the mitochondrial cytochrome c oxidase I (COI) gene. The efficiency of the method hinges on the degree of sequence divergence among species, and species-level identifications are relatively straightforward when the average genetic distance among individuals within a species does not exceed the average genetic distance between sister species. Fishes constitute a highly diverse group of vertebrates that exhibit deep phenotypic changes during development. In this context, the identification of fish species is challenging and DNA barcoding provides new perspectives in ecology and systematics of fishes. Here we examined the degree to which DNA barcoding discriminates freshwater fish species from the well-known Canadian fauna, which currently encompasses nearly 200 species, some of which are of high economic value, like salmons and sturgeons. Methodology/Principal Findings: We bi-directionally sequenced the standard 652 bp “barcode” region of COI for 1360 individuals belonging to 190 of the 203 Canadian freshwater fish species (95%). Most species were represented by multiple individuals (7.6 on average), the majority of which were retained as voucher specimens. The average genetic distance was 27-fold higher between species than within species, as K2P distance estimates averaged 8.3% among congeners and only 0.3% among conspecifics. However, shared polymorphism between sister species was detected in 15 species (8% of the cases). The distribution of K2P distance between individuals and species overlapped and identifications were only possible to species group using DNA barcodes in these cases. Conversely, deep hidden genetic divergence was revealed within two species, suggesting the presence of cryptic species.
Conclusions/Significance: The present study showed that freshwater fish species can be efficiently identified through the use of DNA barcoding, especially the species complex of small-sized species, and that the present COI library can be used for subsequent applications in ecology and systematics. PMID:22423312
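The K2P (Kimura two-parameter) distances quoted in this abstract can be computed from a pairwise alignment as d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites that differ by a transition and by a transversion. The Python sketch below is illustrative only: the function name is hypothetical, and skipping sites with gaps or ambiguous bases is an assumption, not a detail given in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption)
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / sites, transversions / sites
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# One transition in ten sites gives a distance slightly above the raw 10% difference.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The correction matters little at the within-species level (around 0.3% here) and grows with divergence, which is why barcode studies report corrected rather than raw distances when comparing congeners.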
Ermakov, Oleg A.; Simonov, Evgeniy; Surin, Vadim L.; Titov, Sergey V.; Brandler, Oleg V.; Ivanova, Natalia V.; Borisenko, Alex V.
2015-01-01
The utility of DNA barcoding for species identification and discovery has catalyzed a concerted effort to build the global reference library; however, many animal groups of economic or conservation importance remain poorly represented. This study aims to contribute DNA barcode records for all ground squirrel species (Xerinae, Sciuridae, Rodentia) inhabiting Eurasia and to test the efficiency of this approach for species discrimination. Cytochrome c oxidase subunit 1 (COI) gene sequences were obtained for 97 individuals representing 16 ground squirrel species of which 12 were correctly identified. Taxonomic allocation of some specimens within four species was complicated by geographically restricted mtDNA introgression. Exclusion of individuals with introgressed mtDNA allowed reaching a 91.6% identification success rate. Significant COI divergence (3.5–4.4%) was observed within the most widespread ground squirrel species (Spermophilus erythrogenys, S. pygmaeus, S. suslicus, Urocitellus undulatus), suggesting the presence of cryptic species. A single putative NUMT (nuclear mitochondrial pseudogene) sequence was recovered during molecular analysis; mitochondrial COI from this sample was amplified following re-extraction of DNA. Our data show high discrimination ability of 100 bp COI fragments for Eurasian ground squirrels (84.3%) with no incorrect assessments, underscoring the potential utility of the existing reference library for the development of diagnostic ‘mini-barcodes’. PMID:25617768
Assembling and auditing a comprehensive DNA barcode reference library for European marine fishes.
Oliveira, L M; Knebelsberger, T; Landi, M; Soares, P; Raupach, M J; Costa, F O
2016-12-01
A large-scale comprehensive reference library of DNA barcodes for European marine fishes was assembled, allowing the evaluation of taxonomic uncertainties and species genetic diversity that were otherwise hidden in geographically restricted studies. A total of 4118 DNA barcodes were assigned to 358 species generating 366 Barcode Index Numbers (BIN). Initial examination revealed as many as 141 BIN discordances (more than one species in each BIN). After implementing an auditing and five-grade (A-E) annotation protocol, the number of discordant species BINs was reduced to 44 (13% grade E), while concordant species BINs amounted to 271 (78% grades A and B) and 14 others had insufficient data (grade D). Fifteen species displayed comparatively high intraspecific divergences ranging from 2.6 to 18.5% (grade C), which is biologically paramount information to be considered in fish species monitoring and stock assessment. On balance, this compilation contributed to the detection of 59 European fish species probably in need of taxonomic clarification or re-evaluation. The generalized implementation of an auditing and annotation protocol for reference libraries of DNA barcodes is recommended. © 2016 The Fisheries Society of the British Isles.
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server
Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113
Zúñiga, Jose D.; Gostel, Morgan R.; Mulcahy, Daniel G.; Barker, Katharine; Hill, Asia; Sedaghatpour, Maryam; Vo, Samantha Q.; Funk, Vicki A.; Coddington, Jonathan A.
2017-01-01
The Global Genome Initiative has sequenced and released 1961 DNA barcodes for genetic samples obtained as part of the Global Genome Initiative for Gardens Program. The dataset includes barcodes for 29 plant families and 309 genera that did not have sequences flagged as barcodes in GenBank; sequences from officially recognized barcoding genetic markers meet the data standard of the Consortium for the Barcode of Life. The genetic samples were deposited in the Smithsonian Institution’s National Museum of Natural History Biorepository and their records were made public through the Global Genome Biodiversity Network’s portal. The DNA barcodes are now available on GenBank. PMID:29118648
Massively parallel cis-regulatory analysis in the mammalian central nervous system
Shen, Susan Q.; Myers, Connie A.; Hughes, Andrew E.O.; Byrne, Leah C.; Flannery, John G.; Corbo, Joseph C.
2016-01-01
Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell–derived organoids. PMID:26576614
Quantitative phenotyping via deep barcode sequencing
Smith, Andrew M.; Heisler, Lawrence E.; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J.; Chee, Mark; Roth, Frederick P.; Giaever, Guri; Nislow, Corey
2009-01-01
Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or “Bar-seq,” outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that ∼20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene–environment interactions on a genome-wide scale. PMID:19622793
Efficient alignment-free DNA barcode analytics
Kuksa, Pavel; Pavlovic, Vladimir
2009-01-01
Background In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion Our results show that newly developed alignment-free methods for DNA barcoding can identify specimens efficiently and with high accuracy by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
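A fixed-length spectrum representation of the kind described above can be sketched as a vector of k-mer counts. The code below is an illustrative assumption, not the authors' method: the choice of k, the cosine similarity measure, and the nearest-reference classifier are stand-ins chosen for brevity.

```python
from collections import Counter
from itertools import product
import math


def kmer_spectrum(seq, k=3):
    """Return a fixed-length vector (4**k entries) of k-mer counts over ACGT."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts.get(kmer, 0) for kmer in kmers]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(query, references, k=3):
    """Assign the query to the reference species with the most similar spectrum.

    `references` is a hypothetical mapping {species_name: barcode_sequence}.
    """
    q = kmer_spectrum(query, k)
    return max(
        references,
        key=lambda sp: cosine_similarity(q, kmer_spectrum(references[sp], k)),
    )
```

Because every sequence maps to a vector of the same length regardless of its own length, no alignment is needed before sequences are compared.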
Nithaniyal, Stalin; Newmaster, Steven G.; Ragupathy, Subramanyam; Krishnamoorthy, Devanathan; Vassou, Sophie Lorraine; Parani, Madasamy
2014-01-01
Background India is rich with biodiversity, which includes a large number of endemic, rare and threatened plant species. Previous studies have used DNA barcoding to inventory species for applications in biodiversity monitoring, conservation impact assessment, monitoring of illegal trading, authentication of traded medicinal plants etc. This is the first tropical dry evergreen forest (TDEF) barcode study in the world and the first attempt to assemble a reference barcode library for the trees of India as part of a larger project initiated by this research group. Methodology/Principal Findings We sampled 429 trees representing 143 TDEF species, which included 16 threatened species. DNA barcoding was completed using rbcL and matK markers. The tiered approach (1st tier rbcL; 2nd tier matK) correctly identified 136 out of 143 species (95%). This high level of species resolution was largely due to the fact that the tree species were taxonomically diverse in the TDEF. The ability to resolve taxonomically diverse tree species of the TDEF was comparable among the best match method, the phylogenetic method, and the characteristic attribute organization system method. Conclusions We demonstrated the utility of the TDEF reference barcode library to authenticate wood samples from timber operations in the TDEF. This pilot research study will enable more comprehensive surveys of the illegal timber trade of threatened species in the TDEF. This TDEF reference barcode library also contains trees that have medicinal properties, which could be used to monitor unsustainable and indiscriminate collection of plants from the wild for their medicinal value. PMID:25259794
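The tiered idea (rbcL first, matK only when rbcL is ambiguous) can be sketched as below. This is a hedged illustration under simplifying assumptions: the mismatch-count scoring, the function names, and the dictionary-based reference libraries are hypothetical and are not the best match, phylogenetic, or characteristic attribute methods used in the study.

```python
def best_matches(query, reference):
    """Return the set of species whose reference sequence has the fewest mismatches.

    `reference` is a hypothetical mapping {species: sequence}; length
    differences are counted as mismatches.
    """
    def mismatches(a, b):
        return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

    scores = {sp: mismatches(query, seq) for sp, seq in reference.items()}
    best = min(scores.values())
    return {sp for sp, score in scores.items() if score == best}


def tiered_identify(sample, rbcl_refs, matk_refs):
    """First tier: rbcL. Second tier: matK, restricted to the rbcL candidates."""
    candidates = best_matches(sample["rbcL"], rbcl_refs)
    if len(candidates) == 1:
        return candidates.pop()
    narrowed = {sp: matk_refs[sp] for sp in candidates if sp in matk_refs}
    if not narrowed:
        return None  # no second-tier data for any candidate
    final = best_matches(sample["matK"], narrowed)
    return final.pop() if len(final) == 1 else None  # None: still unresolved
```

Restricting the second tier to first-tier candidates means the more variable marker is consulted only where the conserved one cannot decide.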
BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources.
Lim, Jeongheui; Kim, Sang-Yoon; Kim, Sungmin; Eo, Hae-Seok; Kim, Chang-Bae; Paek, Woon Kee; Kim, Won; Bhak, Jong
2009-12-03
DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases. We have built a general DNA barcode data processing system, BioBarcode, with open source software - which is a general purpose database and server. It uses mySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary database of BioBarcode has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer which improves the performance in DNA sequence analyses. Asia has a very high degree of biodiversity and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. The BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides useful tools that will make barcoding cheaper and faster in the biodiversity community such as standardization, depository, management, and analysis of DNA barcode data. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system http://www.asianbarcode.org.
DNA barcoding Indian freshwater fishes.
Lakra, Wazir Singh; Singh, M; Goswami, Mukunda; Gopalakrishnan, A; Lal, K K; Mohindra, V; Sarkar, U K; Punia, P P; Singh, K V; Bhatt, J P; Ayyappan, S
2016-11-01
DNA barcoding is a promising technique for species identification using a short mitochondrial DNA sequence of the cytochrome c oxidase I (COI) gene. In the present study, DNA barcodes were generated from 72 species of freshwater fish covering the Orders Cypriniformes, Siluriformes, Perciformes, Synbranchiformes, and Osteoglossiformes, representing 50 genera and 19 families. All the samples were collected from diverse sites, except for species endemic to a particular location. The great majority of the barcoded species were represented by multiple specimens. A total of 284 COI sequences were generated. After amplification and sequencing of a 700 base pair fragment of COI, primers were trimmed, which invariably generated a 655 base pair barcode sequence. The average Kimura two-parameter (K2P) distances within species, genera, families, and orders were 0.40%, 9.60%, 13.10%, and 17.16%, respectively. DNA barcodes discriminated congeneric species without any confusion. The study strongly validated the efficiency of COI as an ideal marker for DNA barcoding of Indian freshwater fishes.
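The Kimura two-parameter distance reported above has a closed form: with P the proportion of transitions and Q the proportion of transversions between two aligned sequences, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below assumes pre-aligned sequences of equal length and skips sites with gaps or ambiguous bases; the function name and those handling choices are assumptions for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging this value over all specimen pairs within a species, genus, family, or order gives summary figures of the kind quoted in the abstract.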
Tank, David C.
2016-01-01
Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or—more importantly—recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500bp for 349 samples. PMID:26828929
Lobo, Jorge; Teixeira, Marcos A L; Borges, Luisa M S; Ferreira, Maria S G; Hollatz, Claudia; Gomes, Pedro T; Sousa, Ronaldo; Ravara, Ascensão; Costa, Maria H; Costa, Filipe O
2016-01-01
Annelid polychaetes have seldom been the focus of dedicated DNA barcoding studies, despite their ecological relevance and frequent dominance, particularly in soft-bottom estuarine and coastal marine ecosystems. Here, we report the first assessment of the performance of DNA barcodes in the discrimination of shallow water polychaete species from the southern European Atlantic coast, focusing on specimens collected in estuaries and coastal ecosystems of Portugal. We analysed cytochrome oxidase I DNA barcodes (COI-5P) from 164 specimens, which were assigned to 51 morphospecies. To our data set from Portugal, we added available published sequences selected from the same species, genus or family, to inspect for taxonomic congruence among studies and collection locations. The final data set comprised 290 specimens and 79 morphospecies, which generated 99 Barcode Index Numbers (BINs) within Barcode of Life Data Systems (BOLD). Among these, 22 BINs were singletons, 47 other BINs were concordant, confirming the initial identification based on morphological characters, and 30 were discordant, most of which consisted of multiple BINs found for the same morphospecies. Some of the most prominent cases in the latter category include Hediste diversicolor (O.F. Müller, 1776) (7), Eulalia viridis (Linnaeus, 1767) (2) and Owenia fusiformis (delle Chiaje, 1844) (5), all of them reported from Portugal and frequently used in ecological studies as environmental quality indicators. Our results for these species showed discordance between molecular lineages and morphospecies, or added relatively divergent lineages. The potential inaccuracies in environmental assessments, where the underpinning polychaete species diversity is poorly resolved or clarified, demand additional and extensive investigation of the DNA barcode diversity in this group, in parallel with alpha taxonomy efforts. © 2015 John Wiley & Sons Ltd.
Identification of species adulteration in traded medicinal plant raw drugs using DNA barcoding.
Nithaniyal, Stalin; Vassou, Sophie Lorraine; Poovitha, Sundar; Raju, Balaji; Parani, Madasamy
2017-02-01
Plants are the major source of therapeutic ingredients in complementary and alternative medicine (CAM). However, species adulteration in traded medicinal plant raw drugs threatens the reliability and safety of CAM. Since morphological features of medicinal plants are often not intact in the raw drugs, DNA barcoding was employed for species identification. Adulteration in 112 traded raw drugs was tested after creating a reference DNA barcode library consisting of 1452 rbcL and matK barcodes from 521 medicinal plant species. Species resolution of this library was 74.4%, 90.2%, and 93.0% for rbcL, matK, and rbcL + matK, respectively. DNA barcoding revealed adulteration in about 20% of the raw drugs, and at least 6% of them were derived from plants with completely different medicinal or toxic properties. Raw drugs in the form of dried roots, powders, and whole plants were found to be more prone to adulteration than rhizomes, fruits, and seeds. Morphological resemblance, co-occurrence, mislabeling, confusing vernacular names, and unauthorized or fraudulent substitutions might have contributed to species adulteration in the raw drugs. Therefore, this library can be routinely used to authenticate traded raw drugs for the benefit of all stakeholders: traders, consumers, and regulatory agencies.
[Hydrophidae identification through analysis on Cyt b gene barcode].
Liao, Li-xi; Zeng, Ke-wu; Tu, Peng-fei
2015-08-01
Hydrophidae, one of the precious traditional Chinese medicines, is generally preserved dry to prevent spoilage, but the drying process changes its appearance and makes the species hard to identify by morphology. Identification through gene barcode analysis, a new technique in species identification, avoids this problem. The gene barcodes of 6 species of Hydrophidae, including Lapemis hardwickii, were acquired through DNA extraction and gene sequencing. These barcodes were then aligned, and their identification efficiency was tested by BLAST. Our results revealed that the barcode sequences achieved high identification efficiency and showed clear differences between intra- and inter-species variation. Together, these results indicate that Cyt b DNA barcoding can confirm the identification of Hydrophidae.
DNA Barcoding for the Identification and Authentication of Animal Species in Traditional Medicine.
Yang, Fan; Ding, Fei; Chen, Hong; He, Mingqi; Zhu, Shixin; Ma, Xin; Jiang, Li; Li, Haifeng
2018-01-01
Animal-based traditional medicine not only plays a significant role in therapeutic practices worldwide but also provides a potential compound library for drug discovery. However, persistent hunting and illegal trade markedly threaten numerous medicinal animal species, and increasing demand further provokes the emergence of various adulterants. As it is difficult and time-consuming for conventional methods to detect processed products or identify animal species with similar morphology, developing novel authentication methods for animal-based traditional medicine represents an urgent need. During the last decade, DNA barcoding has offered an accurate and efficient strategy that can identify existing species and discover unknown species via analysis of sequence variation in a standardized region of DNA. Recent studies have shown that DNA barcoding, as well as minibarcoding and metabarcoding, is capable of identifying animal species and discriminating authentic materials from adulterants in various types of traditional medicines, including raw materials, processed products, and complex preparations. These techniques can also be used to detect unlabelled and threatened animal species in traditional medicine. Here, we review the recent progress of DNA barcoding for the identification and authentication of animal species used in traditional medicine, which provides a reference for quality control and trade supervision of animal-based traditional medicine.
DNA Barcoding for the Identification and Authentication of Animal Species in Traditional Medicine
Yang, Fan; Ding, Fei; Chen, Hong; He, Mingqi; Zhu, Shixin; Ma, Xin; Jiang, Li
2018-01-01
Animal-based traditional medicine not only plays a significant role in therapeutic practices worldwide but also provides a potential compound library for drug discovery. However, persistent hunting and illegal trade markedly threaten numerous medicinal animal species, and increasing demand further provokes the emergence of various adulterants. As it is difficult and time-consuming for conventional methods to detect processed products or identify animal species with similar morphology, developing novel authentication methods for animal-based traditional medicine represents an urgent need. During the last decade, DNA barcoding has offered an accurate and efficient strategy that can identify existing species and discover unknown species via analysis of sequence variation in a standardized region of DNA. Recent studies have shown that DNA barcoding, as well as minibarcoding and metabarcoding, is capable of identifying animal species and discriminating authentic materials from adulterants in various types of traditional medicines, including raw materials, processed products, and complex preparations. These techniques can also be used to detect unlabelled and threatened animal species in traditional medicine. Here, we review the recent progress of DNA barcoding for the identification and authentication of animal species used in traditional medicine, which provides a reference for quality control and trade supervision of animal-based traditional medicine. PMID:29849709
BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources
2009-01-01
Background DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases. Results We have built a general DNA barcode data processing system, BioBarcode, with open source software - which is a general purpose database and server. It uses mySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary database of BioBarcode has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer which improves the performance in DNA sequence analyses. Conclusion Asia has a very high degree of biodiversity and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. The BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides useful tools that will make barcoding cheaper and faster in the biodiversity community such as standardization, depository, management, and analysis of DNA barcode data. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system http://www.asianbarcode.org. PMID:19958506
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
In order to construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators have carried out extensive research on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank were analyzed at the same time. Three different methods (BLAST, barcoding gap, and tree building) were used to confirm the reliability of the barcode records in the database. The integrated DNA barcoding database for identifying Chinese animal medicine was constructed from three parts: specimen, sequence, and literature information. The database contains about 800 animal medicines together with their adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The integrated DNA barcoding database for identifying Chinese animal medicine is of significant importance for animal species identification, conservation of rare and endangered species, and sustainable utilization of animal resources.
A DNA Mini-Barcoding System for Authentication of Processed Fish Products.
Shokralla, Shadi; Hellberg, Rosalee S; Handy, Sara M; King, Ian; Hajibabaei, Mehrdad
2015-10-30
Species substitution is a form of seafood fraud for the purpose of economic gain. DNA barcoding utilizes species-specific DNA sequence information for specimen identification. Previous work has established the usability of short DNA sequences (mini-barcodes) for identification of specimens harboring degraded DNA. This study aims at establishing a DNA mini-barcoding system for all fish species commonly used in processed fish products in North America. Six mini-barcode primer pairs targeting short (127-314 bp) fragments of the cytochrome c oxidase I (CO1) DNA barcode region were developed by examining over 8,000 DNA barcodes from species in the U.S. Food and Drug Administration (FDA) Seafood List. The mini-barcode primer pairs were then tested against 44 processed fish products representing a range of species and product types. Of the 44 products, 41 (93.2%) could be identified at the species or genus level. The greatest mini-barcoding success rate found with an individual primer pair was 88.6%, compared to a 20.5% success rate achieved by the full-length DNA barcode primers. Overall, this study presents a mini-barcoding system that can be used to identify a wide range of fish species in commercial products and may be utilized in high-throughput DNA sequencing for authentication of heavily processed fish products.
DNA-encoded chemical libraries: advancing beyond conventional small-molecule libraries.
Franzini, Raphael M; Neri, Dario; Scheuermann, Jörg
2014-04-15
DNA-encoded chemical libraries (DECLs) represent a promising tool in drug discovery. DECL technology allows the synthesis and screening of chemical libraries of unprecedented size at moderate costs. In analogy to phage-display technology, where large antibody libraries are displayed on the surface of filamentous phage and are genetically encoded in the phage genome, DECLs feature the display of individual small organic chemical moieties on DNA fragments serving as amplifiable identification barcodes. The DNA-tag facilitates the synthesis and allows the simultaneous screening of very large sets of compounds (up to billions of molecules), because the hit compounds can easily be identified and quantified by PCR-amplification of the DNA-barcode followed by high-throughput DNA sequencing. Several approaches have been used to generate DECLs, differing both in the methods used for library encoding and for the combinatorial assembly of chemical moieties. For example, DECLs can be used for fragment-based drug discovery, displaying a single molecule on DNA or two chemical moieties at the extremities of complementary DNA strands. DECLs can vary substantially in the chemical structures and the library size. While ultralarge libraries containing billions of compounds have been reported containing four or more sets of building blocks, also smaller libraries have been shown to be efficient for ligand discovery. In general, it has been found that the overall library size is a poor predictor for library performance and that the number and diversity of the building blocks are rather important indicators. Smaller libraries consisting of two to three sets of building blocks better fulfill the criteria of drug-likeness and often have higher quality. In this Account, we present advances in the DECL field from proof-of-principle studies to practical applications for drug discovery, both in industry and in academia. 
DECL technology can yield specific binders to a variety of target proteins and is likely to become a standard tool for pharmaceutical hit discovery, lead expansion, and Chemical Biology research. The introduction of new methodologies for library encoding and for compound synthesis in the presence of DNA is an exciting research field and will crucially contribute to the performance and the propagation of the technology.
DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability
Little, Damon P.
2011-01-01
For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple–sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple–sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment–free sequence identification algorithm–BRONX–that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple–sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user–defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini–barcode queries against a full–length barcode database). BRONX consistently produced better identifications at the genus–level for all query types. PMID:21857897
Links, Matthew G; Dumonceaux, Tim J; Hemmingsen, Sean M; Hill, Janet E
2012-01-01
Barcoding with molecular sequences is widely used to catalogue eukaryotic biodiversity. Studies investigating the community dynamics of microbes have relied heavily on gene-centric metagenomic profiling using two genes (16S rRNA and cpn60) to identify and track Bacteria. While there have been criteria formalized for barcoding of eukaryotes, these criteria have not been used to evaluate gene targets for other domains of life. Using the framework of the International Barcode of Life we evaluated DNA barcodes for Bacteria. Candidates from the 16S rRNA gene and the protein coding cpn60 gene were evaluated. Within complete bacterial genomes in the public domain representing 983 species from 21 phyla, the largest difference between median pairwise inter- and intra-specific distances ("barcode gap") was found from cpn60. Distribution of sequence diversity along the ∼555 bp cpn60 target region was remarkably uniform. The barcode gap of the cpn60 universal target facilitated the faithful de novo assembly of full-length operational taxonomic units from pyrosequencing data from a synthetic microbial community. Analysis supported the recognition of both 16S rRNA and cpn60 as DNA barcodes for Bacteria. The cpn60 universal target was found to have a much larger barcode gap than 16S rRNA suggesting cpn60 as a preferred barcode for Bacteria. A large barcode gap for cpn60 provided a robust target for species-level characterization of data. The assembly of consensus sequences for barcodes was shown to be a reliable method for the identification and tracking of novel microbes in metagenomic studies.
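The "barcode gap" defined above, the difference between median pairwise inter- and intra-specific distances, can be sketched in a few lines. The uncorrected p-distance used here and the `(species, aligned_sequence)` record layout are assumptions for illustration; the study may well have used a different distance measure.

```python
from itertools import combinations
from statistics import median


def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records, distance=p_distance):
    """Median inter-specific distance minus median intra-specific distance.

    `records` is a hypothetical list of (species, aligned_sequence) pairs.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need both within- and between-species pairs")
    return median(inter) - median(intra)
```

A larger positive value means sequences from different species are more clearly separated from within-species variation, which is the sense in which one marker is said to have a larger gap than another.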
USDA-ARS's Scientific Manuscript database
DNA barcoding revealed the presence of the polyphagous leafminer pest Liriomyza sativae Blanchard in Bangladesh. DNA barcode sequences for mitochondrial COI were generated for Agromyzidae larvae, pupae and adults collected from field populations across Bangladesh. BLAST sequence similarity searches ...
Machine Learned Replacement of N-Labels for Basecalled Sequences in DNA Barcoding.
Ma, Eddie Y T; Ratnasingham, Sujeevan; Kremer, Stefan C
2018-01-01
This study presents a machine learning method that increases the number of identified bases in Sanger Sequencing. The system post-processes a KB basecalled chromatogram. It selects a recoverable subset of N-labels in the KB-called chromatogram to replace with basecalls (A,C,G,T). An N-label correction is defined given an additional read of the same sequence, and a human finished sequence. Corrections are added to the dataset when an alignment determines the additional read and human agree on the identity of the N-label. KB must also rate the replacement with quality value of in the additional read. Corrections are only available during system training. Developing the system, nearly 850,000 N-labels are obtained from Barcode of Life Datasystems, the premier database of genetic markers called DNA Barcodes. Increasing the number of correct bases improves reference sequence reliability, increases sequence identification accuracy, and assures analysis correctness. Keeping with barcoding standards, our system maintains an error rate of percent. Our system only applies corrections when it estimates low rate of error. Tested on this data, our automation selects and recovers: 79 percent of N-labels from COI (animal barcode); 80 percent from matK and rbcL (plant barcodes); and 58 percent from non-protein-coding sequences (across eukaryotes).
A DNA barcode for land plants.
2009-08-04
DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.
Hollingsworth, Peter M.; Forrest, Laura L.; Spouge, John L.; Hajibabaei, Mehrdad; Ratnasingham, Sujeevan; van der Bank, Michelle; Chase, Mark W.; Cowan, Robyn S.; Erickson, David L.; Fazekas, Aron J.; Graham, Sean W.; James, Karen E.; Kim, Ki-Joong; Kress, W. John; Schneider, Harald; van AlphenStahl, Jonathan; Barrett, Spencer C.H.; van den Berg, Cassio; Bogarin, Diego; Burgess, Kevin S.; Cameron, Kenneth M.; Carine, Mark; Chacón, Juliana; Clark, Alexandra; Clarkson, James J.; Conrad, Ferozah; Devey, Dion S.; Ford, Caroline S.; Hedderson, Terry A.J.; Hollingsworth, Michelle L.; Husband, Brian C.; Kelly, Laura J.; Kesanakurti, Prasad R.; Kim, Jung Sung; Kim, Young-Dong; Lahaye, Renaud; Lee, Hae-Lim; Long, David G.; Madriñán, Santiago; Maurin, Olivier; Meusnier, Isabelle; Newmaster, Steven G.; Park, Chong-Wook; Percy, Diana M.; Petersen, Gitte; Richardson, James E.; Salazar, Gerardo A.; Savolainen, Vincent; Seberg, Ole; Wilkinson, Michael J.; Yi, Dong-Keun; Little, Damon P.
2009-01-01
DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF–atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK–psbI spacer, and trnH–psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants. PMID:19666622
2013-01-01
Background: Arctic ecosystems, especially those near transition zones, are expected to be strongly impacted by climate change. Because it is positioned on the ecotone between tundra and boreal forest, the Churchill area is a strategic locality for the analysis of shifts in faunal composition. This fact has motivated the effort to develop a comprehensive biodiversity inventory for the Churchill region by coupling DNA barcoding with morphological studies. The present study represents one element of this effort; it focuses on analysis of the spider fauna at Churchill. Results: 198 species were detected among 2704 spiders analyzed, tripling the count for the Churchill region. Estimates of overall diversity suggest that another 10–20 species await detection. Most species displayed little intraspecific sequence variation (maximum <1%) in the barcode region of the cytochrome c oxidase subunit I (COI) gene, but four species showed considerably higher values (maximum = 4.1-6.2%), suggesting cryptic species. All recognized species possessed a distinct haplotype array at COI with nearest-neighbour interspecific distances averaging 8.57%. Three species new to Canada were detected: Robertus lyrifer (Theridiidae), Baryphyma trifrons (Linyphiidae), and Satilatlas monticola (Linyphiidae). The first two species may represent human-mediated introductions linked to the port in Churchill, but the other species represents a range extension from the USA. The first description of the female of S. monticola was also presented. As well, one probable new species of Alopecosa (Lycosidae) was recognized. Conclusions: This study provides the first comprehensive DNA barcode reference library for the spider fauna of any region. Few cryptic species of spiders were detected, a result contrasting with the prevalence of undescribed species in several other terrestrial arthropod groups at Churchill. Because most (97.5%) sequence clusters at COI corresponded with a named taxon, DNA barcoding reliably identifies spiders in the Churchill fauna. The capacity of DNA barcoding to enable the identification of otherwise taxonomically ambiguous specimens (juveniles, females) also represents a major advance for future monitoring efforts on this group. PMID:24279427
Barcodes for genomes and applications
Zhou, Fengfeng; Olman, Victor; Xu, Ying
2008-01-01
Background Each genome has a stable distribution of the combined frequency for each k-mer and its reverse complement measured in sequence fragments as short as 1000 bps across the whole genome, for 1
Tanabe, Akifumi S.; Toju, Hirokazu
2013-01-01
Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research. PMID:24204702
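The 1-NN assignment described in this abstract can be sketched in a few lines. This is only a minimal illustration, assuming pre-aligned sequences of equal length and a plain p-distance in place of a BLAST search; the function names and the toy reference library are hypothetical and are not taken from the authors' software.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def one_nn_assign(query: str, references: dict[str, str]) -> tuple[str, float]:
    """Return (taxon, distance) for the reference sequence closest to the query."""
    taxon = min(references, key=lambda name: p_distance(query, references[name]))
    return taxon, p_distance(query, references[taxon])


# Toy reference library (hypothetical sequences, for illustration only).
references = {
    "Species A": "ACGTACGTAC",
    "Species B": "ACGTTCGTTC",
    "Species C": "TCGAACGGAC",
}
taxon, dist = one_nn_assign("ACGTACGAAC", references)
print(taxon, dist)
```

Note that this sketch always returns some taxon, however distant the best match is. That behaviour is consistent with the abstract's observation that 1-NN misidentifies queries whose species is missing from the reference database.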
Using DNA barcodes for assessing diversity in the family Hybotidae (Diptera, Empidoidea)
Nagy, Zoltán T.; Sonet, Gontran; Mortelmans, Jonas; Vandewynkel, Camille; Grootaert, Patrick
2013-01-01
Empidoidea is one of the largest extant lineages of flies, but phylogenetic relationships among species of this group are poorly investigated and global diversity remains scarcely assessed. In this context, one of the most enigmatic empidoid families is Hybotidae. Within the framework of a pilot study, we barcoded 339 specimens of Old World hybotids belonging to 164 species and 22 genera (plus two Empis as outgroups) and attempted to evaluate whether patterns of intra- and interspecific divergences match the current taxonomy. We used a large sampling of diverse Hybotidae. The material came from the Palaearctic (Belgium, France, Portugal and Russian Caucasus), the Afrotropic (Democratic Republic of the Congo) and the Oriental realms (Singapore and Thailand). Thereby, we optimized lab protocols for barcoding hybotids. Although DNA barcodes generally distinguished recognized taxa well, the study also revealed a number of unexpected phenomena: e.g., undescribed taxa found within morphologically very similar or identical specimens, especially when geographic distance was large; some morphologically distinct species showing no genetic divergence; and different patterns of intraspecific divergence between populations or closely related species. Using COI sequences and simple Neighbour-Joining tree reconstructions, the monophyly of many species- and genus-level taxa was well supported, but more inclusive taxonomical levels did not receive significant bootstrap support. We conclude that in hybotids DNA barcoding can be used effectively to identify species, provided that two main constraints are considered. First, incomplete barcoding libraries hinder efficient (correct) identification. Therefore, extra efforts are needed to increase the representation of hybotids in these databases. Second, the spatial scale of sampling has to be taken into account, and especially for widespread species or species complexes with unclear taxonomy, an integrative approach has to be used to clarify species boundaries and identities. PMID:24453562
Lowenstein, Jacob H; Osmundson, Todd W; Becker, Sven; Hanner, Robert; Stiassny, Melanie L J
2011-10-01
Here we describe preliminary efforts to integrate DNA barcoding into an ongoing inventory of the Lower Congo River (LCR) ichthyofauna. The 350 km stretch of the LCR from Pool Malebo to Boma includes the world's largest river rapids. The LCR ichthyofauna is hyperdiverse and rich in endemism due to high habitat heterogeneity, numerous dispersal barriers, and its downstream location in the basin. We have documented 328 species from the LCR, 25% of which are thought to be endemic. In addition to detailing progress made to generate a reference sequence library of DNA barcodes for these fishes, we ask how DNA can be used at the current stage of the Fish Barcode of Life initiative, as a work in progress currently of limited utility to a wide audience. Two possibilities that we explore are the potential for DNA barcodes to generate discrete diagnostic characters for species, and to help resolve problematic taxa lacking clear morphologically diagnostic characters, such as many species of the cyprinid genus Labeo, which we use as a case study. Our molecular analysis helped to clarify the validity of some species that were the subject of historical debate, and we were able to construct a molecular key for all monophyletic and morphologically recognizable species. Several species sampled from across the Congo Basin and widely distributed throughout Central and West Africa were recovered as paraphyletic based on our molecular data. Our study underscores the importance of generating reference barcodes for specimens collected from, or in close proximity to, type localities, particularly where species are poorly understood taxonomically and the extent of their geographical distributions has yet to be established.
DNA Barcodes for Forensically Important Fly Species in Brazil.
Koroiva, Ricardo; de Souza, Mirian S; Roque, Fabio de Oliveira; Pepinelli, Mateus
2018-04-07
Here, we analyze 248 DNA barcode sequences of 35 fly species of forensic importance in Brazil. DNA barcoding can be effectively used for specimen identification of these species, allowing the unambiguous identification of 31 species, an overall success rate of 88%. Our results show a high rate of success for molecular identification using DNA barcoding sequences and open new perspectives for immature species identification, a subject on which limited forensic investigations exist in Tropical regions. We also address the implications of building a robust forensic DNA barcode database. A geographic bias is recognized for the COI dataset available for forensically important fly species in Brazil, with concentration of sequences from specimens collected mainly in sites located in the Cerrado, Mata Atlântica, and Pampa biomes.
Barker, F. Keith; Oyler-McCance, Sara; Tomback, Diana F.
2015-01-01
Next generation sequencing methods allow rapid, economical accumulation of data that have many applications, even at relatively low levels of genome coverage. However, the utility of shotgun sequencing data sets for specific goals may vary depending on the biological nature of the samples sequenced. We show that the ability to assemble mitogenomes from three avian samples of two different tissue types varies widely. In particular, data with coverage typical of microsatellite development efforts (∼1×) from DNA extracted from avian blood failed to cover even 50% of the mitogenome, relative to at least 500-fold coverage from muscle-derived data. Researchers should consider possible applications of their data and select the tissue source for their work accordingly. Practitioners analyzing low-coverage shotgun sequencing data (including for microsatellite locus development) should consider the potential benefits of mitogenome assembly, including internal barcode verification of species identity, mitochondrial primer development, and phylogenetics.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
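The idea of scoring a query against each reference through separate pairwise alignments can be illustrated as follows. This is a rough sketch and not TaxI's actual algorithm: it assumes a basic Needleman-Wunsch global alignment with arbitrary scoring values (match, mismatch, gap), and it reports uncorrected divergence over gap-free columns only. All function names are hypothetical.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def divergence(query: str, reference: str) -> float:
    """Fraction of mismatches among aligned columns that contain no gap."""
    al_q, al_r = global_align(query, reference)
    pairs = [(x, y) for x, y in zip(al_q, al_r) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergences(query: str, references: dict[str, str]) -> dict[str, float]:
    """Divergence of the query from every reference, one pairwise alignment each."""
    return {name: divergence(query, seq) for name, seq in references.items()}
```

Because each reference is aligned to the query on its own, a reference with an indel (for example `"ACGACGT"` against the query `"ACGTACGT"`) can still be compared without building a multiple alignment of the whole dataset.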
Genome-Wide Tuning of Protein Expression Levels to Rapidly Engineer Microbial Traits.
Freed, Emily F; Winkler, James D; Weiss, Sophie J; Garst, Andrew D; Mutalik, Vivek K; Arkin, Adam P; Knight, Rob; Gill, Ryan T
2015-11-20
The reliable engineering of biological systems requires quantitative mapping of predictable and context-independent expression over a broad range of protein expression levels. However, current techniques for modifying expression levels are cumbersome and are not amenable to high-throughput approaches. Here we present major improvements to current techniques through the design and construction of E. coli genome-wide libraries using synthetic DNA cassettes that can tune expression over a ∼10^4 range. The cassettes also contain molecular barcodes that are optimized for next-generation sequencing, enabling rapid and quantitative tracking of alleles that have the highest fitness advantage. We show these libraries can be used to determine which genes and expression levels confer greater fitness to E. coli under different growth conditions.
Bishoyi, Ashok Kumar; Kavane, Aarti; Sharma, Anjali; Geetha, K A
2017-02-01
Cymbopogon is an important member of the grass family Poaceae, cultivated for essential oils of considerable medicinal and industrial value. Taxonomic identification of Cymbopogon species is determined mainly by morphological markers, the odour of the essential oils, and the concentration of bioactive compounds present in the oil matrices, all of which are highly influenced by the environment. Authenticated molecular-marker-based taxonomic identification is also lacking in the genus; hence an effort was made to evaluate potential DNA barcode loci in six commercially important Cymbopogon species for their individual discrimination and authentication at the species level. Four widely used DNA barcoding regions, viz. ITS 1 & ITS 2 spacers, matK, psbA-trnH and rbcL, were taken for the study. Gene sequences of the same or related genera for the loci concerned were mined from the NCBI database, and primers were designed and validated for barcode loci amplification. Of the four loci studied, sequences from the matK and ITS spacer loci revealed 0.46% and 5.64% nucleotide sequence diversity, respectively, whereas the other two loci, psbA-trnH and rbcL, showed 100% sequence homology. The newly developed primers can be used for barcode loci amplification in the genus Cymbopogon. The single nucleotide polymorphisms identified in the studied sequences may be used as barcodes for the six Cymbopogon species. The information generated can also be used for barcode development in the genus by including a larger number of Cymbopogon species in future.
Selection of a DNA barcode for Nectriaceae from fungal whole-genomes.
Zeng, Zhaoqing; Zhao, Peng; Luo, Jing; Zhuang, Wenying; Yu, Zhihe
2012-01-01
A DNA barcode is a short segment of sequence that is able to distinguish species. A barcode must ideally contain enough variation to distinguish every individual species and be easily obtained. Fungi of Nectriaceae are economically important and show high species diversity. To establish a standard DNA barcode for this group of fungi, the genomes of Neurospora crassa and 30 other filamentous fungi were compared. The expect value was treated as a criterion to recognize homologous sequences. Four candidate markers, Hsp90, AAC, CDC48, and EF3, were tested for their feasibility as barcodes in the identification of 34 well-established species belonging to 13 genera of Nectriaceae. Two hundred and fifteen sequences were analyzed. Intra- and inter-specific variations and the success rate of PCR amplification and sequencing were considered as important criteria for estimation of the candidate markers. Ultimately, the partial EF3 gene met the requirements for a good DNA barcode: No overlap was found between the intra- and inter-specific pairwise distances. The smallest inter-specific distance of EF3 gene was 3.19%, while the largest intra-specific distance was 1.79%. In addition, there was a high success rate in PCR and sequencing for this gene (96.3%). CDC48 showed sufficiently high sequence variation among species, but the PCR and sequencing success rate was 84% using a single pair of primers. Although the Hsp90 and AAC genes had higher PCR and sequencing success rates (96.3% and 97.5%, respectively), overlapping occurred between the intra- and inter-specific variations, which could lead to misidentification. Therefore, we propose the EF3 gene as a possible DNA barcode for the nectriaceous fungi.
The Scirtothrips dorsalis Species Complex: Endemism and Invasion in a Global Pest
Dickey, Aaron M.; Kumar, Vivek; Hoddle, Mark S.; Funderburk, Joe E.; Morgan, J. Kent; Jara-Cavieres, Antonella; Shatters, Robert G. Jr.; Osborne, Lance S.; McKenzie, Cindy L.
2015-01-01
Invasive arthropods pose unique management challenges in various environments, the first of which is correct identification. This apparently mundane task is particularly difficult if multiple species are morphologically indistinguishable but accurate identification can be determined with DNA barcoding provided an adequate reference set is available. Scirtothrips dorsalis is a highly polyphagous plant pest with a rapidly expanding global distribution and this species, as currently recognized, may be comprised of cryptic species. Here we report the development of a comprehensive DNA barcode library for S. dorsalis and seven nuclear markers via next-generation sequencing for identification use within the complex. We also report the delimitation of nine cryptic species and two morphologically distinguishable species comprising the S. dorsalis species complex using histogram analysis of DNA barcodes, Bayesian phylogenetics, and the multi-species coalescent. One member of the complex, here designated the South Asia 1 cryptic species, is highly invasive, polyphagous, and likely the species implicated in tospovirus transmission. Two other species, South Asia 2, and East Asia 1 are also highly polyphagous and appear to be at an earlier stage of global invasion. The remaining members of the complex are regionally endemic, varying in their pest status and degree of polyphagy. In addition to patterns of invasion and endemism, our results provide a framework both for identifying members of the complex based on their DNA barcode, and for future species delimiting efforts. PMID:25893251
Lau, Billy T; Ji, Hanlee P
2017-09-21
RNA-Seq measures gene expression by counting sequence reads belonging to unique cDNA fragments. Molecular barcodes commonly in the form of random nucleotides were recently introduced to improve gene expression measures by detecting amplification duplicates, but are susceptible to errors generated during PCR and sequencing. This results in false positive counts, leading to inaccurate transcriptome quantification especially at low input and single-cell RNA amounts where the total number of molecules present is minuscule. To address this issue, we demonstrated the systematic identification of molecular species using transposable error-correcting barcodes that are exponentially expanded to tens of billions of unique labels. We experimentally showed random-mer molecular barcodes suffer from substantial and persistent errors that are difficult to resolve. To assess our method's performance, we applied it to the analysis of known reference RNA standards. By including an inline random-mer molecular barcode, we systematically characterized the presence of sequence errors in random-mer molecular barcodes. We observed that such errors are extensive and become more dominant at low input amounts. We described the first study to use transposable molecular barcodes and its use for studying random-mer molecular barcode errors. Extensive errors found in random-mer molecular barcodes may warrant the use of error correcting barcodes for transcriptome analysis as input amounts decrease.
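To illustrate why a designed, error-correcting barcode set tolerates errors that a random-mer barcode cannot, the following sketch corrects an observed read to the unique whitelist barcode within one mismatch. It is a generic Hamming-distance decoder under stated assumptions, not the authors' transposable barcode scheme; the whitelist, function names, and the `max_errors` parameter are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed: str, whitelist: list[str], max_errors: int = 1):
    """Return the single whitelist barcode within max_errors of the read, else None."""
    hits = [bc for bc in whitelist
            if len(bc) == len(observed) and hamming(observed, bc) <= max_errors]
    return hits[0] if len(hits) == 1 else None


# Hypothetical whitelist whose members differ pairwise at 4 or more positions,
# so any single substitution error still maps back to exactly one barcode.
whitelist = ["ACGTAC", "TGCAAC", "GATCGT"]

print(correct_barcode("ACGTAA", whitelist))  # one substitution away from ACGTAC
print(correct_barcode("TTTTTT", whitelist))  # too far from every barcode
```

With purely random barcodes there is no whitelist to decode against, so a read carrying a PCR or sequencing error is indistinguishable from a genuinely different molecule and inflates the count.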
Delaney, Nigel F.; Marx, Christopher J.
2012-01-01
Understanding evolutionary dynamics within microbial populations requires the ability to accurately follow allele frequencies through time. Here we present a rapid, cost-effective method (FREQ-Seq) that leverages Illumina next-generation sequencing for localized, quantitative allele frequency detection. Analogous to RNA-Seq, FREQ-Seq relies upon counts from the >10^5 reads generated per locus per time-point to determine allele frequencies. Loci of interest are directly amplified from a mixed population via two rounds of PCR using inexpensive, user-designed oligonucleotides and a bar-coded bridging primer system that can be regenerated in-house. The resulting bar-coded PCR products contain the adapters needed for Illumina sequencing, eliminating further library preparation. We demonstrate the utility of FREQ-Seq by determining the order and dynamics of beneficial alleles that arose as a microbial population, founded with an engineered strain of Methylobacterium, evolved to grow on methanol. Quantifying allele frequencies with minimal bias down to 1% abundance allowed effective analysis of SNPs, small indels and insertions of transposable elements. Our data reveal large-scale clonal interference during the early stages of adaptation and illustrate the utility of FREQ-Seq as a cost-effective tool for tracking allele frequencies in populations. PMID:23118913
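The count-based frequency estimate at the heart of this approach is simple arithmetic: for each barcoded time-point, the frequency of an allele is its read count divided by the total reads at that locus. The sketch below assumes reads have already been demultiplexed and assigned to alleles; the data layout, names, and read counts are hypothetical and chosen only for illustration.

```python
def allele_frequencies(counts: dict[str, int]) -> dict[str, float]:
    """Convert per-allele read counts at one locus and time-point into frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads for this locus and time-point")
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical read counts for one locus at three time-points.
time_series = {
    "gen_0":   {"ancestral": 99_000, "derived": 1_000},
    "gen_100": {"ancestral": 60_000, "derived": 40_000},
    "gen_200": {"ancestral": 5_000,  "derived": 95_000},
}

for time_point, counts in time_series.items():
    freqs = allele_frequencies(counts)
    print(time_point, round(freqs["derived"], 3))
```

With on the order of 10^5 reads per locus, an allele at 1% abundance is still supported by roughly a thousand reads, which helps explain why frequencies this low remain measurable.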
High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.
Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M
2016-09-07
Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits.
Liao, Jing; Chao, Zhi; Zhang, Liang
2013-11-01
The aim was to identify the common snakes used in medicated liquor in Guangdong using COI barcode sequences, and to test the feasibility of this approach. The COI barcode sequences of collected medicinal snakes were amplified and sequenced. The sequences, combined with data from GenBank, were analyzed for divergence and used to build a neighbor-joining (NJ) tree with MEGA 5.0. The genetic distances and NJ tree demonstrated that there were 241 variable sites in these species, and the average (A + T) content of 56.2% was higher than the average (G + C) content of 43.7%. The maximum interspecific genetic distance was 0.2568, and the minimum was 0.1519. In the NJ tree, each species formed a monophyletic clade with bootstrap support of 100%. The DNA barcoding identification method based on the COI sequence is accurate and can be applied to identify the common medicinal snakes.
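Two of the summary statistics reported here, base composition and the number of variable sites, are easy to compute from an alignment. The following is a minimal sketch, assuming upper-case, equal-length aligned sequences; the function names and toy data are hypothetical, and this is not how MEGA computes its reports.

```python
def base_composition(seq: str) -> tuple[float, float]:
    """Return (A+T fraction, G+C fraction), ignoring gaps and ambiguity codes."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases in sequence")
    at = sum(b in "AT" for b in bases) / len(bases)
    gc = sum(b in "GC" for b in bases) / len(bases)
    return at, gc


def variable_sites(alignment: list[str]) -> int:
    """Count alignment columns in which more than one unambiguous base occurs."""
    count = 0
    for column in zip(*alignment):
        if len({b for b in column if b in "ACGT"}) > 1:
            count += 1
    return count


# Toy alignment (hypothetical sequences).
alignment = ["ACGT", "ACGA", "TCGT"]
print(base_composition("AAATGCTT"))
print(variable_sites(alignment))
```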
DNA Barcode Identification of Podocarpaceae—The Second Largest Conifer Family
Little, Damon P.; Knopf, Patrick; Schulz, Christian
2013-01-01
We have generated matK, rbcL, and nrITS2 DNA barcodes for 320 specimens representing all 18 extant genera of the conifer family Podocarpaceae. The sample includes 145 of the 198 recognized species. Comparative analyses of sequence quality and species discrimination were conducted on the 159 individuals from which all three markers were recovered (representing 15 genera and 97 species). The vast majority of sequences were of high quality (B 30 = 0.596–0.989). Even the lowest quality sequences exceeded the minimum requirements of the BARCODE data standard. In the few instances that low quality sequences were generated, the responsible mechanism could not be discerned. There were no statistically significant differences in the discriminatory power of markers or marker combinations (p = 0.05). The discriminatory power of the barcode markers individually and in combination is low (56.7% of species at maximum). In some instances, species discrimination failed in spite of ostensibly useful variation being present (genotypes were shared among species), but in many cases there was simply an absence of sequence variation. Barcode gaps (maximum intraspecific p–distance < minimum interspecific p–distance) were observed in 50.5% of species when all three markers were considered simultaneously. The presence of a barcode gap was not predictive of discrimination success (p = 0.02) and there was no statistically significant difference in the frequency of barcode gaps among markers (p = 0.05). In addition, there was no correlation between number of individuals sampled per species and the presence of a barcode gap (p = 0.27). PMID:24312258
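A per-species barcode-gap check can be sketched as below, using the conventional definition in which a gap exists when a species' largest intraspecific p-distance is smaller than its smallest interspecific p-distance. This is an illustrative sketch only: it assumes aligned, equal-length sequences, treats species with a single specimen as having zero intraspecific distance, and uses hypothetical names and toy data rather than the authors' pipeline.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gaps(samples: list[tuple[str, str]]) -> dict[str, bool]:
    """Map each species to True if max intraspecific < min interspecific distance."""
    max_intra: dict[str, float] = {}
    min_inter: dict[str, float] = {}
    for (sp1, s1), (sp2, s2) in combinations(samples, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, float("inf")), d)
    species = {sp for sp, _ in samples}
    return {sp: max_intra.get(sp, 0.0) < min_inter.get(sp, float("inf"))
            for sp in species}


# Toy data: species A has a gap (0.1 < 0.2); species B does not (0.2 is not < 0.2).
samples = [
    ("A", "AAAAAAAAAA"),
    ("A", "AAAAAAAAAT"),
    ("B", "AAAAAAAACC"),
    ("B", "AAAAAACCCC"),
]
print(barcode_gaps(samples))
```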
Functional annotation of chemical libraries across diverse biological processes.
Piotrowski, Jeff S; Li, Sheena C; Deshpande, Raamesh; Simpkins, Scott W; Nelson, Justin; Yashiroda, Yoko; Barber, Jacqueline M; Safizadeh, Hamid; Wilson, Erin; Okada, Hiroki; Gebre, Abraham A; Kubo, Karen; Torres, Nikko P; LeBlanc, Marissa A; Andrusiak, Kerry; Okamoto, Reika; Yoshimura, Mami; DeRango-Adem, Eva; van Leeuwen, Jolanda; Shirahige, Katsuhiko; Baryshnikova, Anastasia; Brown, Grant W; Hirano, Hiroyuki; Costanzo, Michael; Andrews, Brenda; Ohya, Yoshikazu; Osada, Hiroyuki; Yoshida, Minoru; Myers, Chad L; Boone, Charles
2017-09-01
Chemical-genetic approaches offer the potential for unbiased functional annotation of chemical libraries. Mutations can alter the response of cells in the presence of a compound, revealing chemical-genetic interactions that can elucidate a compound's mode of action. We developed a highly parallel, unbiased yeast chemical-genetic screening system involving three key components. First, in a drug-sensitive genetic background, we constructed an optimized diagnostic mutant collection that is predictive for all major yeast biological processes. Second, we implemented a multiplexed (768-plex) barcode-sequencing protocol, enabling the assembly of thousands of chemical-genetic profiles. Finally, based on comparison of the chemical-genetic profiles with a compendium of genome-wide genetic interaction profiles, we predicted compound functionality. Applying this high-throughput approach, we screened seven different compound libraries and annotated their functional diversity. We further validated biological process predictions, prioritized a diverse set of compounds, and identified compounds that appear to have dual modes of action.
Svensson, J Peter; Quirós Pesudo, Laia; McRee, Siobhan K; Adeleye, Yeyejide; Carmichael, Paul; Samson, Leona D
2013-01-01
Toxicity screening of compounds provides a means to identify compounds harmful to human health and the environment. Here, we further develop the technique of genomic phenotyping to improve throughput while maintaining specificity. We exposed cells to eight different compounds that rely on different modes of action: four genotoxic alkylating (methyl methanesulfonate (MMS), N-Methyl-N-nitrosourea (MNU), N,N'-bis(2-chloroethyl)-N-nitroso-urea (BCNU), N-ethylnitrosourea (ENU)), two oxidizing (2-methylnaphthalene-1,4-dione (menadione, MEN), benzene-1,4-diol (hydroquinone, HYQ)), and two non-genotoxic (methyl carbamate (MC) and dimethyl sulfoxide (DMSO)) compounds. A library of 4,852 S. cerevisiae deletion strains, each identifiable by a unique genetic 'barcode', was grown in competition; at different time points the ratio between the strains was assessed by quantitative high-throughput 'barcode' sequencing. The method was validated by comparison to previous genomic phenotyping studies: 90% of the strains identified as MMS-sensitive here were also identified as MMS-sensitive in a much lower throughput solid agar screen. The data provide profiles of proteins and pathways needed for recovery after exposure to both genotoxic and non-genotoxic compounds. In addition, a novel role for aromatic amino acids in the recovery after treatment with oxidizing agents was suggested. This role was further validated: the quinone subgroup of oxidizing agents was extremely toxic in cells where tryptophan biosynthesis was compromised.
Suwannasai, Nuttika; Martín, María P; Phosri, Cherdchai; Sihanonth, Prakitsin; Whalley, Anthony J S; Spouge, John L
2013-01-01
Thailand, a part of the Indo-Burma biodiversity hotspot, has many endemic animals and plants. Some of its fungal species are difficult to recognize and separate, complicating assessments of biodiversity. We assessed species diversity within the fungal genera Annulohypoxylon and Hypoxylon, which produce biologically active and potentially therapeutic compounds, by applying classical taxonomic methods to 552 teleomorphs collected from across Thailand. Using probability of correct identification (PCI), we also assessed the efficacy of automated species identification with a fungal barcode marker, ITS, in the model system of Annulohypoxylon and Hypoxylon. The 552 teleomorphs yielded 137 ITS sequences; in addition, we examined 128 GenBank ITS sequences, to assess biases in evaluating a DNA barcode with GenBank data. The use of multiple sequence alignment in a barcode database like BOLD raises some concerns about non-protein barcode markers like ITS, so we also compared species identification using different alignment methods. Our results suggest the following. (1) Multiple sequence alignment of ITS sequences is competitive with pairwise alignment when identifying species, so BOLD should be able to preserve its present bioinformatics workflow for species identification for ITS, and possibly therefore with at least some other non-protein barcode markers. (2) Automated species identification is insensitive to a specific choice of evolutionary distance, contributing to resolution of a current debate in DNA barcoding. (3) Statistical methods are available to address, at least partially, the possibility of expert misidentification of species. Phylogenetic trees discovered a cryptic species and strongly supported monophyletic clades for many Annulohypoxylon and Hypoxylon species, suggesting that ITS can contribute usefully to a barcode for these fungi. 
The PCIs here, derived solely from ITS, suggest that a fungal barcode will require secondary markers in Annulohypoxylon and Hypoxylon, however. The URL http://tinyurl.com/spouge-barcode contains computer programs and other supplementary material relevant to this article.
Borges, Luísa M. S.; Hollatz, Claudia; Lobo, Jorge; Cunha, Ana M.; Vilela, Ana P.; Calado, Gonçalo; Coelho, Rita; Costa, Ana C.; Ferreira, Maria S. G.; Costa, Maria H.; Costa, Filipe O.
2016-01-01
The Gastropoda is one of the best studied classes of marine invertebrates. Yet, most species have been delimited based on morphology only. The application of DNA barcodes has been shown to be very useful in delimiting species. Therefore, sequences of the cytochrome c oxidase I gene from 108 specimens of 34 morpho-species were used to investigate the molecular diversity within the gastropods from the Portuguese coast. To this dataset, we added available COI-5P sequences of taxonomically close species, for a total of 58 morpho-species examined. There was a good match between our sequences and those from independent studies in public repositories. We found 32 concordant (91.4%) out of the 35 Barcode Index Numbers (BINs) generated from our sequences. The application of a ranking system to the barcodes yielded over 70% with top taxonomic congruence, while 14.2% of the species barcodes had insufficient data. In the majority of cases, there was good concordance between morphological identification and DNA barcodes. Nonetheless, the discordance between morphological and molecular data is a reminder that even the comparatively well-known European marine gastropods can benefit from being probed using the DNA barcode approach. Discordant cases should be reviewed with more integrative studies. PMID:26876495
Hou, Gang; Chen, Wei-Tao; Lu, Huo-Sheng; Cheng, Fei; Xie, Song-Guang
2018-01-01
DNA barcodes were studied for 1,353 specimens representing 272 morphological species belonging to 149 genera and 55 families of Perciformes from the South China Sea (SCS). The average Kimura 2-parameter (K2P) distances within species, genera and families were 0.31%, 8.71% and 14.52%, respectively. A neighbour-joining (NJ) tree, Bayesian inference (BI) and maximum-likelihood (ML) trees and Automatic Barcode Gap Discovery (ABGD) revealed 260, 253 and 259 single-species-representing clusters, respectively. Barcoding gap analysis (BGA) demonstrated that barcode gaps were present for 178 of 187 species analysed with multiple specimens (95.2%), with the minimum interspecific distance to the nearest neighbour larger than the maximum intraspecific distance. A group of three Thunnus species (T. albacares, T. obesus and T. tonggol), a pair of Gerres species (G. oyena and G. japonicus), a pair of Istiblennius species (I. edentulous and I. lineatus) and a pair of Uranoscopus species (U. oligolepis and U. kaianus) were observed with low interspecific distances and overlaps between intra- and interspecific genetic distances. Three species (Apogon ellioti, Naucrates ductor and Psenopsis anomala) showed deep intraspecific divergences and generated two lineages each, suggesting the possibility of cryptic species. Our results demonstrated that DNA barcodes are highly reliable for delineating species of Perciformes in the SCS. The DNA barcode library established in this study will shed light on further research on the diversity of Perciformes in the SCS. © 2017 John Wiley & Sons Ltd.
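For reference, the two quantities this abstract relies on can be written out explicitly. The first is the standard Kimura 2-parameter distance. The second is the barcode-gap condition as stated in the abstract. The symbols are notational assumptions introduced here: P and Q are the proportions of aligned sites showing transitional and transversional differences, and d_{ij} is the distance between specimens i and j.

```latex
d_{\mathrm{K2P}} = -\tfrac{1}{2}\,\ln\!\Bigl[(1 - 2P - Q)\,\sqrt{1 - 2Q}\Bigr]
\qquad\qquad
\text{barcode gap for species } s:\quad
\min_{i \in s,\; j \notin s} d_{ij} \;>\; \max_{i,\,i' \in s} d_{ii'}
```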
Towards writing the encyclopaedia of life: an introduction to DNA barcoding
Savolainen, Vincent; Cowan, Robyn S; Vogler, Alfried P; Roderick, George K; Lane, Richard
2005-01-01
An international consortium of major natural history museums, herbaria and other organizations has launched an ambitious project, the ‘Barcode of Life Initiative’, to promote a process enabling the rapid and inexpensive identification of the estimated 10 million species on Earth. DNA barcoding is a diagnostic technique in which short DNA sequence(s) can be used for species identification. The first international scientific conference on Barcoding of Life was held at the Natural History Museum in London in February 2005, and here we review the scientific challenges discussed during this conference and in previous publications. Although still controversial, the scientific benefits of DNA barcoding include: (i) enabling species identification, including any life stage or fragment, (ii) facilitating species discoveries based on cluster analyses of gene sequences (e.g. cox1=CO1, in animals), (iii) promoting development of handheld DNA sequencing technology that can be applied in the field for biodiversity inventories and (iv) providing insight into the diversity of life. PMID:16214739
A laboratory information management system for DNA barcoding workflows.
Vu, Thuy Duong; Eberhardt, Ursula; Szöke, Szániszló; Groenewald, Marizeth; Robert, Vincent
2012-07-01
This paper presents a laboratory information management system (LIMS) for DNA sequences, created for and based on the needs of a DNA barcoding project at the CBS-KNAW Fungal Biodiversity Centre (Utrecht, the Netherlands). DNA barcoding is a global initiative for species identification through simple DNA sequence markers. We aim to generate barcode data for all strains (or specimens) included in the collection (currently ca. 80,000). The LIMS has been developed to better manage large amounts of sequence data and to keep track of the whole experimental procedure. The system has allowed us to classify strains more efficiently as the quality of sequence data has improved, and as a result, up-to-date taxonomic names have been given to strains and more accurate correlation analyses have been carried out.
Lammers, Youri; Peelen, Tamara; Vos, Rutger A; Gravendeel, Barbara
2014-02-06
Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild. 
The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker. PMID:24502833
NASA Astrophysics Data System (ADS)
Zhou, Hong; Zhang, Zhinan; Chen, Haiyan; Sun, Renhua; Wang, Hui; Guo, Lei; Pan, Haijian
2010-07-01
In this study, we integrated a DNA barcoding project with an ecological survey of intertidal polychaete communities and investigated the utility of the CO1 gene sequence as a DNA barcode for the classification of intertidal polychaetes. Using 16S rDNA as a complementary marker and combining morphological and ecological characterization, some of the dominant and common polychaete species from Chinese coasts were assessed for their taxonomic status. We obtained 22 haplotype gene sequences of 13 taxa, including 10 CO1 sequences and 12 16S rDNA sequences. Based on intra- and inter-specific distances, we built phylogenetic trees using the neighbor-joining method. Our study suggested that the mitochondrial CO1 gene was a valid DNA barcoding marker for species identification in polychaetes, but other genes, such as 16S rDNA, could be used as complementary genetic markers. For more accurate species identification and effective testing of species hypotheses, DNA barcoding should be integrated with morphological, ecological, biogeographical, and phylogenetic information. The application of DNA barcoding and molecular identification in the ecological survey of the intertidal polychaete communities demonstrated the feasibility of integrating DNA taxonomy and ecology.
Comparing COI and ITS as DNA Barcode Markers for Mushrooms and Allies (Agaricomycotina)
Dentinger, Bryn T. M.; Didukh, Maryna Y.; Moncalvo, Jean-Marc
2011-01-01
DNA barcoding is an approach to rapidly identify species using short, standard genetic markers. The mitochondrial cytochrome oxidase I gene (COI) has been proposed as the universal barcode locus, but its utility for barcoding in mushrooms (ca. 20,000 species) has not been established. We succeeded in generating 167 partial COI sequences (∼450 bp) representing ∼100 morphospecies from ∼650 collections of Agaricomycotina using several sets of new primers. Large introns (∼1500 bp) at variable locations were detected in ∼5% of the sequences we obtained. We suspect that widespread presence of large introns is responsible for our low PCR success (∼30%) with this locus. We also sequenced the nuclear internal transcribed spacer rDNA regions (ITS) to compare with COI. Among the small proportion of taxa for which COI could be sequenced, COI and ITS perform similarly as a barcode. However, in a densely sampled set of closely related taxa, COI was less divergent than ITS and failed to distinguish all terminal clades. Given our results and the wealth of ITS data already available in public databases, we recommend that COI be abandoned in favor of ITS as the primary DNA barcode locus in mushrooms. PMID:21966418
Prado, Blanca R.; Pozo, Carmen; Valdez-Moreno, Martha; Hebert, Paul D. N.
2011-01-01
Background Recent studies have demonstrated the utility of DNA barcoding in the discovery of overlooked species and in the connection of immature and adult stages. In this study, we use DNA barcoding to examine diversity patterns in 121 species of Nymphalidae from the Yucatan Peninsula in Mexico. Our results suggest the presence of cryptic species in 8 of these 121 taxa. As well, the reference database derived from the analysis of adult specimens allowed the identification of nymphalid caterpillars providing new details on host plant use. Methodology/Principal Findings We gathered DNA barcode sequences from 857 adult Nymphalidae representing 121 different species. This total includes four species (Adelpha iphiclus, Adelpha malea, Hamadryas iphtime and Taygetis laches) that were initially overlooked because of their close morphological similarity to other species. The barcode results showed that each of the 121 species possessed a diagnostic array of barcode sequences. In addition, there was evidence of cryptic taxa; seven species included two barcode clusters showing more than 2% sequence divergence while one species included three clusters. All 71 nymphalid caterpillars were identified to a species level by their sequence congruence to adult sequences. These caterpillars represented 16 species, and included Hamadryas julitta, an endemic species from the Yucatan Peninsula whose larval stages and host plant (Dalechampia schottii, also endemic to the Yucatan Peninsula) were previously unknown. Conclusions/Significance This investigation has revealed overlooked species in a well-studied museum collection of nymphalid butterflies and suggests that there is a substantial incidence of cryptic species that await full characterization. The utility of barcoding in the rapid identification of caterpillars also promises to accelerate the assembly of information on life histories, a particularly important advance for hyperdiverse tropical insect assemblages. PMID:22132140
Yao, Hui; Song, Jing-Yuan; Ma, Xin-Ye; Liu, Chang; Li, Ying; Xu, Hong-Xi; Han, Jian-Ping; Duan, Li-Sheng; Chen, Shi-Lin
2009-05-01
DNA barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Although a consensus has not been reached regarding which DNA sequences can be used as the best plant barcodes, the psbA-trnH spacer region has been tested extensively in recent years. In this study, we hypothesize that the psbA-trnH spacer regions are also effective barcodes for Dendrobium species. We have sequenced the chloroplast psbA-trnH intergenic spacers of 17 Dendrobium species to test this hypothesis. The sequences were found to be significantly different from those of other species, with percentages of variation ranging from 0.3% to 2.3% and an average of 1.2%. In contrast, the intraspecific variation among the Dendrobium species studied ranged from 0% to 0.1%. The sequence difference between the psbA-trnH sequences of 17 Dendrobium species and one Bulbophyllum odoratissimum ranged from 2.0% to 3.1%, with an average of 2.5%. Our results support the notion that the psbA-trnH intergenic spacer region could be used as a barcode to distinguish various Dendrobium species and to differentiate Dendrobium species from other adulterating species. Copyright Georg Thieme Verlag KG Stuttgart. New York.
Beltman, Joost B; Urbanus, Jos; Velds, Arno; van Rooij, Nienke; Rohr, Jan C; Naik, Shalin H; Schumacher, Ton N
2016-04-02
Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations, and it can be used both to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and rare sequences are biologically relevant, the relatively high error rate of NGS techniques complicates data analysis, as it is difficult to distinguish rare true sequences from spurious sequences that are generated by PCR or sequencing errors. This issue applies, for instance, to cellular barcoding strategies that aim to follow the amount and type of offspring of single cells by supplying these with unique heritable DNA tags. Here, we use genetic barcoding data from the Illumina HiSeq platform to show that straightforward read threshold-based filtering of data is typically insufficient to filter out spurious barcodes. Importantly, we demonstrate that specific sequencing errors occur at an approximately constant rate across different samples that are sequenced in parallel. We exploit this observation by developing a novel approach to filter out spurious sequences. Application of our new method demonstrates its value in the identification of true sequences amongst spurious sequences in biological data sets.
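A small Python sketch can show why a plain read threshold struggles in this setting. The abstract does not describe the authors' filtering method, so the second function below is a commonly used stand-in heuristic (flagging a barcode that sits one mismatch away from a far more abundant barcode) and is explicitly not the approach of the paper. All names, counts, and the 1% ratio are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length barcodes."""
    return sum(x != y for x, y in zip(a, b))


def threshold_filter(counts, min_reads):
    """Naive baseline: keep every barcode with at least min_reads reads."""
    return {bc: n for bc, n in counts.items() if n >= min_reads}


def flag_likely_errors(counts, ratio=0.01):
    """Stand-in heuristic (an assumption, not the published method).

    A barcode is flagged when it differs at exactly one position from
    another barcode and has at most `ratio` times that barcode's reads.
    """
    flagged = set()
    for bc, n in counts.items():
        for parent, m in counts.items():
            if (len(parent) == len(bc)
                    and hamming(bc, parent) == 1
                    and n <= ratio * m):
                flagged.add(bc)
                break
    return flagged


# Toy read counts: AAAT is assumed to be an error copy of AAAA,
# while CCGG is assumed to be a rare but genuine barcode.
counts = {"AAAA": 10000, "AAAT": 40, "CCGG": 35, "GGTT": 3}
kept_low = threshold_filter(counts, min_reads=30)   # keeps the error AAAT
kept_high = threshold_filter(counts, min_reads=50)  # drops the genuine CCGG
flagged = flag_likely_errors(counts)
```

In this toy example no single threshold separates the assumed error (40 reads) from the assumed rare true barcode (35 reads), which is the difficulty the abstract points to.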
Umdale, Suraj D; Kshirsagar, Parthraj R; Lekhak, Manoj M; Gaikwad, Nikhil B
2017-07-01
Smithia conferta Sm. is an annual herb widely used in Indian traditional medical practice and commonly known as "Lakshman booti" in Sanskrit. Morphological resemblance among the species of genus Smithia Aiton leads to inaccurate identification and adulteration. This causes inconsistent therapeutic effects and also affects the quality of herbal medicine. This study aimed to generate a potential barcode for authentication of S. conferta and its adulterants through the DNA barcoding technique. Genomic DNA extracted from S. conferta and its adulterants was used as the template for polymerase chain reaction amplification of the barcoding regions. The amplicons were submitted for sequencing, and species identification was conducted using BLASTn and unweighted pair-group method with arithmetic mean trees. In addition, the secondary structures of the internal transcribed spacer (ITS) 2 region were predicted. The nucleotide sequence of ITS provides species-specific single nucleotide polymorphisms and higher sequence divergence (22%) than psbA-trnH (10.9%) and rbcL (3.1%) sequences. The ITS barcode indicates that S. conferta and Smithia sensitiva are closely related compared to other species. ITS is the most applicable barcode for molecular authentication of S. conferta, and further chloroplast barcodes should be tested for phylogenetic analysis of genus Smithia. The present investigation is the first effort to use DNA barcodes for molecular authentication of S. conferta and its adulterants. Also, this study expanded the application of the ITS2 sequence data in authentication. The ITS has been proved to be a potential and reliable candidate barcode for the authentication of S. conferta.
Abbreviations used: BLASTn: Basic Local Alignment Search Tool for Nucleotide; MEGA: Molecular Evolutionary Genetics Analysis; EMBL: European Molecular Biology Laboratory; psbA-trnH: Photosystem II protein D1-structural RNA: His tRNA gene; rbcL: Ribulose 1,5-bisphosphate carboxylase/oxygenase large subunit gene.
rbcL and matK earn two thumbs up as the core DNA barcode for ferns.
Li, Fay-Wei; Kuo, Li-Yaung; Rothfels, Carl J; Ebihara, Atsushi; Chiou, Wen-Liang; Windham, Michael D; Pryer, Kathleen M
2011-01-01
DNA barcoding will revolutionize our understanding of fern ecology, most especially because the accurate identification of the independent but cryptic gametophyte phase of the fern's life history--an endeavor previously impossible--will finally be feasible. In this study, we assess the discriminatory power of the core plant DNA barcode (rbcL and matK), as well as alternatively proposed fern barcodes (trnH-psbA and trnL-F), across all major fern lineages. We also present plastid barcode data for two genera in the hyperdiverse polypod clade--Deparia (Woodsiaceae) and the Cheilanthes marginata group (currently being segregated as a new genus of Pteridaceae)--to further evaluate the resolving power of these loci. Our results clearly demonstrate the value of matK data, previously unavailable in ferns because of difficulties in amplification due to a major rearrangement of the plastid genome. With its high sequence variation, matK complements rbcL to provide a two-locus barcode with strong resolving power. With sequence variation comparable to matK, trnL-F appears to be a suitable alternative barcode region in ferns, and perhaps should be added to the core barcode region if universal primer development for matK fails. In contrast, trnH-psbA shows dramatically reduced sequence variation for the majority of ferns. This is likely due to the translocation of this segment of the plastid genome into the inverted repeat regions, which are known to have a highly constrained substitution rate. Our study provides the first endorsement of the two-locus barcode (rbcL+matK) in ferns, and favors trnL-F over trnH-psbA as a potential back-up locus. Future work should focus on gathering more fern matK sequence data to facilitate universal primer development.
DNA barcodes for dragonflies and damselflies (Odonata) of Mindanao, Philippines.
Casas, Princess Angelie S; Sing, Kong-Wah; Lee, Ping-Shin; Nuñeza, Olga M; Villanueva, Reagan Joseph T; Wilson, John-James
2018-03-01
Reliable species identification provides a sounder basis for use of species in the order Odonata as biological indicators and for their conservation, an urgent concern as many species are threatened with imminent extinction. We generated 134 COI barcodes from 36 morphologically identified species of Odonata collected from Mindanao Island, representing 10 families and 19 genera. Intraspecific sequence divergences ranged from 0 to 6.7% with four species showing more than 2%, while interspecific sequence divergences ranged from 0.5 to 23.3% with seven species showing less than 2%. Consequently, no distinct gap was observed between intraspecific and interspecific DNA barcode divergences. The numerous islands of the Philippine archipelago may have facilitated rapid speciation in the Odonata and resulted in low interspecific sequence divergences among closely related groups of species. This study contributes DNA barcodes for 36 morphologically identified species of Odonata reported from Mindanao including 31 species with no previous DNA barcode records.
Supervised DNA Barcodes species classification: analysis, comparisons and results
2014-01-01
Background Specific fragments from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences) have been defined as DNA Barcodes and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcodes: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with those of ad-hoc and well-established DNA Barcode classification methods. Results Software that converts DNA Barcode FASTA sequences to the Weka format is released, to accommodate different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human-interpretable classification model. Rule-based methods have slightly inferior classification performance, but deliver the species-specific positions and nucleotide assignments. 
On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode classification methods. On empirical data their classification performances are at a comparable level to the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for handling with success the DNA Barcoding species classification problem, obtaining excellent performances. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
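The FASTA-to-Weka conversion step mentioned in this abstract can be sketched in Python as follows. This is a hypothetical converter, not the software released by the authors. It assumes aligned sequences of equal length and a header convention of `>specimen_id|Species name`, both of which are illustrative choices. Each alignment column becomes a nominal ARFF attribute, and any symbol other than A, C, G, or T is written as `?`, the ARFF missing-value marker.

```python
def fasta_to_arff(fasta_text, relation="barcodes"):
    """Convert aligned FASTA text into a Weka ARFF string.

    Assumed header convention (hypothetical): >specimen_id|Species name
    """
    records = []                      # (header, sequence) pairs
    header, chunks = None, []
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks).upper()))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)       # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks).upper()))
    if not records:
        raise ValueError("no FASTA records found")

    length = len(records[0][1])
    if length == 0 or any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be non-empty and aligned to equal length")

    labels = []
    for head, _ in records:
        if "|" not in head:
            raise ValueError(f"header lacks '|species' field: {head!r}")
        labels.append(head.split("|", 1)[1].strip().replace(" ", "_"))

    out = [f"@relation {relation}", ""]
    out += [f"@attribute pos{i} {{A,C,G,T}}" for i in range(1, length + 1)]
    out.append("@attribute species {" + ",".join(sorted(set(labels))) + "}")
    out += ["", "@data"]
    for (_, seq), label in zip(records, labels):
        sites = [c if c in "ACGT" else "?" for c in seq]
        out.append(",".join(sites + [label]))
    return "\n".join(out) + "\n"


example = ">s1|Homo sapiens\nACG\n>s2|Pan troglodytes\nAC\nN\n"
arff = fasta_to_arff(example)
```

Treating each site as a nominal attribute keeps position information available to rule-based learners, which fits the abstract's remark that such methods report species-specific positions and nucleotide assignments.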
Single-Cell RNA Sequencing of Glioblastoma Cells.
Sen, Rajeev; Dolgalev, Igor; Bayin, N Sumru; Heguy, Adriana; Tsirigos, Aris; Placantonakis, Dimitris G
2018-01-01
Single-cell RNA sequencing (sc-RNASeq) is a recently developed technique used to evaluate the transcriptome of individual cells. As opposed to conventional RNASeq in which entire populations are sequenced in bulk, sc-RNASeq can be beneficial when trying to better understand gene expression patterns in markedly heterogeneous populations of cells or when trying to identify transcriptional signatures of rare cells that may be underrepresented when using conventional bulk RNASeq. In this method, we describe the generation and analysis of cDNA libraries from single patient-derived glioblastoma cells using the C1 Fluidigm system. The protocol details the use of the C1 integrated fluidics circuit (IFC) for capturing, imaging and lysing cells; performing reverse transcription; and generating cDNA libraries that are ready for sequencing and analysis.
Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou
2014-07-01
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern, and it has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: user-friendly software with a graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce the search space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
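The two-stage idea described above (a cheap alignment-free screen followed by a K2P nearest-neighbour search) can be sketched as follows. This is an illustration under stated assumptions, not VIP Barcoding's implementation: stage one uses plain k-mer count vectors with cosine similarity as a simplified stand-in for the composition vector method, the reference sequences are assumed to be already aligned to the query, and the names `identify`, `shortlist`, and `k` are hypothetical.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kmer_profile(seq, k=3):
    """Count overlapping k-mers (simplified stand-in for a composition vector)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(p, q):
    """Cosine similarity between two k-mer count profiles."""
    dot = sum(p[key] * q[key] for key in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned sequences."""
    n = transitions = transversions = 0
    for x, y in zip(a, b):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p, q = transitions / n, transversions / n
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # saturated: the K2P correction is undefined
    return -0.5 * math.log(term1 * math.sqrt(term2))


def identify(query, references, shortlist=2, k=3):
    """Stage 1: keep the most similar references; stage 2: K2P nearest neighbour."""
    query_profile = kmer_profile(query.replace("-", ""), k)
    ranked = sorted(
        references,
        key=lambda name: cosine(
            query_profile, kmer_profile(references[name].replace("-", ""), k)
        ),
        reverse=True,
    )
    candidates = ranked[:shortlist]
    return min(candidates, key=lambda name: k2p_distance(query, references[name]))
```

The design point is that the expensive alignment-based distance is computed only for the shortlisted candidates, which is where the scalability gain would come from.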
Sass, Chodon; Little, Damon P.; Stevenson, Dennis Wm.; Specht, Chelsea D.
2007-01-01
Barcodes are short segments of DNA that can be used to uniquely identify an unknown specimen to species, particularly when diagnostic morphological features are absent. These sequences could offer a new forensic tool in plant and animal conservation—especially for endangered species such as members of the Cycadales. Ideally, barcodes could be used to positively identify illegally obtained material even in cases where diagnostic features have been purposefully removed or to release confiscated organisms into the proper breeding population. In order to be useful, a DNA barcode sequence must not only easily PCR amplify with universal or near-universal reaction conditions and primers, but also contain enough variation to generate unique identifiers at either the species or population levels. Chloroplast regions suggested by the Plant Working Group of the Consortium for the Barcode of Life (CBoL), and two alternatives, the chloroplast psbA-trnH intergenic spacer and the nuclear ribosomal internal transcribed spacer (nrITS), were tested for their utility in generating unique identifiers for members of the Cycadales. Ease of amplification and sequence generation with universal primers and reaction conditions was determined for each of the seven proposed markers. While none of the proposed markers provided unique identifiers for all species tested, nrITS showed the most promise in terms of variability, although sequencing difficulties remain a drawback. We suggest a workflow for DNA barcoding, including database generation and management, which will ultimately be necessary if we are to succeed in establishing a universal DNA barcode for plants. PMID:17987130
Loudig, Olivier; Wang, Tao; Ye, Kenny; Lin, Juan; Wang, Yihong; Ramnauth, Andrew; Liu, Christina; Stark, Azadeh; Chitale, Dhananjay; Greenlee, Robert; Multerer, Deborah; Honda, Stacey; Daida, Yihe; Spencer Feigelson, Heather; Glass, Andrew; Couch, Fergus J.; Rohan, Thomas; Ben-Dov, Iddo Z.
2017-01-01
Formalin-fixed paraffin-embedded (FFPE) specimens, when used in conjunction with patient clinical data history, represent an invaluable resource for molecular studies of cancer. Even though nucleic acids extracted from archived FFPE tissues are degraded, their molecular analysis has become possible. In this study, we optimized a laboratory-based next-generation sequencing barcoded cDNA library preparation protocol for analysis of small RNAs recovered from archived FFPE tissues. Using matched fresh and FFPE specimens, we evaluated the robustness and reproducibility of our optimized approach, as well as its applicability to archived clinical specimens stored for up to 35 years. We then evaluated this cDNA library preparation protocol by performing a miRNA expression analysis of archived breast ductal carcinoma in situ (DCIS) specimens, selected for their relation to the risk of subsequent breast cancer development and obtained from six different institutions. Our analyses identified six miRNAs (miR-29a, miR-221, miR-375, miR-184, miR-363, miR-455-5p) differentially expressed between DCIS lesions from women who subsequently developed an invasive breast cancer (cases) and women who did not develop invasive breast cancer within the same time interval (control). Our thorough evaluation and application of this laboratory-based miRNA sequencing analysis indicates that the preparation of small RNA cDNA libraries can reliably be performed on older, archived, clinically-classified specimens. PMID:28335433
Khamis, F M; Rwomushana, I; Ombura, L O; Cook, G; Mohamed, S A; Tanga, C M; Nderitu, P W; Borgemeister, C; Sétamou, M; Grout, T G; Ekesi, S
2017-12-05
Citrus (Citrus spp.) production continues to decline in East Africa, particularly in Kenya and Tanzania, the two major producers in the region. This decline is attributed to pests and diseases including infestation by the African citrus triozid, Trioza erytreae (Del Guercio) (Hemiptera: Triozidae). Besides direct feeding damage by adults and immature stages, T. erytreae is the main vector of 'Candidatus Liberibacter africanus', the causative agent of Greening disease in Africa, closely related to Huanglongbing. This study aimed to generate a novel barcode reference library for T. erytreae in order to use DNA barcoding as a rapid tool for accurate identification of the pest to aid phytosanitary measures. Triozid samples were collected from citrus orchards in Kenya, Tanzania, and South Africa and from alternative host plants. Sequences generated from populations in the study showed very low variability within acceptable ranges of species. All samples analyzed were linked to T. erytreae of GenBank accession number KU517195. Phylogeny of samples in this study and other Trioza reference species was inferred using the Maximum Likelihood method. The phylogenetic tree was paraphyletic with two distinct branches. The first branch had two clusters: 1) cluster of all populations analyzed with GenBank accession of T. erytreae and 2) cluster of all the other GenBank accession of Trioza species analyzed except T. incrustata Percy, 2016 (KT588307.1), T. eugeniae Froggatt (KY294637.1), and T. grallata Percy, 2016 (KT588308.1) that occupied the second branch as outgroups forming sister clade relationships. These results were further substantiated with genetic distance values and principal component analyses. © The Author(s) 2017. Published by Oxford University Press on behalf of Entomological Society of America.
Bergmame, Laura; Huffman, Jane; Cole, Rebecca; Dayanandan, Selvadurai; Tkach, Vasyl; McLaughlin, J. Daniel
2011-01-01
Flukes belonging to Sphaeridiotrema are important parasites of waterfowl, and 2 morphologically similar species Sphaeridiotrema globulus and Sphaeridiotrema pseudoglobulus, have been implicated in waterfowl mortality in North America. Cytochrome oxidase I (barcode region) and partial LSU-rDNA sequences from specimens of S. globulus and S. pseudoglobulus, obtained from naturally and experimentally infected hosts from New Jersey and Quebec, respectively, confirmed that these species were distinct. Barcode sequences of the 2 species differed at 92 of 590 nucleotide positions (15.6%) and the translated sequences differed by 13 amino acid residues. Partial LSU-rDNA sequences differed at 29 of 1,208 nucleotide positions (2.4%). Additional barcode sequences from specimens collected from waterfowl in Wisconsin and Minnesota and morphometric data obtained from specimens acquired along the north shore of Lake Superior revealed the presence of S. pseudoglobulus in these areas. Although morphometric data suggested the presence of S. globulus in the Lake Superior sample, it was not found among the specimens sequenced from Wisconsin or Minnesota.
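The percentages quoted above are uncorrected proportions of differing sites (p-distances). As a small worked check, the sketch below recomputes them from the counts given in the abstract and shows how the same quantity would be obtained from two aligned sequences; the function names are illustrative and the toy sequences in the usage note are made up.

```python
def percent_difference(differing_sites, total_sites):
    """Uncorrected p-distance expressed as a percentage."""
    return 100.0 * differing_sites / total_sites


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    compared = differing = 0
    for x, y in zip(a, b):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Counts reported in the abstract above.
coi = percent_difference(92, 590)    # barcode region
lsu = percent_difference(29, 1208)   # partial LSU-rDNA
print(f"COI: {coi:.1f}%  LSU: {lsu:.1f}%")
```

For example, `p_distance("ACGTAC", "ACGTTC")` compares six sites and finds one difference, giving 1/6.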
Bergmame, L.; Huffman, J.; Cole, R.; Dayanandan, S.; Tkach, V.; McLaughlin, J.D.
2011-01-01
Flukes belonging to Sphaeridiotrema are important parasites of waterfowl, and 2 morphologically similar species Sphaeridiotrema globulus and Sphaeridiotrema pseudoglobulus, have been implicated in waterfowl mortality in North America. Cytochrome oxidase I (barcode region) and partial LSU-rDNA sequences from specimens of S. globulus and S. pseudoglobulus, obtained from naturally and experimentally infected hosts from New Jersey and Quebec, respectively, confirmed that these species were distinct. Barcode sequences of the 2 species differed at 92 of 590 nucleotide positions (15.6%) and the translated sequences differed by 13 amino acid residues. Partial LSU-rDNA sequences differed at 29 of 1,208 nucleotide positions (2.4%). Additional barcode sequences from specimens collected from waterfowl in Wisconsin and Minnesota and morphometric data obtained from specimens acquired along the north shore of Lake Superior revealed the presence of S. pseudoglobulus in these areas. Although morphometric data suggested the presence of S. globulus in the Lake Superior sample, it was not found among the specimens sequenced from Wisconsin or Minnesota. © 2011 American Society of Parasitologists.
Chen, Rui; Jiang, Li-Yun; Qiao, Ge-Xia
2012-01-01
The mitochondrial gene COI has been widely used by taxonomists as a standard DNA barcode sequence for the identification of many animal species. However, the COI region is of limited use for identifying certain species and is not efficiently amplified by PCR in all animal taxa. To evaluate the utility of COI as a DNA barcode and to identify other barcode genes, we chose the aphid subfamily Lachninae (Hemiptera: Aphididae) as the focus of our study. We compared the results obtained using COI with two other mitochondrial genes, COII and Cytb. In addition, we propose a new method to improve the efficiency of species identification using DNA barcoding. Three mitochondrial genes (COI, COII and Cytb) were sequenced and were used in the identification of over 80 species of Lachninae. The COI and COII genes demonstrated a greater PCR amplification efficiency than Cytb. Species identification using COII sequences had a higher frequency of success (96.9% in "best match" and 90.8% in "best close match") and yielded lower intra- and higher interspecific genetic divergence values than the other two markers. The use of "tag barcodes" is a new approach that involves attaching a species-specific tag to the standard DNA barcode. With this method, the "barcoding overlap" can be nearly eliminated. As a result, we were able to increase the identification success rate from 83.9% to 95.2% by using COI and the "best close match" technique. A COII-based identification system should be more effective in identifying lachnine species than COI or Cytb. However, the Cytb gene is an effective marker for the study of aphid population genetics due to its high sequence diversity. Furthermore, the use of "tag barcodes" can improve the accuracy of DNA barcoding identification by reducing or removing the overlap between intra- and inter-specific genetic divergence values.
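The "best match" and "best close match" criteria named above can be sketched as a simple decision rule. The sketch assumes the definitions commonly used in the barcoding literature (assign the query to the species of its closest reference; under "best close match", refuse to assign when even the closest reference is farther than a distance threshold); whether the study applied exactly these rules, and the threshold value shown, are assumptions. The species names are placeholders.

```python
def assign(hits, threshold=None):
    """Assign a query from a list of (species, distance) reference hits.

    threshold=None gives "best match"; a numeric threshold gives
    "best close match". Returns a species name, "ambiguous" when equally
    close hits belong to different species, or "no match".
    """
    if not hits:
        return "no match"
    best = min(distance for _, distance in hits)
    if threshold is not None and best > threshold:
        return "no match"
    closest_species = {species for species, distance in hits if distance == best}
    if len(closest_species) == 1:
        return closest_species.pop()
    return "ambiguous"


hits = [("species_A", 0.004), ("species_B", 0.020)]
print(assign(hits))                   # best match
print(assign(hits, threshold=0.001))  # best close match, hypothetical threshold
```

Because "best close match" can return "no match", its success rate can never exceed that of "best match" on the same data, which is consistent with the lower figure reported for it above.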
Jeremy R. deWaard; Andrew Mitchell; Melody A. Keena; David Gopurenko; Laura M. Boykin; Karen F. Armstrong; Michael G. Pogue; Joao Lima; Robin Floyd; Robert H. Hanner; Leland M. Humble
2010-01-01
This study demonstrates the efficacy of DNA barcodes for diagnosing species of Lymantria and reinforces the view that the approach is an under-utilized resource with substantial potential for biosecurity and surveillance. Biomonitoring agencies currently employing the NB restriction digest system would gather more information by transitioning to the...
Watermarking spot colors in packaging
NASA Astrophysics Data System (ADS)
Reed, Alastair; Filler, Tomáš; Falkenstern, Kristyn; Bai, Yang
2015-03-01
In January 2014, Digimarc announced Digimarc® Barcode for the packaging industry to improve the check-out efficiency and customer experience for retailers. Digimarc Barcode is a machine readable code that carries the same information as a traditional Universal Product Code (UPC) and is introduced by adding a robust digital watermark to the package design. It is imperceptible to the human eye but can be read by a modern barcode scanner at the Point of Sale (POS) station. Compared to a traditional linear barcode, Digimarc Barcode covers the whole package with minimal impact on the graphic design. This significantly improves the Items per Minute (IPM) metric, which retailers use to track the checkout efficiency since it closely relates to their profitability. Increasing IPM by a few percent could lead to potential savings of millions of dollars for retailers, giving them a strong incentive to add the Digimarc Barcode to their packages. Testing performed by Digimarc showed increases in IPM of at least 33% using the Digimarc Barcode, compared to using a traditional barcode. A method of watermarking print ready image data used in the commercial packaging industry is described. A significant proportion of packages are printed using spot colors; therefore, spot colors need to be supported by an embedder for Digimarc Barcode. Digimarc Barcode supports the PANTONE spot color system, which is commonly used in the packaging industry. The Digimarc Barcode embedder allows a user to insert the UPC code in an image while minimizing perceptibility to the Human Visual System (HVS). The Digimarc Barcode is inserted in the printing ink domain, using an Adobe Photoshop plug-in as the last step before printing. Since Photoshop is an industry standard widely used by pre-press shops in the packaging industry, a Digimarc Barcode can be easily inserted and proofed.
NASA Astrophysics Data System (ADS)
Gebhardt, Katharina; Knebelsberger, Thomas
2015-09-01
We morphologically analyzed 79 cephalopod specimens from the North and Baltic Seas belonging to 13 separate species. Another 29 specimens showed morphological features of either Alloteuthis media or Alloteuthis subulata or were found to be in between. Reliable identification features to distinguish between A. media and A. subulata are currently not available. The analysis of the DNA barcoding region of the COI gene revealed intraspecific distances (uncorrected p) ranging from 0 to 2.13 % (average 0.1 %) and interspecific distances between 3.31 and 22 % (average 15.52 %). All species formed monophyletic clusters in a neighbor-joining analysis and were supported by bootstrap values of ≥99 %. All COI haplotypes belonging to the 29 Alloteuthis specimens were grouped in one cluster. Neither COI nor 18S rDNA sequences helped to distinguish between the different Alloteuthis morphotypes. For species identification purposes, we recommend the use of COI, as it showed higher bootstrap support of species clusters and less amplification and sequencing failure compared to 18S. Our data strongly support the assumption that the genus Alloteuthis is only represented by a single species, at least in the North Sea. It remained unclear whether this species is A. subulata or A. media. All COI sequences including important metadata were uploaded to the Barcode of Life Data Systems and can be used as reference library for the molecular identification of more than 50 % of the cephalopod fauna known from the North and Baltic Seas.
Quantitative Tracking of Combinatorially Engineered Populations with Multiplexed Binary Assemblies.
Zeitoun, Ramsey I; Pines, Gur; Grau, William C; Gill, Ryan T
2017-04-21
Advances in synthetic biology and genomics have enabled full-scale genome engineering efforts on laboratory time scales. However, the absence of sufficient approaches for mapping engineered genomes at system-wide scales onto performance has limited the adoption of more sophisticated algorithms for engineering complex biological systems. Here we report on the development and application of a robust approach to quantitatively map combinatorially engineered populations at scales up to several dozen target sites. This approach works by assembling genome engineered sites with cell-specific barcodes into a format compatible with high-throughput sequencing technologies. This approach, called barcoded-TRACE (bTRACE) was applied to assess E. coli populations engineered by recursive multiplex recombineering across both 6-target sites and 31-target sites. The 31-target library was then tracked throughout growth selections in the presence and absence of isopentenol (a potential next-generation biofuel). We also use the resolution of bTRACE to compare the influence of technical and biological noise on genome engineering efforts.
Mobberley, Jennifer M; Ortega, Maya C; Foster, Jamie S
2012-01-01
Thrombolites are unlaminated carbonate structures that form as a result of the metabolic interactions of complex microbial mat communities. Thrombolites have a long geological history; however, little is known regarding the microbes associated with modern structures. In this study, we use a barcoded 16S rRNA gene-pyrosequencing approach coupled with morphological analysis to assess the bacterial, cyanobacterial and archaeal diversity associated with actively forming thrombolites found in Highborne Cay, Bahamas. Analyses revealed four distinct microbial mat communities referred to as black, beige, pink and button mats on the surfaces of the thrombolites. At a coarse phylogenetic resolution, the domain bacterial sequence libraries from the four mats were similar, with Proteobacteria and Cyanobacteria being the most abundant. At the finer resolution of the rRNA gene sequences, significant differences in community structure were observed, with dramatically different cyanobacterial communities. Of the four mat types, the button mats contained the highest diversity of Cyanobacteria, and were dominated by two sequence clusters with high similarity to the genus Dichothrix, an organism associated with the deposition of carbonate. Archaeal diversity was low, but varied in all mat types, and the archaeal community was predominately composed of members of the Thaumarchaeota and Euryarchaeota. The morphological and genetic data support the hypothesis that the four mat types are distinctive thrombolitic mat communities. © 2011 Society for Applied Microbiology and Blackwell Publishing Ltd.
de Groot, G. Arjen; During, Heinjo J.; Maas, Jan W.; Schneider, Harald; Vogel, Johannes C.; Erkens, Roy H. J.
2011-01-01
Although consensus has now been reached on a general two-locus DNA barcode for land plants, the selected combination of markers (rbcL + matK) is not applicable for ferns at the moment. Yet especially for ferns, DNA barcoding is potentially of great value since fern gametophytes, while playing an essential role in fern colonization and reproduction, generally lack the morphological complexity for morphology-based identification and have therefore been underappreciated in ecological studies. We evaluated the potential of a combination of rbcL with a noncoding plastid marker, trnL-F, to obtain DNA-identifications for fern species. A regional approach was adopted, by creating a reference database of trusted rbcL and trnL-F sequences for the wild-occurring homosporous ferns of NW-Europe. A combination of parsimony analyses and distance-based analyses was performed to evaluate the discriminatory power of the two-region barcode. DNA was successfully extracted from 86 tiny fern gametophytes and was used as a test case for the performance of DNA-based identification. Primer universality proved high for both markers. Based on the combined rbcL + trnL-F dataset, all genera as well as all species with non-equal chloroplast genomes formed their own well supported monophyletic clade, indicating a high discriminatory power. Interspecific distances were larger than intraspecific distances for all tested taxa. Identification tests on gametophytes showed a comparable result. All test samples could be identified to genus level, and species identification was possible unless they belonged to a pair of Dryopteris species with completely identical chloroplast genomes. Our results suggest a high potential of the combined use of rbcL and trnL-F as a two-locus cpDNA barcode for identification of fern species. A regional approach may be preferred for ecological tests. 
We here offer such a ready-to-use barcoding approach for ferns, which opens the way for answering a whole range of questions previously unaddressed in fern gametophyte ecology. PMID:21298108
DNA Barcoding of Marine Metazoa
NASA Astrophysics Data System (ADS)
Bucklin, Ann; Steinke, Dirk; Blanco-Bercial, Leocadio
2011-01-01
More than 230,000 known species representing 31 metazoan phyla populate the world's oceans. Perhaps another 1,000,000 or more species remain to be discovered. There is reason for concern that species extinctions may outpace discovery, especially in diverse and endangered marine habitats such as coral reefs. DNA barcodes (i.e., short DNA sequences for species recognition and discrimination) are useful tools to accelerate species-level analysis of marine biodiversity and to facilitate conservation efforts. This review focuses on the usual barcode region for metazoans: a ~648 base-pair region of the mitochondrial cytochrome c oxidase subunit I (COI) gene. Barcodes have also been used for population genetic and phylogeographic analysis, identification of prey in gut contents, detection of invasive species, forensics, and seafood safety. More controversially, barcodes have been used to delimit species boundaries, reveal cryptic species, and discover new species. Emerging frontiers are the use of barcodes for rapid and increasingly automated biodiversity assessment by high-throughput sequencing, including environmental barcoding and the use of barcodes to detect species for which formal identification or scientific naming may never be possible.
Tropical Plant–Herbivore Networks: Reconstructing Species Interactions Using DNA Barcodes
García-Robledo, Carlos; Erickson, David L.; Staines, Charles L.; Erwin, Terry L.; Kress, W. John
2013-01-01
Plants and their associated insect herbivores represent more than 50% of all known species on earth. The first step in understanding the mechanisms generating and maintaining this important component of biodiversity is to identify plant-herbivore associations. In this study we determined insect-host plant associations for an entire guild of insect herbivores using plant DNA extracted from insect gut contents. Over two years, in a tropical rain forest in Costa Rica (La Selva Biological Station), we recorded the full diet breadth of rolled-leaf beetles, a group of herbivores that feed on plants in the order Zingiberales. Field observations were used to determine the accuracy of diet identifications using a three-locus DNA barcode (rbcL, trnH-psbA and ITS2). Using extraction techniques for ancient DNA, we obtained high-quality sequences for two of these loci from gut contents (rbcL and ITS2). Sequences were then compared to a comprehensive DNA barcode library of the Zingiberales. The rbcL locus identified host plants to family (success/sequence = 58.8%) and genus (success/sequence = 47%). For all Zingiberales except Heliconiaceae, ITS2 successfully identified host plants to genus (success/sequence = 67.1%) and species (success/sequence = 61.6%). Kindt’s sampling estimates suggest that by collecting ca. four individuals representing each plant-herbivore interaction, 99% of all host associations included in this study can be identified to genus. For plants that amplified ITS2, 99% of the hosts can be identified to species after collecting at least four individuals representing each interaction. Our study demonstrates that host plant identifications at the species-level using DNA barcodes are feasible, cost-effective, and reliable, and that reconstructing plant-herbivore networks with these methods will become the standard for a detailed understanding of these interactions. PMID:23308128
The Barcode of Life Data Portal: Bridging the Biodiversity Informatics Divide for DNA Barcoding
Sarkar, Indra Neil; Trizna, Michael
2011-01-01
With the volume of molecular sequence data that is systematically being generated globally, there is a need for centralized resources for data exploration and analytics. DNA Barcode initiatives are on track to generate a compendium of molecular sequence–based signatures for identifying animals and plants. To date, the range of available data exploration and analytic tools to explore these data have only been available in a boutique form—often representing a frustrating hurdle for many researchers that may not necessarily have resources to install or implement algorithms described by the analytic community. The Barcode of Life Data Portal (BDP) is a first step towards integrating the latest biodiversity informatics innovations with molecular sequence data from DNA barcoding. Through establishment of community driven standards, based on discussion with the Data Analysis Working Group (DAWG) of the Consortium for the Barcode of Life (CBOL), the BDP provides an infrastructure for incorporation of existing and next-generation DNA barcode analytic applications in an open forum. PMID:21818249
Tropical montane nymphalids in Mexico: DNA barcodes reveal greater diversity.
Escalante, Patricia; Ibarra-Vazquez, Adolfo; Rosas-Escobar, Patricia
2010-12-01
DNA sequences obtained for the Barcode of Life library in the All Lepidoptera Campaign project Nymphalidae of Central Mexico were analyzed as a test of species limits and to explore possible phylogenetic groupings in the Preponini tribe. Using specimens in the National Insect Collection of the Instituto de Biología of the Universidad Nacional Autónoma de México, 78 specimens were assayed for cytochrome oxidase c subunit 1. Disregarding the missing data, there were 458 conserved sites, 200 variable sites and 187 parsimony-informative sites. The neighbor-joining and maximum likelihood analyses indicate that none of the three genera of Preponini as currently circumscribed are reciprocally monophyletic. As per species limits, high levels of barcode variation in the Prepona deiphile complex suggest the existence of at least two new endemic species to Mexico. The divergent taxa were escalantiana from the Tuxtlas region in Veracruz, and ibarra from Sierra Madre del Sur in the Pacific states of southern Mexico. The genetic distance in the CO1 fragment between them and the other deiphile populations ranged from 2.7 to 8.0%. We recommend that morphological data need to be re-examined and that additional molecular data for species ought to be gathered before a particular biogeographic model can be proposed for the group in Mesoamerica.
Kress, W John; Erickson, David L
2007-06-06
A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination.
Yu, Ning; Wei, Yu-Long; Zhang, Xin; Zhu, Ning; Wang, Yan-Li; Zhu, Yue; Zhang, Hai-Ping; Li, Fen-Mei; Yang, Lan; Sun, Jia-Qi; Sun, Ai-Dong
2017-07-11
Trachelospermum jasminoides is commonly used in traditional Chinese medicine. However, the use of the plant's local alternatives is frequent, causing potential clinical problems. The T. jasminoides sold in the medicine market is commonly dried and sliced, making traditional identification methods difficult. In this study, the ITS2 region was evaluated on 127 sequences representing T. jasminoides and its local alternatives according to PCR and sequencing rates, intra- and inter-specific divergences, secondary structure, and discrimination capacity. Results indicated 100% success rates of PCR and sequencing and the obvious presence of a barcoding gap. Results of BLAST 1, nearest distance and neighbor-joining tree methods showed that barcode ITS2 could successfully identify all the tested samples. The secondary structures of the ITS2 region provided another dimension for species identification. Two-dimensional images were obtained for better and easier identification. Previous studies on DNA barcoding concentrated more on the same family, genus, or species. However, an ideal barcode should be variable enough to identify closely related species. Meanwhile, the barcodes should also be conservative in identifying distantly related species. This study highlights the application of barcode ITS2 in solving practical problems in the distantly related local alternatives of medicinal plants.
Kim, W J; Ji, Y; Choi, G; Kang, Y M; Yang, S; Moon, B C
2016-08-05
This study was performed to identify and analyze the phylogenetic relationship among four herbaceous species of the genus Paeonia, P. lactiflora, P. japonica, P. veitchii, and P. suffruticosa, using DNA barcodes. These four species, which are commonly used in traditional medicine as Paeoniae Radix and Moutan Radicis Cortex, are pharmaceutically defined in different ways in the national pharmacopoeias in Korea, Japan, and China. To authenticate the different species used in these medicines, we evaluated rDNA-internal transcribed spacers (ITS), matK and rbcL regions, which provide information capable of effectively distinguishing each species from one another. Seventeen samples were collected from different geographic regions in Korea and China, and DNA barcode regions were amplified using universal primers. Comparative analyses of these DNA barcode sequences revealed species-specific nucleotide sequences capable of discriminating the four Paeonia species. Among the entire sequences of three barcodes, marker nucleotides were identified at three positions in P. lactiflora, eleven in P. japonica, five in P. veitchii, and 25 in P. suffruticosa. Phylogenetic analyses also revealed four distinct clusters showing homogeneous clades with high resolution at the species level. The results demonstrate that the analysis of these three DNA barcode sequences is a reliable method for identifying the four Paeonia species and can be used to authenticate Paeoniae Radix and Moutan Radicis Cortex at the species level. Furthermore, based on the assessment of amplicon sizes, inter/intra-specific distances, marker nucleotides, and phylogenetic analysis, rDNA-ITS was the most suitable DNA barcode for identification of these species.
Lopes-Andrade, Cristiano; Grebennikov, Vasily V
2015-08-25
We report the first record of the beetle tribe Xylographellini (Ciidae) from the continental Palaearctic Region, represented by five new species discovered in Yunnan and Sichuan provinces, China: Scolytocis danae sp. nov., Syncosmetus euryale sp. nov., Sync. medusa sp. nov., Sync. perseus sp. nov. and Sync. stheno sp. nov. Illustrations and identification keys are provided for these new species, and in order to facilitate further research of Ciidae we present an open-access DNA barcode library (dx.doi.org/10.5883/DS-SYNCOSM) containing 114 records (of 44 species in 14 genera), 15 of which belong to the newly described species. A phylogenetic analysis based on the barcode fragment of the cytochrome oxidase I gene did not recover much tree structure within Ciidae, however both Xylographus Mellié and Syncosmetus Sharp were recovered as clades, with a single Scolytocis Blair being the sister to the latter.
Mini-DNA barcode in identification of the ornamental fish: A case study from Northeast India.
Dhar, Bishal; Ghosh, Sankar Kumar
2017-09-05
Ornamental fishes are exported under trade names or generic names, which creates problems in species identification. In this regard, DNA barcoding could effectively elucidate the actual species status. However, problems arise if the specimen is subject to taxonomic disputes, falsified by trade/generic names, etc. On the other hand, barcoding archival museum specimens would be of greater benefit in addressing such issues, as it would create a firm, error-free reference database for rapid identification of any species. This can be achieved only by generating short sequences, as DNA from chemically preserved specimens is mostly degraded. Here we aimed to identify a short stretch of informative sites within the full-length barcode segment, capable of delineating a diverse group of ornamental fish species commonly traded from NE India. We analyzed 287 full-length barcode sequences from the major fish orders, compared the interspecific K2P distance with nucleotide substitution patterns, and found a strong correlation of interspecies distance with transversions (0.95, p<0.001). We therefore proposed a short, transversion-rich 171-bp segment as a mini-barcode. The proposed segment was compared with the full-length barcodes and found to delineate the species effectively. Successful PCR amplification and sequencing of the 171-bp segment using designed primers for different orders validated it as a mini-barcode for ornamental fishes. Thus, our findings would be helpful in strengthening the global database with the sequences of archived fish species, and as an effective identification tool for traded ornamental fish species in a less time-consuming, cost-effective, field-based application.
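The K2P (Kimura 2-parameter) distance mentioned in this abstract treats transitions and transversions separately. The following is a minimal sketch of the standard formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transition and transversion sites. The function name and the handling of gaps are illustrative assumptions, not the authors' pipeline.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (an assumption, other tools may treat them differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError for saturated (very divergent) pairs.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For identical sequences the distance is zero, and it grows faster than the raw proportion of differing sites as divergence increases.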
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.
Sucher, Nikolaus J; Hennell, James R; Carles, Maria C
2012-01-01
DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
NASA Astrophysics Data System (ADS)
Bucklin, Ann; Ortman, Brian D.; Jennings, Robert M.; Nigro, Lisa M.; Sweetman, Christopher J.; Copley, Nancy J.; Sutton, Tracey; Wiebe, Peter H.
2010-12-01
Species diversity of the metazoan holozooplankton assemblage of the Sargasso Sea, Northwest Atlantic Ocean, was examined through coordinated morphological taxonomic identification of species and DNA sequencing of a ~650 base-pair region of mitochondrial cytochrome oxidase I (mtCOI) as a DNA barcode (i.e., short sequence for species recognition and discrimination). Zooplankton collections were made from the surface to 5,000 meters during April 2006 on the R/V R.H. Brown. Samples were examined by a ship-board team of morphological taxonomists; DNA barcoding was carried out in both ship-board and land-based DNA sequencing laboratories. DNA barcodes were determined for a total of 297 individuals of 175 holozooplankton species in four phyla: Cnidaria (Hydromedusae, 4 species; Siphonophora, 47); Arthropoda (Amphipoda, 10; Copepoda, 34; Decapoda, 9; Euphausiacea, 10; Mysidacea, 1; Ostracoda, 27); Mollusca (Cephalopoda, 8; Heteropoda, 6; Pteropoda, 15); and Chaetognatha (4). Thirty species of fish (Teleostei) were also barcoded. For all seven zooplankton groups for which sufficient data were available, Kimura-2-Parameter genetic distances were significantly lower between individuals of the same species (mean=0.0114; S.D. 0.0117) than between individuals of different species within the same group (mean=0.3166; S.D. 0.0378). This difference, known as the barcode gap, ensures that mtCOI sequences are reliable characters for species identification for the oceanic holozooplankton assemblage. In addition, DNA barcodes allow recognition of new or undescribed species, reveal cryptic species within known taxa, and inform phylogeographic and population genetic studies of geographic variation. The growing database of "gold standard" DNA barcodes serves as a Rosetta Stone for marine zooplankton, providing the key for decoding species diversity by linking species names, morphology, and DNA sequence variation.
In light of the pivotal position of zooplankton in ocean food webs, their usefulness as rapid responders to environmental change, and the increasing scarcity of taxonomists, the use of DNA barcodes is an important and useful approach for rapid analysis of species diversity and distribution in the pelagic community.
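The barcode gap described in this abstract compares within-species and between-species distances. A minimal sketch of such a comparison follows; the function name, the input layout, and the toy distances are all hypothetical. The abstract reports means, whereas the `has_gap` flag here uses the stricter criterion that the largest within-species distance is smaller than the smallest between-species distance.

```python
from itertools import combinations
from statistics import mean


def barcode_gap_summary(distances, species):
    """Summarize intra- vs interspecific pairwise distances.

    distances: dict mapping frozenset({id_a, id_b}) -> pairwise distance
    species:   dict mapping specimen id -> species name
    (Both layouts are assumptions made for this sketch.)
    """
    intra, inter = [], []
    for a, b in combinations(sorted(species), 2):
        d = distances[frozenset((a, b))]
        if species[a] == species[b]:
            intra.append(d)
        else:
            inter.append(d)
    if not intra or not inter:
        raise ValueError("need within-species and between-species pairs")
    return {
        "mean_intra": mean(intra),
        "mean_inter": mean(inter),
        "max_intra": max(intra),
        "min_inter": min(inter),
        "has_gap": max(intra) < min(inter),
    }


# Toy data (invented values, not from the study).
toy_species = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
toy_distances = {
    frozenset(("s1", "s2")): 0.01,
    frozenset(("s3", "s4")): 0.02,
    frozenset(("s1", "s3")): 0.30,
    frozenset(("s1", "s4")): 0.31,
    frozenset(("s2", "s3")): 0.32,
    frozenset(("s2", "s4")): 0.33,
}
summary = barcode_gap_summary(toy_distances, toy_species)
```

In practice the pairwise distances would come from a model such as K2P applied to aligned mtCOI sequences.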
Heinrichs, Guido; de Hoog, G. Sybren
2012-01-01
Herpotrichiellaceous black yeasts and relatives comprise severe pathogens flanked by nonpathogenic environmental siblings. Reliable identification by conventional methods is notoriously difficult. Molecular identification is hampered by the sequence variability in the internal transcribed spacer (ITS) domain caused by difficult-to-sequence homopolymeric regions and by poor taxonomic attribution of sequences deposited in GenBank. Here, we present a potential solution using short barcode identifiers (27 to 50 bp) based on ITS2 ribosomal DNA (rDNA), which allows unambiguous definition of species-specific fragments. Starting from proven sequences of ex-type and authentic strains, we were able to describe 103 identifiers. Multiple BLAST searches of these proposed barcode identifiers in GenBank revealed uniqueness for 100 taxonomic entities, whereas the three remaining identifiers each matched with two entities, but the species of these identifiers could easily be discriminated by differences in the remaining ITS regions. Using the proposed barcode identifiers, a 4.1-fold increase of 100% matches in GenBank was achieved in comparison to the classical approach using the complete ITS sequences. The proposed barcode identifiers will be made accessible for the diagnostic laboratory in a permanently updated online database, thereby providing a highly practical, reliable, and cost-effective tool for identification of clinically important black yeasts and relatives. PMID:22785187
Testing the Efficacy of DNA Barcodes for Identifying the Vascular Plants of Canada.
Braukmann, Thomas W A; Kuzmina, Maria L; Sills, Jesse; Zakharov, Evgeny V; Hebert, Paul D N
2017-01-01
Their relatively slow rates of molecular evolution, as well as frequent exposure to hybridization and introgression, often make it difficult to discriminate species of vascular plants with the standard barcode markers (rbcL, matK, ITS2). Previous studies have examined these constraints in narrow geographic or taxonomic contexts, but the present investigation expands analysis to consider the performance of these gene regions in discriminating the species in local floras at sites across Canada. To test identification success, we employed a DNA barcode reference library with sequence records for 96% of the 5108 vascular plant species known from Canada, but coverage varied from 94% for rbcL to 60% for ITS2 and 39% for matK. Using plant lists from 27 national parks and one scientific reserve, we tested the efficacy of DNA barcodes in identifying the plants in simulated species assemblages from six biogeographic regions of Canada using BLAST and mothur. Mean pairwise distance (MPD) and mean nearest taxon distance (MNTD) were strong predictors of barcode performance for different plant families and genera, and both metrics supported ITS2 as possessing the highest genetic diversity. All three genes performed strongly in assigning the taxa present in local floras to the correct genus with values ranging from 91% for rbcL to 97% for ITS2 and 98% for matK. However, matK delivered the highest species discrimination (~81%) followed by ITS2 (~72%) and rbcL (~44%). Despite the low number of plant taxa in the Canadian Arctic, DNA barcodes had the least success in discriminating species from this biogeographic region with resolution ranging from 36% with rbcL to 69% with matK. Species resolution was higher in the other settings, peaking in the Woodland region at 52% for rbcL and 87% for matK. 
Our results indicate that DNA barcoding is very effective in identifying Canadian plants to a genus, and that it performs well in discriminating species in regions where floristic diversity is highest. PMID:28072819
rbcL and matK Earn Two Thumbs Up as the Core DNA Barcode for Ferns
Li, Fay-Wei; Kuo, Li-Yaung; Rothfels, Carl J.; Ebihara, Atsushi; Chiou, Wen-Liang; Windham, Michael D.; Pryer, Kathleen M.
2011-01-01
Background: DNA barcoding will revolutionize our understanding of fern ecology, most especially because the accurate identification of the independent but cryptic gametophyte phase of the fern's life history (an endeavor previously impossible) will finally be feasible. In this study, we assess the discriminatory power of the core plant DNA barcode (rbcL and matK), as well as alternatively proposed fern barcodes (trnH-psbA and trnL-F), across all major fern lineages. We also present plastid barcode data for two genera in the hyperdiverse polypod clade, Deparia (Woodsiaceae) and the Cheilanthes marginata group (currently being segregated as a new genus of Pteridaceae), to further evaluate the resolving power of these loci. Principal Findings: Our results clearly demonstrate the value of matK data, previously unavailable in ferns because of difficulties in amplification due to a major rearrangement of the plastid genome. With its high sequence variation, matK complements rbcL to provide a two-locus barcode with strong resolving power. With sequence variation comparable to matK, trnL-F appears to be a suitable alternative barcode region in ferns, and perhaps should be added to the core barcode region if universal primer development for matK fails. In contrast, trnH-psbA shows dramatically reduced sequence variation for the majority of ferns. This is likely due to the translocation of this segment of the plastid genome into the inverted repeat regions, which are known to have a highly constrained substitution rate. Conclusions: Our study provides the first endorsement of the two-locus barcode (rbcL+matK) in ferns, and favors trnL-F over trnH-psbA as a potential back-up locus. Future work should focus on gathering more fern matK sequence data to facilitate universal primer development. PMID:22028918
Kress, W. John; Erickson, David L.
2007-01-01
Background: A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than those observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species level. Methodology/Principal Findings: Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. Conclusions/Significance: A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination. PMID:17551588
De Ley, Paul; De Ley, Irma Tandingan; Morris, Krystalynne; Abebe, Eyualem; Mundo-Ocampo, Manuel; Yoder, Melissa; Heras, Joseph; Waumann, Dora; Rocha-Olivares, Axayácatl; Jay Burr, A.H; Baldwin, James G; Thomas, W. Kelley
2005-01-01
Molecular surveys of meiofaunal diversity face some interesting methodological challenges when it comes to interstitial nematodes from soils and sediments. Morphology-based surveys are greatly limited in processing speed, while barcoding approaches for nematodes are hampered by difficulties of matching sequence data with traditional taxonomy. Intermediate technology is needed to bridge the gap between both approaches. An example of such technology is video capture and editing microscopy, which consists of the recording of taxonomically informative multifocal series of microscopy images as digital video clips. The integration of multifocal imaging with sequence analysis of the D2D3 region of large subunit (LSU) rDNA is illustrated here in the context of a combined morphological and barcode sequencing survey of marine nematodes from Baja California and California. The resulting video clips and sequence data are made available online in the database NemATOL (http://nematol.unh.edu/). Analyses of 37 barcoded nematodes suggest that these represent at least 32 species, none of which matches available D2D3 sequences in public databases. The recorded multifocal vouchers allowed us to identify most specimens to genus, and will be used to match specimens with subsequent species identifications and descriptions of preserved specimens. Like molecular barcodes, multifocal voucher archives are part of a wider effort at structuring and changing the process of biodiversity discovery. We argue that data-rich surveys and phylogenetic tools for analysis of barcode sequences are an essential component of the exploration of phyla with a high fraction of undiscovered species. Our methods are also directly applicable to other meiofauna such as for example gastrotrichs and tardigrades. PMID:16214752
de Muinck, Eric J; Trosvik, Pål; Gilfillan, Gregor D; Hov, Johannes R; Sundaram, Arvind Y M
2017-07-06
Advances in sequencing technologies and bioinformatics have made the analysis of microbial communities almost routine. Nonetheless, the need remains to improve on the techniques used for gathering such data, including increasing throughput while lowering cost and benchmarking the techniques so that potential sources of bias can be better characterized. We present a triple-index amplicon sequencing strategy to sequence large numbers of samples at significantly lower cost and in a shorter timeframe compared to existing methods. The design employs a two-stage PCR protocol, incorporating three barcodes to each sample, with the possibility to add a fourth index. It also includes heterogeneity spacers to overcome low complexity issues faced when sequencing amplicons on Illumina platforms. The library preparation method was extensively benchmarked through analysis of a mock community in order to assess biases introduced by sample indexing, number of PCR cycles, and template concentration. We further evaluated the method through re-sequencing of a standardized environmental sample. Finally, we evaluated our protocol on a set of fecal samples from a small cohort of healthy adults, demonstrating good performance in a realistic experimental setting. Between-sample variation was mainly related to batch effects, such as DNA extraction, while sample indexing was also a significant source of bias. PCR cycle number strongly influenced chimera formation and affected relative abundance estimates of species with high GC content. Libraries were sequenced using the Illumina HiSeq and MiSeq platforms to demonstrate that this protocol is highly scalable to sequence thousands of samples at a very low cost. Here, we provide the most comprehensive study of performance and bias inherent to a 16S rRNA gene amplicon sequencing method to date.
Triple-indexing greatly reduces the number of long custom DNA oligos required for library preparation, while the inclusion of variable length heterogeneity spacers minimizes the need for PhiX spike-in. This design results in a significant cost reduction of highly multiplexed amplicon sequencing. The biases we characterize highlight the need for highly standardized protocols. Reassuringly, we find that the biological signal is a far stronger structuring factor than the various sources of bias.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nierman, William C.
At TIGR, the human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp, for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb of contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
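The figures reported in this record can be cross-checked with simple arithmetic. The sketch below uses hypothetical helper names and only the numbers quoted above (141 Mb total, a read length of about 460 bp, about 4.7% of the genome); it shows what read count and genome size those figures imply.

```python
def implied_read_count(total_bp: float, mean_read_bp: float) -> float:
    """Number of reads implied by total sequenced bases and mean read length."""
    return total_bp / mean_read_bp


def implied_genome_mb(sequenced_mb: float, genome_fraction: float) -> float:
    """Genome size implied if sequenced_mb covers genome_fraction of it."""
    return sequenced_mb / genome_fraction


# Values taken from the abstract: 141 Mb total, ~460 bp reads, ~4.7% coverage.
reads = implied_read_count(141e6, 460)
genome_mb = implied_genome_mb(141, 0.047)
```

The implied read count is a little over 300,000, in line with the reported ">300,000 end sequences", and the implied genome size is about 3,000 Mb.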
Flexbar 3.0 - SIMD and multicore parallelization.
Roehr, Johannes T; Dieterich, Christoph; Reinert, Knut
2017-09-15
High-throughput sequencing machines can process many samples in a single run. For Illumina systems, sequencing reads are barcoded with an additional DNA tag that is contained in the respective sequencing adapters. The recognition of barcode and adapter sequences is hence commonly needed for the analysis of next-generation sequencing data. Flexbar performs demultiplexing based on barcodes and adapter trimming for such data. The massive amounts of data generated on modern sequencing machines demand that this preprocessing is done as efficiently as possible. We present Flexbar 3.0, the successor of the popular program Flexbar. It now employs twofold parallelism: multi-threading and, additionally, SIMD vectorization. Both types of parallelism are used to speed up the computation of pair-wise sequence alignments, which are used for the detection of barcodes and adapters. Furthermore, new features were included to cover a wide range of applications. We evaluated the performance of Flexbar based on a simulated sequencing dataset. Our program outcompetes other tools in terms of speed and is among the best tools in the presented quality benchmark. Availability: https://github.com/seqan/flexbar. Contact: johannes.roehr@fu-berlin.de or knut.reinert@fu-berlin.de.
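Barcode-based demultiplexing, as described in this abstract, assigns each read to a sample by recognizing its barcode. The sketch below is a deliberately simplified illustration that compares the read prefix to each barcode by Hamming distance. It is not Flexbar's method (the abstract states that Flexbar uses pair-wise sequence alignments), and the function names, the `unassigned` bin, and the `max_mismatches` parameter are assumptions.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Bin reads by the best-matching 5' barcode and trim that barcode.

    reads:    iterable of read sequences
    barcodes: dict mapping sample name -> barcode sequence
    Reads with no barcode within max_mismatches, or with a tie between
    samples, go to the "unassigned" bin.
    """
    bins = {name: [] for name in barcodes}
    bins["unassigned"] = []
    for read in reads:
        best_name, best_dist, tie = None, max_mismatches + 1, False
        for name, barcode in barcodes.items():
            if len(read) < len(barcode):
                continue
            dist = hamming(read[: len(barcode)], barcode)
            if dist < best_dist:
                best_name, best_dist, tie = name, dist, False
            elif dist == best_dist and best_name is not None:
                tie = True  # ambiguous: two samples match equally well
        if best_name is None or tie:
            bins["unassigned"].append(read)
        else:
            bins[best_name].append(read[len(barcodes[best_name]):])
    return bins
```

For example, with barcodes `{"s1": "ACGT", "s2": "TGCA"}`, the read `TGCTCCCC` has one mismatch to the `s2` barcode and would be binned there as `CCCC`.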
On site DNA barcoding by nanopore sequencing
Menegon, Michele; Cantaloni, Chiara; Rodriguez-Prieto, Ana; Centomo, Cesare; Abdelfattah, Ahmed; Rossato, Marzia; Bernardi, Massimo; Xumerle, Luciano; Loader, Simon; Delledonne, Massimo
2017-01-01
Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet’s biological heritage. The use of genetic markers, i.e., DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in an environment mimicking tropical conditions, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates, including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time, on-site DNA sequencing, thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities. PMID:28977016
Cytochrome c oxidase subunit I barcoding of the green bee-eater (Merops orientalis).
Arif, I A; Khan, H A; Shobrak, M; Williams, J
2011-10-21
DNA barcoding using mitochondrial cytochrome c oxidase subunit I (COI) is regarded as a standard method for species identification. Recent reports have also shown extended applications of COI gene analysis in phylogeny and molecular diversity studies. The bee-eaters are a group of near passerine birds in the family Meropidae. There are 26 species worldwide; five of them are found in Saudi Arabia. Until now, GenBank included a COI barcode for only one species of bee-eater, the European bee-eater (Merops apiaster). We sequenced the 694-bp segment of the COI gene of the green bee-eater M. orientalis and compared the sequences with those of M. apiaster. Pairwise sequence comparison showed 66 variable sites across all the eight sequences from both species, with an interspecific genetic distance of 0.0362. Two and one within-species variable sites were found, with genetic distances of 0.0005 and 0.0003 for M. apiaster and M. orientalis, respectively. This is the first study reporting barcodes for M. orientalis.
Chakraborty, Mohua; Dhar, Bishal; Ghosh, Sankar Kumar
2017-11-01
DNA barcodes are generally interpreted using distance-based and character-based methods. The former uses clustering of comparable groups, based on the relative genetic distance, while the latter is based on the presence or absence of discrete nucleotide substitutions. The distance-based approach has a limitation in defining a universal species boundary across taxa, as the rate of mtDNA evolution is not constant across taxa. The character-based approach, however, defines this boundary more accurately using a unique set of nucleotide characters. The character-based analysis of the full-length barcode has some inherent limitations, such as the need to sequence the full-length barcode, the use of a sparse data matrix, and the lack of a uniform diagnostic position for each group. A short continuous stretch of a fragment can be used to resolve these limitations. Here, we observe that a 154-bp fragment from the transversion-rich domain of 1367 COI barcode sequences can successfully delimit species in the three most diverse orders of freshwater fishes. This fragment is used to design species-specific barcode motifs for 109 species by the character-based method, which successfully identifies the correct species using a pattern-matching program. The motifs also correctly identify geographically isolated populations of the Cypriniformes species. Further, this region is validated as a species-specific mini-barcode for freshwater fishes by successful PCR amplification and sequencing of the motif (154 bp) using the designed primers. We anticipate that use of such motifs will enhance the diagnostic power of DNA barcodes, and that the mini-barcode approach will greatly benefit field-based systems of rapid species identification.
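Character-based identification, as outlined in this abstract, checks whether a fragment carries the diagnostic nucleotides that define a species. The sketch below represents each motif as a mapping from position to expected base. That representation, the function name, and the toy motifs are all hypothetical and are not the authors' pattern-matching program or their actual motifs.

```python
def matching_species(fragment: str, motifs: dict) -> list:
    """Return the species whose diagnostic characters all match the fragment.

    motifs: dict mapping species name -> {0-based position: expected base}
    (an assumed layout chosen for this illustration).
    """
    fragment = fragment.upper()
    hits = []
    for name, diagnostic in motifs.items():
        if all(
            pos < len(fragment) and fragment[pos] == base
            for pos, base in diagnostic.items()
        ):
            hits.append(name)
    return hits


# Invented toy motifs: the two species differ only at position 2.
TOY_MOTIFS = {
    "species_X": {2: "A", 7: "C"},
    "species_Y": {2: "G", 7: "C"},
}
```

A fragment that matches no motif returns an empty list, which a real workflow might treat as an unknown or unrepresented species.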
Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C.; Quake, Stephen R.; Burkholder, William F.
2013-01-01
Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation. PMID:23894273
Zou, Shanmei; Li, Qi
2016-06-01
With the global biodiversity crisis, DNA barcoding aims for fast species identification and the revelation of cryptic species diversity. For more than 10 years, large amounts of DNA barcode data have been accumulating in publicly available databases, most of which were analyzed by distance or tree-building methods whose suitability has often been disputed, especially for revealing cryptic species. In this context, overlooked cryptic diversity may exist in the available barcoding data. Character-based DNA barcoding, however, has a good chance of detecting this overlooked cryptic diversity. In this study, marine mollusks served as a case study for detecting overlooked potential cryptic species from existing cytochrome c oxidase I (COI) sequences with character-based DNA barcoding. A total of 1081 COI sequences of mollusks, belonging to 176 species of 25 families of Gastropoda, Cephalopoda, and Lamellibranchia, were subjected to character analysis. As a whole, the character-based barcoding results were consistent with previous distance and tree-building analyses for species discrimination. More importantly, quite a number of the species analyzed were divided into distinct clades with unique diagnostic characters. Based on the concept of cryptic species revelation by character-based barcoding, these species divided into separate taxonomic groups might be potential cryptic species. The detection of this overlooked potential cryptic diversity indicates that the character-based barcoding mode offers additional advantages in revealing cryptic biodiversity. With the development of DNA barcoding, making the best use of barcoding data is worthy of our attention for species conservation.
Mioduchowska, Monika; Czyż, Michał Jan; Gołdyn, Bartłomiej; Kur, Jarosław; Sell, Jerzy
2018-01-01
The cytochrome c oxidase subunit I (cox1) gene is the main mitochondrial molecular marker playing a pivotal role in phylogenetic research and is a crucial barcode sequence. Folmer's "universal" primers designed to amplify this gene in metazoan invertebrates allowed quick and easy barcode and phylogenetic analysis. On the other hand, the increase in the number of barcoding studies leads to more frequent publication of incorrect sequences, due to amplification of non-target taxa and insufficient analysis of the obtained sequences. Consequently, some sequences deposited in genetic databases are incorrectly described as obtained from invertebrates, while being in fact bacterial sequences. In our study, in which we used Folmer's primers to amplify COI sequences of the crustacean fairy shrimp Branchipus schaefferi (Fischer 1834), we also obtained COI sequences of microbial contaminants from Aeromonas sp. However, when we searched the GenBank database for sequences closely matching these contaminants, we found entries described as representatives of Gastrotricha and Mollusca. When these entries were compared with other sequences bearing the same names in the database, the genetic distance between the incorrect and correct sequences amplified from the same species was ca. 65%. Although the responsibility for the correct molecular identification of species rests on researchers, the errors found in already published sequence data have not been re-evaluated so far. On the basis of the standard sampling technique, we have estimated with 95% probability that the chances of finding incorrectly described metazoan sequences in GenBank depend on the systematic group and vary from less than 1% (Mollusca and Arthropoda) up to 6.9% (Gastrotricha). Consequently, the increasing popularity of DNA barcoding and metabarcoding analysis may lead to overestimation of species diversity. Finally, the study also discusses the sources of the problems with amplification of non-target sequences.
DNA barcodes for ecology, evolution, and conservation.
Kress, W John; García-Robledo, Carlos; Uriarte, Maria; Erickson, David L
2015-01-01
The use of DNA barcodes, which are short gene sequences taken from a standardized portion of the genome and used to identify species, is entering a new phase of application as more and more investigations employ these genetic markers to address questions relating to the ecology and evolution of natural systems. The suite of DNA barcode markers now applied to specific taxonomic groups of organisms is proving invaluable for understanding species boundaries, community ecology, functional trait evolution, trophic interactions, and the conservation of biodiversity. The application of next-generation sequencing (NGS) technology will greatly expand the versatility of DNA barcodes across the Tree of Life, habitats, and geographies as new methodologies are explored and developed. Published by Elsevier Ltd.
Molecular taxonomic techniques such as DNA barcoding offer interesting new capabilities for studying community biodiversity for applications like biological monitoring. Beyond DNA barcoding, new DNA sequencing technologies (i.e. Next-Generation Sequencing) present even greater po...
Kane, Nolan; Sveinsson, Saemundur; Dempewolf, Hannes; Yang, Ji Yong; Zhang, Dapeng; Engels, Johannes M M; Cronk, Quentin
2012-02-01
To reliably identify lineages below the species level such as subspecies or varieties, we propose an extension to DNA-barcoding using next-generation sequencing to produce whole organellar genomes and substantial nuclear ribosomal sequence. Because this method uses much longer versions of the traditional DNA-barcoding loci in the plastid and ribosomal DNA, we call our approach ultra-barcoding (UBC). We used high-throughput next-generation sequencing to scan the genome and generate reliable sequence of high copy number regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences for nine genotypes of Theobroma cacao and an individual of the related species T. grandiflorum, as well as an additional publicly available whole plastid genome of T. cacao. All individuals of T. cacao examined were uniquely distinguished, and evidence of reticulation and gene flow was observed. Sequence variation was observed in some of the canonical barcoding regions between species, but other regions of the chloroplast were more variable both within species and between species, as were ribosomal spacers. Furthermore, no single region provides the level of data available using the complete plastid genome and rDNA. Our data demonstrate that UBC is a viable, increasingly cost-effective approach for reliably distinguishing varieties and even individual genotypes of T. cacao. This approach shows great promise for applications where very closely related or interbreeding taxa must be distinguished.
Indigenous species barcode database improves the identification of zooplankton
Yang, Jianghua; Zhang, Wanwan; Sun, Jingying; Xie, Yuwei; Zhang, Yimin; Burton, G. Allen; Yu, Hongxia
2017-01-01
Incompleteness and inaccuracy of DNA barcode databases are considered an important hindrance to the use of metabarcoding in biodiversity analysis of zooplankton at the species level. Species barcoding by Sanger sequencing is inefficient for organisms with small body sizes, such as zooplankton. Here mitochondrial cytochrome c oxidase I (COI) fragment barcodes from 910 freshwater zooplankton specimens (87 morphospecies) were recovered by a high-throughput sequencing platform, Ion Torrent PGM. Intraspecific divergence of most zooplankton was < 5%, except Branchionus leydign (Rotifer, 14.3%), Trichocerca elongate (Rotifer, 11.5%), Lecane bulla (Rotifer, 15.9%), Synchaeta oblonga (Rotifer, 5.95%) and Schmackeria forbesi (Copepod, 6.5%). Metabarcoding data of 28 environmental samples from Lake Tai were annotated by both an indigenous database and the NCBI GenBank database. The indigenous database improved the taxonomic assignment of metabarcoding of zooplankton. Most zooplankton (81%) with barcode sequences in the indigenous database were identified by metabarcoding monitoring. Furthermore, the frequency and distribution of zooplankton were also consistent between metabarcoding and morphological identification. Overall, the indigenous database improved the taxonomic assignment of zooplankton. PMID:28977035
Multiplexing clonality: combining RGB marking and genetic barcoding
Cornils, Kerstin; Thielecke, Lars; Hüser, Svenja; Forgber, Michael; Thomaschewski, Michael; Kleist, Nadja; Hussein, Kais; Riecken, Kristoffer; Volz, Tassilo; Gerdes, Sebastian; Glauche, Ingmar; Dahl, Andreas; Dandri, Maura; Roeder, Ingo; Fehse, Boris
2014-01-01
RGB marking and DNA barcoding are two cutting-edge technologies in the field of clonal cell marking. To combine the virtues of both approaches, we equipped LeGO vectors encoding red, green or blue fluorescent proteins with complex DNA barcodes carrying color-specific signatures. For these vectors, we generated highly complex plasmid libraries that were used for the production of barcoded lentiviral vector particles. In proof-of-principle experiments, we used barcoded vectors for RGB marking of cell lines and primary murine hepatocytes. We applied single-cell polymerase chain reaction to decipher barcode signatures of individual RGB-marked cells expressing defined color hues. This enabled us to prove clonal identity of cells with one and the same RGB color. Also, we made use of barcoded vectors to investigate clonal development of leukemia induced by ectopic oncogene expression in murine hematopoietic cells. In conclusion, by combining RGB marking and DNA barcoding, we have established a novel technique for the unambiguous genetic marking of individual cells in the context of normal regeneration as well as malignant outgrowth. Moreover, the introduction of color-specific signatures in barcodes will facilitate studies on the impact of different variables (e.g. vector type, transgenes, culture conditions) in the context of competitive repopulation studies. PMID:24476916
Wen, Jun; Ebihara, Atsushi; Li, De-Zhu
2016-01-01
DNA barcoding is a fast-developing technique to identify species by using short and standard DNA sequences. Universal selection of DNA barcodes in ferns remains unresolved. In this study, five plastid regions (rbcL, matK, trnH-psbA, trnL-F and rps4-trnS) and eight nuclear regions (ITS, pgiC, gapC, LEAFY, ITS2, IBR3_2, DET1, and SQD1_1) were screened and evaluated in the fern genus Adiantum from China and neighboring areas. Due to low primer universality (matK) and/or the existence of multiple copies (ITS), the commonly used barcodes matK and ITS were not appropriate for Adiantum. The PCR amplification rate was extremely low in all nuclear genes except for IBR3_2. rbcL had the highest PCR amplification rate (94.33%) and sequencing success rate (90.78%), while trnH-psbA had the highest species identification rate (75%). Considering discriminatory power, cost-efficiency and effort, the two-barcode combination of rbcL + trnH-psbA seems to be the best choice for barcoding Adiantum, and perhaps basal polypod ferns in general. The nuclear IBR3_2 showed a 100% PCR amplification success rate in Adiantum; however, it seemed that clean sequences could be obtained without cloning only from diploid species. With cloning, IBR3_2 can successfully distinguish cryptic species and hybrid species from their related species. Because hybridization and allopolyploidy are common in ferns, we argue for including a selected group of nuclear loci as barcodes, especially via next-generation sequencing, as it is much more efficient to obtain single-copy nuclear loci without the cloning procedure. PMID:27603700
DNA Barcode Authentication of Saw Palmetto Herbal Dietary Supplements
Little, Damon P.; Jeanson, Marc L.
2013-01-01
Herbal dietary supplements made from saw palmetto (Serenoa repens; Arecaceae) fruit are commonly consumed to ameliorate benign prostate hyperplasia. A novel DNA mini–barcode assay to accurately identify [specificity = 1.00 (95% confidence interval = 0.74–1.00); sensitivity = 1.00 (95% confidence interval = 0.66–1.00); n = 31] saw palmetto dietary supplements was designed from a DNA barcode reference library created for this purpose. The mini–barcodes were used to estimate the frequency of mislabeled saw palmetto herbal dietary supplements on the market in the United States of America. Of the 37 supplements examined, amplifiable DNA could be extracted from 34 (92%). Mini–barcode analysis of these supplements demonstrated that 29 (85%) contain saw palmetto and that 2 (6%) supplements contain related species that cannot be legally sold as herbal dietary supplements in the United States of America. The identity of 3 (9%) supplements could not be conclusively determined. PMID:24343362
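The specificity and sensitivity values above are reported with 95% confidence intervals whose lower bounds sit well below 1.00 even though every call was correct. The abstract does not say which interval method was used, so the following is only a minimal sketch of one common choice, the exact (Clopper-Pearson) binomial interval, solved by bisection with the standard library. The function names and the counts in the usage note are hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion x / n.

    Assumption: this is an illustrative method choice, not necessarily
    the one used in the study summarized above.
    """
    def solve(f, target):
        # f must be increasing in p on [0, 1]; plain bisection.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: smallest p with P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: largest p with P(X <= x | p) = alpha / 2.
    upper = 1.0 if x == n else solve(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper
```

For a perfect score of x = n, the lower bound reduces to (alpha/2)^(1/n), so `clopper_pearson(12, 12)` (hypothetical counts) gives an interval that widens quickly as n shrinks, which is why small validation sets yield wide intervals even at 100% accuracy.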
Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso
2015-07-01
In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches an accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.
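The spectral representation mentioned above can be illustrated with a plain k-mer count vector. The following is a minimal sketch only: it shows the alignment-free k-mer spectrum and a simple distance between two spectra, not the neural gas clustering or the signature selection of the paper. The function names and the default k = 3 are assumptions made for illustration.

```python
from collections import Counter


def kmer_spectrum(seq, k=3):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def spectrum_distance(a, b, k=3):
    """Euclidean distance between normalized k-mer frequency vectors."""
    ca, cb = kmer_spectrum(a, k), kmer_spectrum(b, k)
    na = sum(ca.values()) or 1  # avoid division by zero for short inputs
    nb = sum(cb.values()) or 1
    keys = set(ca) | set(cb)
    return sum((ca[x] / na - cb[x] / nb) ** 2 for x in keys) ** 0.5
```

Because the spectrum depends only on word counts, two fragments of different lengths can be compared directly, which is what makes this kind of representation usable on short, randomly extracted fragments.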
Scaling up discovery of hidden diversity in fungi: impacts of barcoding approaches.
Yahr, Rebecca; Schoch, Conrad L; Dentinger, Bryn T M
2016-09-05
The fungal kingdom is a hyperdiverse group of multicellular eukaryotes with profound impacts on human society and ecosystem function. The challenge of documenting and describing fungal diversity is exacerbated by their typically cryptic nature, their ability to produce seemingly unrelated morphologies from a single individual and their similarity in appearance to distantly related taxa. This multiplicity of hurdles resulted in the early adoption of DNA-based comparisons to study fungal diversity, including linking curated DNA sequence data to expertly identified voucher specimens. DNA-barcoding approaches in fungi were first applied in specimen-based studies for identification and discovery of taxonomic diversity, but are now widely deployed for community characterization based on sequencing of environmental samples. Collectively, fungal barcoding approaches have yielded important advances across biological scales and research applications, from taxonomic, ecological, industrial and health perspectives. A major outstanding issue is the growing problem of 'sequences without names' that are somewhat uncoupled from the traditional framework of fungal classification based on morphology and preserved specimens. This review summarizes some of the most significant impacts of fungal barcoding, its limitations, and progress towards the challenge of effective utilization of the exponentially growing volume of data gathered from high-throughput sequencing technologies. This article is part of the themed issue 'From DNA barcodes to biomes'. © 2016 The Authors.
Loh, W K W; Bond, P; Ashton, K J; Roberts, D T; Tibbetts, I R
2014-08-01
The mitochondrial cytochrome c oxidase subunit 1 (coI) barcoding gene was amplified and sequenced from 16 species of freshwater fishes found in Lake Wivenhoe (south-eastern Queensland, Australia) to support monitoring of reservoir fish populations, ecosystem function and water health. In this study, 630-650 bp sequences of the coI barcoding gene from 100 specimens representing 15 genera, 13 families and two subclasses of fishes allowed 14 of the 16 species to be identified and differentiated. The mean ± s.e. Kimura 2 parameter divergence within and between species was 0.52 ± 0.10 and 23.8 ± 2.20% respectively, indicating that barcodes can be used to discriminate most of the fish species accurately. The two terapontids, Amniataba percoides and Leiopotherapon unicolor, however, shared coI DNA sequences and could not be differentiated using this gene. A barcoding database was established and a qPCR assay was developed using coI sequences to identify and quantify proportional abundances of fish species in ichthyoplankton samples from Lake Wivenhoe. These methods provide a viable alternative to the time-consuming process of manually enumerating and identifying ichthyoplankton samples. © 2014 The Fisheries Society of the British Isles.
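The Kimura 2 parameter divergence quoted above corrects the raw proportion of differing sites by treating transitions (A<->G, C<->T) and transversions separately: d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. A minimal sketch for two pre-aligned sequences follows; the function name and the choice to skip gaps and ambiguous bases are assumptions for illustration, not the software settings used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # (saturated) for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Multiplying the returned value by 100 gives a percentage comparable in form to the within- and between-species figures reported in the abstract.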
DNA barcoding of perennial fruit tree species of agronomic interest in the genus Annona (Annonaceae)
Larranaga, Nerea; Hormaza, José I.
2015-01-01
The DNA barcode initiative aims to establish a universal protocol using short genetic sequences to discriminate among animal and plant species. Although many markers have been proposed as the barcode of plants, the Consortium for the Barcode of Life (CBOL) Plant Working Group recommended using as a core the combination of two portions of plastid coding regions, rbcL and matK. In this paper, specific markers based on matK sequences were developed for 7 closely related Annona species of agronomic interest (Annona cherimola, A. reticulata, A. squamosa, A. muricata, A. macroprophyllata, A. glabra, and A. purpurea), and the discrimination power of both rbcL and matK was tested, also using sequences of the genus Annona available in the Barcode of Life Database (BOLD) data systems. The specific sequences developed allowed discrimination among all the species tested. Moreover, the primers generated were validated in six additional species of the genus (A. liebmanniana, A. longiflora, A. montana, A. senegalensis, A. emarginata and A. neosalicifolia) and in an interspecific hybrid (A. cherimola x A. squamosa). The development of a fast, reliable and economical approach for species identification in these underutilized subtropical fruit crops, which are at a very early stage of domestication, is of great importance in order to optimize genetic resource management. PMID:26284104
Hit-Validation Methodologies for Ligands Isolated from DNA-Encoded Chemical Libraries.
Zimmermann, Gunther; Li, Yizhou; Rieder, Ulrike; Mattarella, Martin; Neri, Dario; Scheuermann, Jörg
2017-05-04
DNA-encoded chemical libraries (DECLs) are large collections of compounds linked to DNA fragments, serving as amplifiable barcodes, which can be screened on target proteins of interest. In typical DECL selections, preferential binders are identified by high-throughput DNA sequencing, by comparing their frequency before and after the affinity capture step. Hits identified in this procedure need to be confirmed, by resynthesis and by performing affinity measurements. In this article we present new methods based on hybridization of oligonucleotide conjugates with fluorescently labeled complementary oligonucleotides; these facilitate the determination of affinity constants and kinetic dissociation constants. The experimental procedures were demonstrated with acetazolamide, a binder to carbonic anhydrase IX with a dissociation constant in the nanomolar range. The detection of binding events was compatible not only with fluorescence polarization methodologies, but also with Alphascreen technology and with microscale thermophoresis. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
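The comparison of barcode frequencies before and after the affinity capture step can be sketched as a simple fold-enrichment calculation on sequencing read counts. This is a minimal illustration only: the function name, the dictionary input format and the pseudocount are assumptions, and real DECL analyses typically add statistical modeling of sampling noise that is not shown here.

```python
def fold_enrichment(before, after, pseudocount=1.0):
    """Ratio of each barcode's relative frequency after vs. before selection.

    before, after: dicts mapping barcode -> read count.
    A pseudocount (an assumption of this sketch) keeps barcodes that are
    absent from one sample from producing zero or infinite ratios.
    """
    barcodes = set(before) | set(after)
    n = len(barcodes)
    total_before = sum(before.values()) + pseudocount * n
    total_after = sum(after.values()) + pseudocount * n
    result = {}
    for bc in barcodes:
        f_before = (before.get(bc, 0) + pseudocount) / total_before
        f_after = (after.get(bc, 0) + pseudocount) / total_after
        result[bc] = f_after / f_before
    return result
```

Barcodes with ratios well above 1 are candidate binders; as the abstract notes, such hits still need confirmation by resynthesis and affinity measurements.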
Molecular Approach to the Identification of Fish in the South China Sea
Zhang, Junbin; Hanner, Robert
2012-01-01
Background: DNA barcoding is one means of establishing a rapid, accurate, and cost-effective system for the identification of species. It involves the use of short, standard gene targets to create sequence profiles of known species against which sequences of unknowns can be matched and subsequently identified. The Fish Barcode of Life (FISH-BOL) campaign has the primary goal of gathering DNA barcode records for all the world's fish species. As a contribution to FISH-BOL, we examined the degree to which DNA barcoding can discriminate marine fishes from the South China Sea. Methodology/Principal Findings: DNA barcodes of cytochrome oxidase subunit I (COI) were characterized using 1336 specimens that belong to 242 fish species from the South China Sea. All specimen provenance data (including digital specimen images and geospatial coordinates of collection localities) and collateral sequence information were assembled using the Barcode of Life Data System (BOLD; www.barcodinglife.org). Small intraspecific and large interspecific differences create distinct genetic boundaries among most species. In addition, the efficiency of two mitochondrial genes, 16S rRNA (16S) and cytochrome b (cytb), and one nuclear ribosomal gene, 18S rRNA (18S), was also evaluated for a few select groups of species. Conclusions/Significance: The present study provides evidence for the effectiveness of DNA barcoding as a tool for monitoring marine biodiversity. Open access data of fishes from the South China Sea can benefit related applications in ecology and taxonomy. PMID:22363454
Does a global DNA barcoding gap exist in Annelida?
Kvist, Sebastian
2016-05-01
Accurate identification of unknown specimens by means of DNA barcoding is contingent on the presence of a DNA barcoding gap, among other factors, as its absence may result in dubious specimen identifications (false negatives or positives). Whereas the utility of DNA barcoding would be greatly reduced in the absence of a distinct and sufficiently sized barcoding gap, the limits of intraspecific and interspecific distances are seldom thoroughly inspected across comprehensive sampling. The present study aims to illuminate this aspect of barcoding in a comprehensive manner for the animal phylum Annelida. All cytochrome c oxidase subunit I sequences (cox1 gene; the chosen region for zoological DNA barcoding) present in GenBank for Annelida, as well as for "Polychaeta", "Oligochaeta", and Hirudinea separately, were downloaded and curated for length, coverage and potential contaminations. The final datasets consisted of 9782 (Annelida), 5545 ("Polychaeta"), 3639 ("Oligochaeta"), and 598 (Hirudinea) cox1 sequences and these were either (i) used as is in an automated global barcoding gap detection analysis or (ii) further analyzed for genetic distances, separated into bins containing intraspecific and interspecific comparisons and plotted in a graph to visualize any potential global barcoding gap. Over 70 million pairwise genetic comparisons were made, and the results suggest that although there is a tendency towards separation, no distinct or sufficiently sized global barcoding gap exists in any of the datasets, rendering future barcoding efforts at risk of erroneous specimen identifications (but local barcoding gaps may still exist, allowing for the identification of specimens at lower taxonomic ranks). This seems to be especially true for earthworm taxa, which account for fully 35% of the total number of interspecific comparisons that show 0% divergence.
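The binning of pairwise distances into intraspecific and interspecific comparisons described above can be sketched in a few lines. The following is a minimal illustration that uses uncorrected p-distances on pre-aligned sequences and defines the gap as the smallest interspecific distance minus the largest intraspecific distance; the function names and the input format are assumptions, and this is not the automated gap detection tool used in the study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(records):
    """records: list of (species_label, aligned_sequence) tuples.

    Returns min(interspecific) - max(intraspecific); a positive value
    means the two distance distributions do not overlap.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need both within- and between-species pairs")
    return min(inter) - max(intra)
```

A zero or negative result corresponds to the situation reported in the abstract, where some interspecific comparisons show no more divergence than intraspecific ones. The all-pairs loop is quadratic in the number of sequences, which is why a dataset of several thousand sequences leads to tens of millions of comparisons.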
Zou, Shanmei; Fei, Cong; Song, Jiameng; Bao, Yachao; He, Meilin; Wang, Changhai
2016-01-01
Several different barcoding methods for distinguishing species have been advanced, but which method is best remains controversial. Chlorella is becoming particularly promising in the development of second-generation biofuels. However, the taxonomy of Chlorella-like organisms is easily confused. Here we report a comprehensive barcoding analysis of Chlorella-like species from Chlorella, Chloroidium, Dictyosphaerium and Actinastrum based on rbcL, ITS, tufA and 16S sequences to test the efficiency of traditional barcoding, GMYC, ABGD, PTP, P ID and character-based barcoding methods. First of all, the barcoding results gave new insights into the taxonomic assessment of the Chlorella-like organisms studied, including clear species discrimination and resolution of potentially cryptic species complexes in C. sorokiniana, D. ehrenbergianum and C. vulgaris. tufA proved to be the most efficient barcoding locus and thus could serve as a potential "specific barcode" for Chlorella-like species. 16S failed to discriminate most closely related species. The resolution of the GMYC, PTP, P ID, ABGD and character-based barcoding methods was variable among the rbcL, ITS and tufA genes. The best resolution for species differentiation appeared in the tufA analysis, where GMYC, PTP, ABGD and character-based approaches produced consistent groups while the PTP method over-split the taxa. The character analysis of rbcL, ITS and tufA sequences could each clearly distinguish all taxonomic groups, including the potentially cryptic lineages, with many character attributes. Thus, character-based barcoding provides an attractive complement to coalescent and distance-based barcoding. Our study represents a test that demonstrates the efficiency of multiple DNA barcoding methods in species discrimination of microalgae.
ITS1: a DNA barcode better than ITS2 in eukaryotes?
Wang, Xin-Cun; Liu, Chang; Huang, Liang; Bengtsson-Palme, Johan; Chen, Haimei; Zhang, Jian-Hui; Cai, Dayong; Li, Jian-Qin
2015-05-01
A DNA barcode is a short piece of DNA sequence used for species determination and discovery. The internal transcribed spacer (ITS/ITS2) region has been proposed as the standard DNA barcode for fungi and seed plants and has been widely used in DNA barcoding analyses for other biological groups, for example algae, protists and animals. The ITS region consists of both ITS1 and ITS2 regions. Here, a large-scale meta-analysis was carried out to compare ITS1 and ITS2 from three aspects: PCR amplification, DNA sequencing and species discrimination, in terms of the presence of DNA barcoding gaps, species discrimination efficiency, sequence length distribution, GC content distribution and primer universality. In total, 85 345 sequence pairs in 10 major groups of eukaryotes, including ascomycetes, basidiomycetes, liverworts, mosses, ferns, gymnosperms, monocotyledons, eudicotyledons, insects and fishes, covering 611 families, 3694 genera, and 19 060 species, were analysed. Using similarity-based methods, we calculated species discrimination efficiencies for ITS1 and ITS2 in all major groups, families and genera. Using Fisher's exact test, we found that ITS1 has significantly higher efficiencies than ITS2 in 17 of the 47 families and 20 of the 49 genera, which are sample-rich. By in silico PCR amplification evaluation, primer universality of the extensively applied ITS1 primers was found to be superior to that of ITS2 primers. Additionally, shorter amplification products and lower GC content were found to be two further advantages of ITS1 for sequencing. In summary, ITS1 represents a better DNA barcode than ITS2 for eukaryotic species. © 2014 John Wiley & Sons Ltd.
Contreras Gutiérrez, María Angélica; Vivero, Rafael J.; Vélez, Iván D.; Porter, Charles H.; Uribe, Sandra
2014-01-01
Sand flies include a group of insects that are of medical importance and that vary in geographic distribution, ecology, and pathogen transmission. Approximately 163 species of sand flies have been reported in Colombia. Surveillance of the presence of sand fly species and the updating of species distributions are important for predicting risks for and monitoring the expansion of diseases which sand flies can transmit. Currently, the identification of phlebotomine sand flies is based on morphological characters. However, morphological identification requires considerable skills and taxonomic expertise. In addition, significant morphological similarity between some species, especially among females, may cause difficulties during the identification process. DNA-based approaches have become increasingly useful and promising tools for estimating sand fly diversity and for ensuring the rapid and accurate identification of species. A partial sequence of the mitochondrial cytochrome oxidase gene subunit I (COI) is currently being used to differentiate species in different animal taxa, including insects, and it is referred to as a barcoding sequence. The present study explored the utility of the DNA barcode approach for the identification of phlebotomine sand flies in Colombia. We sequenced 700 bp of the COI gene from 36 species collected from different geographic localities. The COI barcode sequence divergence within a single species was <2% in most cases, whereas this divergence ranged from 9% to 26.6% among different species. These results indicated that the barcoding gene correctly discriminated among the previously morphologically identified species with an efficacy of nearly 100%. Analyses of the generated sequences indicated that the observed species groupings were consistent with the morphological identifications. In conclusion, the barcoding gene was useful for species discrimination in sand flies from Colombia. PMID:24454877
Huang, Xiao-cui; Ci, Xiu-qin; Conran, John G; Li, Jie
2015-01-01
Within a regional floristic context, DNA barcoding is useful for managing plant diversity inventories on a large scale and for developing valuable conservation strategies. However, there are no DNA barcode studies from tropical areas of China, which represent one of the biodiversity hotspots around the world. A DNA barcoding database of Asian tropical trees with high diversity was established at Xishuangbanna Nature Reserve, Yunnan, southwest China using rbcL and matK as standard barcodes, as well as trnH-psbA and ITS as supplementary barcodes. The performance of tree species identification success was assessed using 2,052 accessions from four plots belonging to two vegetation types in the region by three methods: Neighbor-Joining, Maximum-Likelihood and BLAST. We corrected morphological field identification errors (9.6%) for the three plots using rbcL and matK based on a Neighbor-Joining tree. The best barcode region for PCR and sequencing was rbcL (97.6%, 90.8%), followed by trnH-psbA (93.6%, 85.6%), while matK and ITS obtained relatively low PCR and sequencing success rates. However, ITS performed best for both species (44.6-58.1%) and genus (72.8-76.2%) identification, with trnH-psbA slightly less effective for species identification. The two standard barcodes rbcL and matK gave poor results for species identification (24.7-28.5% and 31.6-35.3%). Compared with other studies from comparable tropical forests (e.g. Cameroon, the Amazon and India), the overall performance of the four barcodes for species identification was lower for the Xishuangbanna Nature Reserve, possibly because of differences in species/genus ratios and species composition between these tropical areas. Although the core barcodes rbcL and matK were not suitable for species identification of tropical trees from Xishuangbanna Nature Reserve, they could still help with identification at the family and genus level. Considering the relative sequence recovery and the species identification performance, we recommend the use of trnH-psbA and ITS in combination as the preferred barcodes for tropical tree species identification in China.
The problems and promise of DNA barcodes for species diagnosis of primate biomaterials
Lorenz, Joseph G; Jackson, Whitney E; Beck, Jeanne C; Hanner, Robert
2005-01-01
The Integrated Primate Biomaterials and Information Resource (www.IPBIR.org) provides essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing DNA and RNA derived from primate cell cultures. The IPBIR uses mitochondrial cytochrome c oxidase subunit I sequences to verify the identity of samples for quality control purposes in the accession, cell culture, DNA extraction processes and prior to shipping to end users. As a result, IPBIR is accumulating a database of ‘DNA barcodes’ for many species of primates. However, this quality control process is complicated by taxon specific patterns of ‘universal primer’ failure, as well as the amplification or co-amplification of nuclear pseudogenes of mitochondrial origins. To overcome these difficulties, taxon specific primers have been developed, and reverse transcriptase PCR is utilized to exclude these extraneous sequences from amplification. DNA barcoding of primates has applications to conservation and law enforcement. Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics. PMID:16214744
Raja, Huzefa A; Baker, Timothy R; Little, Jason G; Oberlies, Nicholas H
2017-01-01
One challenge in the dietary supplement industry is confirmation of species identity for processed raw materials, i.e. those modified by milling, drying, or extraction, which move through a multilevel supply chain before reaching the finished product. This is particularly difficult for samples containing fungal mycelia, where processing removes morphological characteristics, such that they do not present sufficient variation to differentiate species by traditional techniques. To address this issue, we have demonstrated the utility of DNA barcoding to verify the taxonomic identity of fungi found commonly in the food and dietary supplement industry; such data are critical for protecting consumer health, by assuring both safety and quality. By using DNA barcoding of nuclear ribosomal internal transcribed spacer (ITS) of the rRNA gene with fungal specific ITS primers, ITS barcodes were generated for 33 representative fungal samples, all of which could be used by consumers for food and/or dietary supplement purposes. In the majority of cases, we were able to sequence the ITS region from powdered mycelium samples, grocery store mushrooms, and capsules from commercial dietary supplements. After generating ITS barcodes utilizing standard procedures accepted by the Consortium for the Barcode of Life, we tested their utility by performing a BLAST search against authenticated published ITS sequences in GenBank. In some cases, we also downloaded published, homologous sequences of the ITS region of fungi inspected in this study and examined the phylogenetic relationships of barcoded fungal species in light of modern taxonomic and phylogenetic studies. We anticipate that these data will motivate discussions on DNA barcoding based species identification as applied to the verification/certification of mushroom-containing dietary supplements. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Laopichienpong, Nararat; Muangmai, Narongrit; Supikamolseni, Arrjaree; Twilprawat, Panupon; Chanhome, Lawan; Suntrarachun, Sunutcha; Peyachoknagul, Surin; Srikulnath, Kornsorn
2016-12-15
DNA barcodes of mitochondrial cytochrome c oxidase I (COI), cytochrome b (Cytb) genes, and their combined data sets were constructed from 35 snake species in Thailand. No barcoding gap was detected in either of the two genes from the observed intra- and interspecific sequence divergences. Intra- and interspecific sequence divergences of the COI gene differed 14 times, with barcode cut-off scores ranging over 2%-4% for threshold values differentiated among most of the different species; the Cytb gene differed 6 times with cut-off scores ranging over 2%-6%. Thirty-five specific nucleotide mutations were also found at interspecific level in the COI gene, identifying 18 snake species, but no specific nucleotide mutation was observed for Cytb in any single species. This suggests that COI barcoding was a better marker than Cytb. Phylogenetic clustering analysis indicated that most species were represented by monophyletic clusters, suggesting that these snake species could be clearly differentiated using COI barcodes. However, the two-marker combination of both COI and Cytb was more effective, differentiating snake species by over 2%-4%, and reducing species numbers in the overlap value between intra- and interspecific divergences. Three species delimitation algorithms (general mixed Yule-coalescent, automatic barcoding gap detection, and statistical parsimony network analysis) were extensively applied to a wide range of snakes based on both barcodes. This revealed cryptic diversity for eleven snake species in Thailand. In addition, eleven accessions from the database previously grouped under the same species were represented at different species level, suggesting either high genetic diversity, or the misidentification of these sequences in the database as a consequence of cryptic species. Copyright © 2016 Elsevier B.V. All rights reserved.
Using high-throughput barcode sequencing to efficiently map connectomes.
Peikon, Ian D; Kebschull, Justus M; Vagin, Vasily V; Ravens, Diana I; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R; Bressan, Dario; Zador, Anthony M
2017-07-07
The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single-synapse precision (a 'connectome') is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence (an RNA 'barcode'), which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Choosing and Using a Plant DNA Barcode
Hollingsworth, Peter M.; Graham, Sean W.; Little, Damon P.
2011-01-01
The main aim of DNA barcoding is to establish a shared community resource of DNA sequences that can be used for organismal identification and taxonomic clarification. This approach was successfully pioneered in animals using a portion of the cytochrome oxidase 1 (CO1) mitochondrial gene. In plants, establishing a standardized DNA barcoding system has been more challenging. In this paper, we review the process of selecting and refining a plant barcode; evaluate the factors which influence the discriminatory power of the approach; describe some early applications of plant barcoding and summarise major emerging projects; and outline tool development that will be necessary for plant DNA barcoding to advance. PMID:21637336
ERIC Educational Resources Information Center
Folda, Linda; And Others
1989-01-01
Issues related to library online systems are discussed in six articles. Topics covered include staff education through vendor demonstrations, evaluation of online public access catalogs, the impact of integrated online systems on cataloging operations, the merits of smart and dumb barcodes, and points to consider in planning for the next online…
Application of DNA barcodes in wildlife conservation in Tropical East Asia.
Wilson, John-James; Sing, Kong-Wah; Lee, Ping-Shin; Wee, Alison K S
2016-10-01
Over the past 50 years, Tropical East Asia has lost more biodiversity than any tropical region. Tropical East Asia is a megadiverse region with an acute taxonomic impediment. DNA barcodes are short standardized DNA sequences used for taxonomic purposes and have the potential to lessen the challenges of biodiversity inventory and assessments in regions where they are most needed. We reviewed DNA barcoding efforts in Tropical East Asia relative to other tropical regions. We suggest DNA barcodes (or metabarcodes from next-generation sequencers) may be especially useful for characterizing and connecting species-level biodiversity units in inventories encompassing taxa lacking formal description (particularly arthropods) and in large-scale, minimal-impact approaches to vertebrate monitoring and population assessments through secondary sources of DNA (invertebrate derived DNA and environmental DNA). We suggest interest and capacity for DNA barcoding are slowly growing in Tropical East Asia, particularly among the younger generation of researchers who can connect with the barcoding analogy and understand the need for new approaches to the conservation challenges being faced. © 2016 Society for Conservation Biology.
New taxonomy and old collections: integrating DNA barcoding into the collection curation process.
Puillandre, N; Bouchet, P; Boisselier-Dubayle, M-C; Brisset, J; Buge, B; Castelin, M; Chagnoux, S; Christophe, T; Corbari, L; Lambourdière, J; Lozouet, P; Marani, G; Rivasseau, A; Silva, N; Terryn, Y; Tillier, S; Utge, J; Samadi, S
2012-05-01
Because they house large biodiversity collections and are also research centres with sequencing facilities, natural history museums are well placed to develop DNA barcoding best practices. The main difficulty is generally the vouchering system: it must ensure that all data produced remain attached to the corresponding specimen, from the field to publication in articles and online databases. The Museum National d'Histoire Naturelle in Paris is one of the leading laboratories in the Marine Barcode of Life (MarBOL) project, which was used as a pilot programme to include barcode collections for marine molluscs and crustaceans. The system is based on two relational databases. The first one classically records the data (locality and identification) attached to the specimens. In the second one, tissue-clippings, DNA extractions (both preserved in 2D barcode tubes) and PCR data (including primers) are linked to the corresponding specimen. All the steps of the process [sampling event, specimen identification, molecular processing, data submission to Barcode Of Life Database (BOLD) and GenBank] are thus linked together. Furthermore, we have developed several web-based tools to automatically upload data into the system, control the quality of the sequences produced and facilitate the submission to online databases. This work is the result of a joint effort from several teams in the Museum National d'Histoire Naturelle (MNHN), but also from a collaborative network of taxonomists and molecular systematists outside the museum, resulting in the vouchering so far of ∼41,000 sequences and the production of ∼11,000 COI sequences. © 2012 Blackwell Publishing Ltd.
Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya
2015-08-01
Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been quantitated in absolute terms. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
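The general idea of using molecular barcodes to suppress read errors can be sketched as follows. This is a generic per-barcode consensus illustration, not the authors' pipeline: the function name `consensus_by_barcode`, the thresholds `min_reads` and `min_agreement`, and the input layout are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3, min_agreement=0.8):
    """Collapse reads sharing a molecular barcode into one consensus each.

    `reads` is an iterable of (barcode, sequence) pairs. Barcodes seen
    fewer than `min_reads` times are dropped (hypothetical threshold),
    as are groups whose reads differ in length. A position is called
    only if the most common base reaches `min_agreement`; otherwise
    it is reported as "N".
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to trust this barcode
        if len({len(s) for s in seqs}) != 1:
            continue  # this sketch does not realign reads
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(column) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus
```

Counting distinct barcodes that support a variant, rather than raw reads, is what makes per-molecule quantitation possible in schemes of this kind.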
A Barcoding Strategy Enabling Higher-Throughput Library Screening by Microscopy.
Chen, Robert; Rishi, Harneet S; Potapov, Vladimir; Yamada, Masaki R; Yeh, Vincent J; Chow, Thomas; Cheung, Celia L; Jones, Austin T; Johnson, Terry D; Keating, Amy E; DeLoache, William C; Dueber, John E
2015-11-20
Dramatic progress has been made in the design and build phases of the design-build-test cycle for engineering cells. However, the test phase usually limits throughput, as many outputs of interest are not amenable to rapid analytical measurements. For example, phenotypes such as motility, morphology, and subcellular localization can be readily measured by microscopy, but analysis of these phenotypes is notoriously slow. To increase throughput, we developed microscopy-readable barcodes (MiCodes) composed of fluorescent proteins targeted to discernible organelles. In this system, a unique barcode can be genetically linked to each library member, making possible the parallel analysis of phenotypes of interest via microscopy. As a first demonstration, we MiCoded a set of synthetic coiled-coil leucine zipper proteins to allow an 8 × 8 matrix to be tested for specific interactions in micrographs consisting of mixed populations of cells. A novel microscopy-readable two-hybrid fluorescence localization assay for probing candidate interactions in the cytosol was also developed using a bait protein targeted to the peroxisome and a prey protein tagged with a fluorescent protein. This work introduces a generalizable, scalable platform for making microscopy amenable to higher-throughput library screening experiments, thereby coupling the power of imaging with the utility of combinatorial search paradigms.
Dincă, Vlad; Zakharov, Evgeny V.; Hebert, Paul D. N.; Vila, Roger
2011-01-01
DNA barcoding aims to accelerate species identification and discovery, but performance tests have shown marked differences in identification success. As a consequence, there remains a great need for comprehensive studies which objectively test the method in groups with a solid taxonomic framework. This study focuses on the 180 species of butterflies in Romania, accounting for about one third of the European butterfly fauna. This country includes five eco-regions, the highest of any in the European Union, and is a good representative for temperate areas. Morphology and DNA barcodes of more than 1300 specimens were carefully studied and compared. Our results indicate that 90 per cent of the species form barcode clusters allowing their reliable identification. The remaining cases involve nine closely related species pairs, some whose taxonomic status is controversial or that hybridize regularly. Interestingly, DNA barcoding was found to be the most effective identification tool, outperforming external morphology, and being slightly better than male genitalia. Romania is now the first country to have a comprehensive DNA barcode reference database for butterflies. Similar barcoding efforts based on comprehensive sampling of specific geographical regions can act as functional modules that will foster the early application of DNA barcoding while a global system is under development. PMID:20702462
Telling plant species apart with DNA: from barcodes to genomes
Li, De-Zhu; van der Bank, Michelle
2016-01-01
Land plants underpin a multitude of ecosystem functions, support human livelihoods and represent a critically important component of terrestrial biodiversity—yet many tens of thousands of species await discovery, and plant identification remains a substantial challenge, especially where material is juvenile, fragmented or processed. In this opinion article, we tackle two main topics. Firstly, we provide a short summary of the strengths and limitations of plant DNA barcoding for addressing these issues. Secondly, we discuss options for enhancing current plant barcodes, focusing on increasing discriminatory power via either gene capture of nuclear markers or genome skimming. The former has the advantage of establishing a defined set of target loci maximizing efficiency of sequencing effort, data storage and analysis. The challenge is developing a probe set for large numbers of nuclear markers that works over sufficient phylogenetic breadth. Genome skimming has the advantage of using existing protocols and being backward compatible with existing barcodes; and the depth of sequence coverage can be increased as sequencing costs fall. Its non-targeted nature does, however, present a major informatics challenge for upscaling to large sample sets. This article is part of the themed issue ‘From DNA barcodes to biomes’. PMID:27481790
Hernández-Triana, Luis M; Montes De Oca, Fernanda; Prosser, Sean W J; Hebert, Paul D N; Gregory, T Ryan; McMurtrie, Shelley
2017-04-01
In this paper, the utility of a partial sequence of the COI gene, the DNA barcoding region, for the identification of species of black flies in the austral region was assessed. Twenty-eight morphospecies were analyzed: eight of the genus Austrosimulium (four species in the subgenus Austrosimulium s. str., three species in the subgenus Novaustrosimulium, and one species unassigned to subgenus), two of the genus Cnesia, eight of Gigantodax, three of Paracnephia, one of Paraustrosimulium, and six of Simulium (subgenera Morops, Nevermannia, and Pternaspatha). The neighbour-joining tree derived from the DNA barcode sequences grouped most specimens according to species or species groups recognized by morphotaxonomic studies. Intraspecific sequence divergences within morphologically distinct species ranged from 0% to 1.8%, while higher divergences (2%-4.2%) in certain species suggested the presence of cryptic diversity. The existence of well-defined groups within S. simile revealed the likely inclusion of cryptic diversity. DNA barcodes also showed that specimens identified as C. dissimilis, C. nr. pussilla, and C. ornata might be conspecific, suggesting possible synonymy. DNA barcoding combined with a sound morphotaxonomic framework would provide an effective approach for the identification of black flies in the region.
Zou, Shanmei; Fei, Cong; Song, Jiameng; Bao, Yachao; He, Meilin; Wang, Changhai
2016-01-01
Several different barcoding methods of distinguishing species have been advanced, but which method is the best is still controversial. Chlorella is becoming particularly promising in the development of second-generation biofuels. However, the taxonomy of Chlorella-like organisms is easily confused. Here we report a comprehensive barcoding analysis of Chlorella-like species from Chlorella, Chloroidium, Dictyosphaerium and Actinastrum based on rbcL, ITS, tufA and 16S sequences to test the efficiency of traditional barcoding, GMYC, ABGD, PTP, P ID and character-based barcoding methods. First of all, the barcoding results gave new insights into the taxonomic assessment of the Chlorella-like organisms studied, including clear species discrimination and resolution of potentially cryptic species complexes in C. sorokiniana, D. ehrenbergianum and C. vulgaris. The tufA gene proved to be the most efficient barcoding locus and thus could serve as a potential “specific barcode” for Chlorella-like species. The 16S failed to discriminate most closely related species. The resolution of the GMYC, PTP, P ID, ABGD and character-based barcoding methods was variable among the rbcL, ITS and tufA genes. The best resolution for species differentiation appeared in tufA analysis where GMYC, PTP, ABGD and character-based approaches produced consistent groups while the PTP method over-split the taxa. The character analysis of rbcL, ITS and tufA sequences could clearly distinguish all taxonomic groups respectively, including the potentially cryptic lineages, with many character attributes. Thus, character-based barcoding provides an attractive complement to coalescent and distance-based barcoding. Our study represents a test that demonstrates the efficiency of multiple DNA barcoding methods in species discrimination of microalgae. PMID:27092945
DNA barcoding commercially important fish species of Turkey.
Keskın, Emre; Atar, Hasan H
2013-09-01
DNA barcoding was used in the identification of 89 commercially important freshwater and marine fish species found in Turkish ichthyofauna. A total of 1765 DNA barcodes, based on a 654-bp fragment of the mitochondrial cytochrome c oxidase subunit I gene, were generated for these species. The species belong to 70 genera, 40 families and 19 orders from class Actinopterygii, and all were associated with a distinct DNA barcode. Nine and 12 of the COI barcode clusters represent the first species records submitted to the BOLD and GenBank databases, respectively. All COI barcodes (except sequences of first species records) were matched with reference sequences of expected species, according to morphological identification. Average nucleotide frequencies of the data set were calculated as T = 29.7%, C = 28.2%, A = 23.6% and G = 18.6%. Average pairwise genetic distances among individuals were estimated as 0.32%, 9.62%, 17.90% and 22.40% for conspecific, congeneric, confamilial and within-order comparisons, respectively. Kimura 2-parameter genetic distance values were found to increase with taxonomic level. For most of the species analysed in our data set, there is a barcoding gap, and an overlap in the barcoding gap exists for only two genera. Neighbour-joining trees were drawn based on DNA barcodes and all the specimens clustered in agreement with their taxonomic classification at species level. Results of this study supported DNA barcoding as an efficient molecular tool for better monitoring, conservation and management of fisheries. © 2013 John Wiley & Sons Ltd.
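For readers unfamiliar with the Kimura 2-parameter (K2P) distance mentioned above, a minimal sketch follows. It uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name `k2p_distance` is hypothetical, and the sketch assumes pre-aligned sequences; it is not the software used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites with gaps or ambiguous symbols in either sequence are skipped.
    math.log / math.sqrt raise ValueError when the sequences are too
    divergent for the model (non-positive argument).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        n += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n
    q = transversions / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging `k2p_distance` over all pairs within a species, within a genus, and so on would yield the kind of per-rank summary reported in the abstract.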
DNA barcodes of the native ray-finned fishes in Taiwan.
Chang, Chia-Hao; Shao, Kwang-Tsao; Lin, Han-Yang; Chiu, Yung-Chieh; Lee, Mao-Ying; Liu, Shih-Hui; Lin, Pai-Lei
2017-07-01
Species identification based on the DNA sequence of a fragment of the cytochrome c oxidase subunit I gene in the mitochondrial genome, DNA barcoding, is widely applied to assist in sustainable exploitation of fish resources and the protection of fish biodiversity. The aim of this study was to establish a reliable barcoding reference database of the native ray-finned fishes in Taiwan. A total of 2993 individuals, belonging to 1245 species within 637 genera, 184 families and 29 orders of ray-finned fishes and representing approximately 40% of the recorded ray-finned fishes in Taiwan, were PCR amplified at the barcode region and bidirectionally sequenced. The mean length of the 2993 barcodes is 549 bp. Mean congeneric K2P distance (15.24%) is approximately 10-fold higher than the mean conspecific one (1.51%), but approximately 1.4-fold less than the mean genetic distance between families (20.80%). The Barcode Index Number (BIN) discordance report shows that 2993 specimens represent 1275 BINs and, among them, 86 BINs are singletons, 570 BINs are taxonomically concordant, and the other 619 BINs are taxonomically discordant. Barcode gap analysis also revealed that more than 90% of the collected fishes in this study can be discriminated by DNA barcoding. Overall, the barcoding reference database established by this study reveals the need for taxonomic revisions and voucher specimen rechecks, in addition to assisting in the management of Taiwan's fish resources and diversity. © 2016 John Wiley & Sons Ltd.
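The fold differences and BIN counts quoted above can be checked with a few lines of arithmetic. The sketch only re-uses the numbers stated in the abstract; the variable names are illustrative.

```python
# Mean K2P distances (percent) as reported in the abstract.
mean_conspecific = 1.51
mean_congeneric = 15.24
mean_between_families = 20.80

# BIN categories as reported in the abstract.
bins = {"singleton": 86, "concordant": 570, "discordant": 619}

congeneric_fold = mean_congeneric / mean_conspecific      # roughly 10-fold
family_fold = mean_between_families / mean_congeneric     # roughly 1.4-fold
total_bins = sum(bins.values())                           # should equal 1275
discordant_share = bins["discordant"] / total_bins        # just under half

print(f"congeneric / conspecific: {congeneric_fold:.2f}")
print(f"between families / congeneric: {family_fold:.2f}")
print(f"total BINs: {total_bins}, discordant share: {discordant_share:.1%}")
```

The three BIN categories sum to the 1275 BINs stated, and the ratios agree with the "approximately 10-fold" and "approximately 1.4-fold" figures.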
Use of DNA barcodes to identify flowering plants.
Kress, W John; Wurdack, Kenneth J; Zimmer, Elizabeth A; Weigt, Lee A; Janzen, Daniel H
2005-06-07
Methods for identifying species by using short orthologous DNA sequences, known as "DNA barcodes," have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We therefore propose the nuclear internal transcribed spacer region and the plastid trnH-psbA intergenic spacer as potentially usable DNA regions for applying barcoding to flowering plants. The internal transcribed spacer is the most commonly sequenced locus used in plant phylogenetic investigations at the species level and shows high levels of interspecific divergence. The trnH-psbA spacer, although short (approximately 450 bp), is the most variable plastid region in angiosperms and is easily amplified across a broad range of land plants. Comparison of the total plastid genomes of tobacco and deadly nightshade, enhanced with trials on widely divergent angiosperm taxa, including closely related species in seven plant families and a group of species sampled from a local flora encompassing 50 plant families (for a total of 99 species, 80 genera, and 53 families), suggests that the sequences in this pair of loci have the potential to discriminate among the largest number of plant species for barcoding purposes.
Maloukh, Lina; Kumarappan, Alagappan; Jarrar, Mohammad; Salehi, Jawad; El-Wakil, Houssam; Rajya Lakshmi, T V
2017-06-01
DNA barcoding of United Arab Emirates (UAE) native plants is of high practical and scientific value as the plants adapt to very harsh environmental conditions that challenge their identification. Fifty-one plant species belonging to 22 families (2 monocot and 20 eudicot) were collected, with legumes and grasses contributing the largest numbers of species. To authenticate the morphological identification of the wild plant taxa, rbcL and matK regions were used in the study. The primer universality and discriminatory power of rbcL was 100%, while it was 35% for the matK locus for these plant species. The sequences were submitted to GenBank; accession numbers were obtained for all the rbcL sequences and for 6 of the matK sequences. We suggest rbcL as a promising barcode locus for the tested group of 51 plants. In the present study, an inexpensive, simple method for identifying rare desert plant taxa through the rbcL barcode is reported.
Thaler, David S; Stoeckle, Mark Y
2016-10-01
DNA barcodes for species identification and the analysis of human mitochondrial variation have developed as independent fields even though both are based on sequences from animal mitochondria. This study finds questions within each field that can be addressed by reference to the other. DNA barcodes are based on a 648-bp segment of the mitochondrially encoded cytochrome oxidase I. From most species, this segment is the only sequence available. It is impossible to know whether it fairly represents overall mitochondrial variation. For modern humans, the entire mitochondrial genome is available from thousands of healthy individuals. SNPs in the human mitochondrial genome are evenly distributed across all protein-encoding regions arguing that COI DNA barcode is representative. Barcode variation among related species is largely based on synonymous codons. Data on human mitochondrial variation support the interpretation that most - possibly all - synonymous substitutions in mitochondria are selectively neutral. DNA barcodes confirm reports of a low variance in modern humans compared to nonhuman primates. In addition, DNA barcodes allow the comparison of modern human variance to many other extant animal species. Birds are a well-curated group in which DNA barcodes are coupled with census and geographic data. Putting modern human variation in the context of intraspecies variation among birds shows humans to be a single breeding population of average variance.
Integrating DNA barcode data and taxonomic practice: determination, discovery, and description.
Goldstein, Paul Z; DeSalle, Rob
2011-02-01
DNA barcodes, like traditional sources of taxonomic information, are potentially powerful heuristics in the identification of described species but require mindful analytical interpretation. The role of DNA barcoding in generating hypotheses of new taxa in need of formal taxonomic treatment is discussed, and it is emphasized that the recursive process of character evaluation is both necessary and best served by understanding the empirical mechanics of the discovery process. These undertakings carry enormous ramifications not only for the translation of DNA sequence data into taxonomic information but also for our comprehension of the magnitude of species diversity and its disappearance. This paper examines the potential strengths and pitfalls of integrating DNA sequence data, specifically in the form of DNA barcodes as they are currently generated and analyzed, with taxonomic practice.
Osmundson, Todd W.; Robert, Vincent A.; Schoch, Conrad L.; Baker, Lydia J.; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M.
2013-01-01
Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1–2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa. PMID:23638077
Rapidly evolving homing CRISPR barcodes
Kalhor, Reza; Mali, Prashant; Church, George M.
2017-01-01
We present here an approach for engineering evolving DNA barcodes in living cells. The methodology entails using a homing guide RNA (hgRNA) scaffold that directs the Cas9-hgRNA complex to target the DNA locus of the hgRNA itself. We show that this homing CRISPR-Cas9 system acts as an expressed genetic barcode that diversifies its sequence and that the rate of diversification can be controlled in cultured cells. We further evaluate these barcodes in cell populations and show the barcode RNAs can be assayed as single molecules in situ. This integrated approach will have wide ranging applications, such as in deep lineage tracing, cellular barcoding, molecular recording, dissecting cancer biology, and connectome mapping. PMID:27918539
DNA Barcoding through Quaternary LDPC Codes
Tapia, Elizabeth; Spetale, Flavio; Krsticevic, Flavia; Angelone, Laura; Bulacio, Pilar
2015-01-01
For many parallel applications of Next-Generation Sequencing (NGS) technologies, short barcodes that can accurately multiplex a large number of samples are in demand. To address these competing requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine scale with regard to barcode size (BCH) or have intrinsically poor error-correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10^-2 per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate on the order of 10^-9, at the expense of a rate of read losses on the order of just 10^-6. PMID:26492348
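The trade-off between read losses and sample misidentification described in this abstract can be illustrated with a generic minimum-distance demultiplexer. The sketch below is not the LDPC decoder of the study; it assumes a hypothetical set of 6-nt barcodes (`BARCODES`) with pairwise Hamming distance of at least 3, and a hypothetical `max_mismatches` threshold. A read is assigned only when exactly one barcode lies within the threshold; otherwise it is discarded, which counts as a read loss rather than a risk of misidentification.

```python
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read: str, barcodes: Sequence[str],
                  max_mismatches: int = 1) -> Optional[int]:
    """Return the index of the unique closest barcode, or None (read loss)."""
    distances = [hamming(read, bc) for bc in barcodes]
    best = min(distances)
    # Reject reads that are too far from every barcode or tied between two.
    if best > max_mismatches or distances.count(best) > 1:
        return None
    return distances.index(best)


# Hypothetical barcode set (illustration only), minimum pairwise distance 3.
BARCODES = ["AAAAAA", "AAACCC", "CCCAAA", "CCCCCC"]

print(assign_sample("AAAACA", BARCODES))  # one mismatch from barcode 0
print(assign_sample("AACCAA", BARCODES))  # two mismatches: read is discarded
```

Raising `max_mismatches` recovers more reads but, for a fixed minimum distance between barcodes, increases the chance that a corrupted read is closer to the wrong barcode.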
Wang, Y J; Li, Z H; Zhang, S F; Varadínová, Z; Jiang, F; Kučerová, Z; Stejskal, V; Opit, G; Cao, Y; Li, F J
2014-10-01
Several species of the genus Cryptolestes Ganglbauer, 1899 (Coleoptera: Laemophloeidae) are commonly found in stored products. In this study, five species of Cryptolestes, with almost worldwide distribution, were obtained from laboratories in China, Czech Republic and the USA: Cryptolestes ferrugineus (Stephens, 1831), Cryptolestes pusillus (Schönherr, 1817), Cryptolestes turcicus (Grouvelle, 1876), Cryptolestes pusilloides (Steel & Howe, 1952) and Cryptolestes capensis (Waltl, 1834). Molecular identification based on a 658 bp fragment from the mitochondrial DNA cytochrome c oxidase subunit I (COI) was adopted to overcome some problems of morphological identification of Cryptolestes species. The utility of COI sequences as DNA barcodes in discriminating the five Cryptolestes species was evaluated on adults and larvae by analysing Kimura 2-parameter distances, phylogenetic tree and haplotype networks. The results showed that molecular approaches based on DNA barcodes were able to accurately identify these species. This is the first study using DNA barcoding to identify Cryptolestes species and the gathered DNA sequences will complement the biological barcode database.
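For readers unfamiliar with the Kimura 2-parameter (K2P) distance mentioned above, the following is a minimal sketch of the standard formula, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name `k2p_distance` and the example sequences are hypothetical; the study's own software and settings are not stated in the abstract.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (non-positive argument).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical 10-bp alignment with a single C<->T transition.
print(round(k2p_distance("ACGTACGTAC", "ACGTACGTAT"), 4))
```

With one transition in ten sites, the corrected distance is slightly larger than the raw proportion of differences (0.1), which reflects the model's allowance for multiple substitutions at the same site.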
Escaping introns in COI through cDNA barcoding of mushrooms: Pleurotus as a test case.
Avin, Farhat A; Subha, Bhassu; Tan, Yee-Shin; Braukmann, Thomas W A; Vikineswary, Sabaratnam; Hebert, Paul D N
2017-09-01
DNA barcoding involves the use of one or more short, standardized DNA fragments for the rapid identification of species. A 648-bp segment near the 5' terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the universal DNA barcode for members of the animal kingdom, but its utility in mushrooms is complicated by the frequent occurrence of large introns. As a consequence, ITS has been adopted as the standard DNA barcode marker for mushrooms despite several shortcomings. This study employed newly designed primers coupled with cDNA analysis to examine COI sequence diversity in six species of Pleurotus and compared these results with those for ITS. The ability of the COI gene to discriminate six species of Pleurotus, the commonly cultivated oyster mushroom, was examined by analysis of cDNA. The amplification success, sequence variation within and among species, and the ability to design effective primers were tested. We compared ITS sequences to their COI cDNA counterparts for all isolates. ITS discriminated between all six species, but some sequence results were uninterpretable because of length variation among ITS copies. By comparison, complete COI sequences were recovered from all but three individuals of Pleurotus giganteus, where only the 5' region was obtained. The COI sequences permitted the resolution of all species when partial data were excluded for P. giganteus. Our results suggest that COI can be a useful barcode marker for mushrooms when cDNA analysis is adopted, permitting identifications in cases where ITS cannot be recovered or where it offers higher resolution when fresh tissue is available. The suitability of this approach remains to be confirmed for other mushrooms.
Tripathi, Abhinandan Mani; Tyagi, Antariksh; Kumar, Anoop; Singh, Akanksha; Singh, Shivani; Chaudhary, Lal Babu; Roy, Sribash
2013-01-01
DNA barcoding as a tool for species identification has been successful in animals and other organisms, including certain groups of plants. The exploration of this new tool for species identification, particularly in tree species, remains scanty in biodiversity-rich countries like India. rbcL and matK are standard barcode loci, while ITS and trnH-psbA are considered supplementary loci for plants. Plant barcode loci, namely, rbcL, matK, ITS, trnH-psbA, and the recently proposed ITS2, were tested for their efficacy as barcode loci using 300 accessions of tropical tree species. We tested these loci for PCR, sequencing success, and species discrimination ability using three methods. rbcL was the best locus as far as PCR and sequencing success rate were concerned, but not for the species discrimination ability of tropical tree species. ITS and trnH-psbA were the second best loci in PCR and sequencing success, respectively. The species discrimination ability of ITS ranged from 24.4 percent to 74.3 percent and that of trnH-psbA from 25.6 percent to 67.7 percent, depending upon the data set and the method used. matK provided the least PCR success, followed by ITS2 (59.0%). Species resolution by ITS2 and rbcL ranged from 9.0 percent to 48.7 percent and 13.2 percent to 43.6 percent, respectively. Further, we observed that the barcode loci studied here are poorly represented in the NCBI nucleotide database for tree species. Although a success rate of 60-70 percent for both ITS and trnH-psbA, taken conservatively, may not be considered highly successful, it would certainly help in large-scale biodiversity inventorization, particularly for tropical tree species, considering the standard success rate of the plant DNA barcode program reported so far. The recommended matK and rbcL primer combination may not work in tropical tree species as barcode markers.
ERIC Educational Resources Information Center
Rickards, Debbie; Hawes, Shirl
This book is meant as a resource that can be used by experienced writing teachers as a practical reference when planning writing lessons. The book includes a sequence of instruction, lesson ideas, enrichment options, literature and poetry connections, student samples, and all the ready-to-use materials teachers need to implement the…
[Identification of antler powder components based on DNA barcoding technology].
Jia, Jing; Shi, Lin-chun; Xu, Zhi-chao; Xin, Tian-yi; Song, Jing-yuan; Chen Shi, Lin
2015-10-01
In order to authenticate the components of antler powder in the market, DNA barcoding technology coupled with a cloning method was used. Cytochrome c oxidase subunit I (COI) sequences were obtained according to the DNA barcoding standard operation procedure (SOP). For antler powder with possible mixed components, the cloning method was used to get each COI sequence. A total of 65 COI sequences were successfully obtained from commercial antler powders via sequencing PCR products. The results indicate that only 38% of these samples were derived from Cervus nippon Temminck or Cervus elaphus Linnaeus, which are recorded in the 2010 edition of the "Chinese Pharmacopoeia", while 62% of them were derived from other species. Rangifer tarandus Linnaeus was the most frequent species among the adulterants. Further analysis showed that some samples collected across different regions, companies and price ranges contained adulterants. Analysis of 36 COI sequences obtained by the cloning method showed that C. elaphus and C. nippon were the main components. In addition, some samples were marked clearly as antler powder on the label; however, C. elaphus or R. tarandus were their main components. In summary, DNA barcoding can accurately and efficiently identify the exact contents of commercial antler powder, which provides a new technique to ensure clinical safety and improve quality control of Chinese traditional medicine.
Preparation of Low-Input and Ligation-Free ChIP-seq Libraries Using Template-Switching Technology.
Bolduc, Nathalie; Lehman, Alisa P; Farmer, Andrew
2016-10-10
Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) has become the gold standard for mapping of transcription factors and histone modifications throughout the genome. However, for ChIP experiments involving few cells or targeting low-abundance transcription factors, the small amount of DNA recovered makes ligation of adapters very challenging. In this unit, we describe a ChIP-seq workflow that can be applied to small cell numbers, including a robust single-tube and ligation-free method for preparation of sequencing libraries from sub-nanogram amounts of ChIP DNA. An example ChIP protocol is first presented, resulting in selective enrichment of DNA-binding proteins and cross-linked DNA fragments immobilized on beads via an antibody bridge. This is followed by a protocol for fast and easy cross-linking reversal and DNA recovery. Finally, we describe a fast, ligation-free library preparation protocol, featuring DNA SMART technology, resulting in samples ready for Illumina sequencing. © 2016 by John Wiley & Sons, Inc.
Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P
2010-11-01
Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.
Borisjuk, N; Chu, P; Gutierrez, R; Zhang, H; Acosta, K; Friesen, N; Sree, K S; Garcia, C; Appenroth, K J; Lam, E
2015-01-01
Lemnaceae, commonly called duckweeds, comprise a diverse group of floating aquatic plants that have previously been classified into 37 species based on morphological and physiological criteria. In addition to their unique evolutionary position among angiosperms and their applications in biomonitoring, the potential of duckweeds as a novel sustainable crop for fuel and feed has recently increased interest in the study of their biodiversity and systematics. However, due to their small size and abbreviated structure, accurate typing of duckweeds based on morphology can be challenging. In the past decade, attempts to employ molecular barcoding techniques for species assignment have produced promising results; however, they have yet to be codified into a simple and quantitative protocol. A study that compiles and compares the barcode sequences within all known species of this family would help to establish the fidelity and limits of this DNA-based approach. In this work, we compared the level of conservation between over 100 strains of duckweed for two intergenic barcode sequences derived from the plastid genome. By using over 300 sequences publicly available in the NCBI database, we determined the utility of each of these two barcodes for duckweed species identification. Through sequencing of these barcodes from additional accessions, 30 of the 37 known species of duckweed could be identified with varying levels of confidence using this approach. From our analyses using this reference dataset, we also confirmed two instances where mis-assignment of species has likely occurred. Potential strategies for further improving the scope of this technology are discussed. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
DNA reference libraries of French Guianese mosquitoes for barcoding and metabarcoding
Leroy, Céline; Guidez, Amandine; Dusfour, Isabelle; Girod, Romain; Dejean, Alain; Murienne, Jérôme
2017-01-01
The mosquito family (Diptera: Culicidae) constitutes the most medically important group of arthropods because certain species are vectors of human pathogens. In some parts of the world, the diversity is so high that the accurate delimitation and/or identification of species is challenging. A DNA-based identification system for all animals has been proposed, the so-called DNA barcoding approach. In this study, our objectives were (i) to establish DNA barcode libraries for the mosquitoes of French Guiana based on the COI and the 16S markers, (ii) to compare distance-based and tree-based methods of species delimitation to traditional taxonomy, and (iii) to evaluate the accuracy of each marker in identifying specimens. A total of 266 specimens belonging to 75 morphologically identified species or morphospecies were analyzed allowing us to delimit 86 DNA clusters with only 21 of them already present in the BOLD database. We thus provide a substantial contribution to the global mosquito barcoding initiative. Our results confirm that DNA barcodes can be successfully used to delimit and identify mosquito species with only a few cases where the marker could not distinguish closely related species. Our results also validate the presence of new species identified based on morphology, plus potential cases of cryptic species. We found that both COI and 16S markers performed very well, with successful identifications at the species level of up to 98% for COI and 97% for 16S when compared to traditional taxonomy. This shows great potential for the use of metabarcoding for vector monitoring and eco-epidemiological studies. PMID:28575090
Dahruddin, Hadi; Hutama, Aditya; Busson, Frédéric; Sauri, Sopian; Hanner, Robert; Keith, Philippe; Hadiaty, Renny; Hubert, Nicolas
2017-03-01
Among the 899 species of freshwater fishes reported from the Sundaland biodiversity hotspot, nearly 50% are endemics. The functional integrity of aquatic ecosystems is currently jeopardized by human activities, and landscape conversion has led to the decline of fish populations in several parts of Sundaland, particularly in Java. The inventory of the Javanese ichthyofauna has been discontinuous, and the taxonomic knowledge is scattered in the literature. This study provides a DNA barcode reference library for the inland fishes of Java and Bali with the aim of streamlining the inventory of fishes in this part of Sundaland. Owing to the lack of an available checklist for estimating the taxonomic coverage of this study, a checklist was compiled based on online catalogues. A total of 95 sites were visited, and a library including 1046 DNA barcodes for 159 species was assembled. Nearest neighbour distance was 28-fold higher than maximum intraspecific distance on average, and a DNA barcoding gap was observed. The list of species with DNA barcodes displayed large discrepancies with the checklist compiled here, as only 36% (i.e. 77 species) and 60% (i.e. 24 species) of the known species were sampled in Java and Bali, respectively. This result was contrasted by a high number of new occurrences and the ceiling of the accumulation curves for both species and genera. These results highlight the poor taxonomic knowledge of this ichthyofauna, and the apparent discrepancy between present and historical occurrence data is to be attributed to species extirpations, synonymy and misidentifications in previous studies. © 2016 John Wiley & Sons Ltd.
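The barcoding-gap comparison reported above (nearest neighbour distance versus maximum intraspecific distance) can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: it uses the uncorrected p-distance as a stand-in for whatever distance model the authors applied, and the species labels and sequences in `records` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(records: List[Tuple[str, str]]) -> Dict[str, Tuple[float, float]]:
    """Map species -> (max intraspecific distance, nearest-neighbour distance)."""
    max_intra = {species: 0.0 for species, _ in records}
    nearest = {species: float("inf") for species, _ in records}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra[sp1], d)
        else:
            # Smallest distance to any sequence of a different species.
            nearest[sp1] = min(nearest[sp1], d)
            nearest[sp2] = min(nearest[sp2], d)
    return {sp: (max_intra[sp], nearest[sp]) for sp in max_intra}


# Hypothetical aligned barcodes (illustration only).
records = [
    ("sp_A", "AAAAAAAAAA"),
    ("sp_A", "AAAAAAAAAC"),
    ("sp_B", "CCCCAAAAAA"),
    ("sp_B", "CCCCAAAAAA"),
]
for species, (intra, nn) in barcoding_gap(records).items():
    print(species, intra, nn, "gap" if nn > intra else "no gap")
```

A species shows a barcoding gap when its nearest-neighbour distance exceeds its maximum intraspecific distance; the ratio of the two, averaged over species, corresponds to the kind of fold difference quoted in the abstract.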
In vivo insertion pool sequencing identifies virulence factors in a complex fungal–host interaction
Uhse, Simon; Pflug, Florian G.; Stirnberg, Alexandra; Ehrlinger, Klaus; von Haeseler, Arndt
2018-01-01
Large-scale insertional mutagenesis screens can be powerful genome-wide tools if they are streamlined with efficient downstream analysis, which is a serious bottleneck in complex biological systems. A major impediment to the success of next-generation sequencing (NGS)-based screens for virulence factors is that the genetic material of pathogens is often underrepresented within the eukaryotic host, making detection extremely challenging. We therefore established insertion Pool-Sequencing (iPool-Seq) on maize infected with the biotrophic fungus U. maydis. iPool-Seq features tagmentation, unique molecular barcodes, and affinity purification of pathogen insertion mutant DNA from in vivo-infected tissues. In a proof of concept using iPool-Seq, we identified 28 virulence factors, including 23 that were previously uncharacterized, from an initial pool of 195 candidate effector mutants. Because of its sensitivity and quantitative nature, iPool-Seq can be applied to any insertional mutagenesis library and is especially suitable for genetically complex setups like pooled infections of eukaryotic hosts. PMID:29684023
Conran, John G.; Li, Jie
2015-01-01
Background: Within a regional floristic context, DNA barcoding is especially useful for managing plant diversity inventories on a large scale and developing valuable conservation strategies. However, there are no DNA barcode studies from tropical areas of China, which represent one of the biodiversity hotspots around the world. Methodology and Principal Findings: A DNA barcoding database of Asian tropical trees with high diversity was established at Xishuangbanna Nature Reserve, Yunnan, southwest China using rbcL and matK as standard barcodes, as well as trnH–psbA and ITS as supplementary barcodes. Tree species identification success was assessed using 2,052 accessions from four plots belonging to two vegetation types in the region by three methods: Neighbor-Joining, Maximum-Likelihood and BLAST. We corrected morphological field identification errors (9.6%) for the three plots using rbcL and matK based on a Neighbor-Joining tree. The best barcode region for PCR and sequencing was rbcL (97.6%, 90.8%), followed by trnH–psbA (93.6%, 85.6%), while matK and ITS obtained relatively low PCR and sequencing success rates. However, ITS performed best for both species (44.6–58.1%) and genus (72.8–76.2%) identification, with trnH–psbA slightly less effective for species identification. The two standard barcodes rbcL and matK gave poor results for species identification (24.7–28.5% and 31.6–35.3%). Compared with other studies from comparable tropical forests (e.g. Cameroon, the Amazon and India), the overall performance of the four barcodes for species identification was lower for the Xishuangbanna Nature Reserve, possibly because of differences in species/genus ratios and species composition between these tropical areas. Conclusions/Significance: Although the core barcodes rbcL and matK were not suitable for species identification of tropical trees from Xishuangbanna Nature Reserve, they could still help with identification at the family and genus level. Considering the relative sequence recovery and the species identification performance, we recommend the use of trnH–psbA and ITS in combination as the preferred barcodes for tropical tree species identification in China. PMID:26121045
Digitally encoded DNA nanostructures for multiplexed, single-molecule protein sensing with nanopores
NASA Astrophysics Data System (ADS)
Bell, Nicholas A. W.; Keyser, Ulrich F.
2016-07-01
The simultaneous detection of a large number of different analytes is important in bionanotechnology research and in diagnostic applications. Nanopore sensing is an attractive method in this regard as the approach can be integrated into small, portable device architectures, and there is significant potential for detecting multiple sub-populations in a sample. Here, we show that highly multiplexed sensing of single molecules can be achieved with solid-state nanopores by using digitally encoded DNA nanostructures. Based on the principles of DNA origami, we designed a library of DNA nanostructures in which each member contains a unique barcode; each bit in the barcode is signalled by the presence or absence of multiple DNA dumbbell hairpins. We show that a 3-bit barcode can be assigned with 94% accuracy by electrophoretically driving the DNA structures through a solid-state nanopore. Select members of the library were then functionalized to detect a single, specific antibody through antigen presentation at designed positions on the DNA. This allows us to simultaneously detect four different antibodies of the same isotype at nanomolar concentration levels.
DNA barcoding insect–host plant associations
Jurado-Rivera, José A.; Vogler, Alfried P.; Reid, Chris A.M.; Petitpierre, Eduard; Gómez-Zurita, Jesús
2008-01-01
Short-sequence fragments (‘DNA barcodes’) used widely for plant identification and inventorying remain to be applied to complex biological problems. Host–herbivore interactions are fundamental to coevolutionary relationships of a large proportion of species on the Earth, but their study is frequently hampered by limited or unreliable host records. Here we demonstrate that DNA barcodes can greatly improve this situation as they (i) provide a secure identification of host plant species and (ii) establish the authenticity of the trophic association. Host plants of leaf beetles (subfamily Chrysomelinae) from Australia were identified using the chloroplast trnL(UAA) intron as barcode amplified from beetle DNA extracts. Sequence similarity and phylogenetic analyses provided precise identifications of each host species at tribal, generic and specific levels, depending on the available database coverage in various plant lineages. The 76 species of Chrysomelinae included—more than 10 per cent of the known Australian fauna—feed on 13 plant families, with preference for Australian radiations of Myrtaceae (eucalypts) and Fabaceae (acacias). Phylogenetic analysis of beetles shows general conservation of host association but with rare host shifts between distant plant lineages, including a few cases where barcodes supported two phylogenetically distant host plants. The study demonstrates that plant barcoding is already feasible with the current publicly available data. By sequencing plant barcodes directly from DNA extractions made from herbivorous beetles, strong physical evidence for the host association is provided. Thus, molecular identification using short DNA fragments brings together the detection of species and the analysis of their interactions. PMID:19004756
Guo, Shaokun; He, Jia; Zhao, Zihua; Liu, Lijun; Gao, Liyuan; Wei, Shuhua; Guo, Xiaoyu; Zhang, Rong; Li, Zhihong
2017-12-12
Neoceratitis asiatica (Becker), which especially infests wolfberry (Lycium barbarum L.), can cause serious economic losses every year in China, especially to organic wolfberry production. In some important wolfberry plantings, it is difficult and time-consuming to rear the larvae or pupae to adults for morphological identification. Molecular identification based on DNA barcodes is a solution to the problem. In this study, 15 samples were collected from Ningxia, China. Among them, five adults were identified according to their morphological characteristics. The utility of the mitochondrial DNA (mtDNA) cytochrome c oxidase I (COI) gene sequence as a DNA barcode in distinguishing N. asiatica was evaluated by analysing Kimura 2-parameter distances and phylogenetic trees. There were significant differences between intra-specific and inter-specific genetic distances according to the barcoding gap analysis. The uncertain larval and pupal samples were within the same cluster as N. asiatica adults and formed a sister cluster to N. cyanescens. A combination of morphological and molecular methods enabled accurate identification of N. asiatica. This is the first study using DNA barcodes to identify N. asiatica, and the obtained DNA sequences will be added to the DNA barcode database.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes.
Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar
2012-01-01
DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP on a Linux platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS will be useful as an information system in fish molecular taxonomy, phylogeny and genomics. The database is available for free at http://mail.nbfgr.res.in/fbis/
DNA barcodes for 1/1000 of the animal kingdom.
Hebert, Paul D N; Dewaard, Jeremy R; Landry, Jean-François
2010-06-23
This study reports DNA barcodes for more than 1300 Lepidoptera species from the eastern half of North America, establishing that 99.3 per cent of these species possess diagnostic barcode sequences. Intraspecific divergences averaged just 0.43 per cent among this assemblage, but most values were lower. The mean was elevated by deep barcode divergences (greater than 2%) in 5.1 per cent of the species, often involving the sympatric occurrence of two barcode clusters. A few of these cases have been analysed in detail, revealing species overlooked by the current taxonomic system. This study also provided a large-scale test of the extent of regional divergence in barcode sequences, indicating that geographical differentiation in the Lepidoptera of eastern North America is small, even when comparisons involve populations as much as 2800 km apart. The present results affirm that a highly effective system for the identification of Lepidoptera in this region can be built with few records per species because of the limited intra-specific variation. As most terrestrial and marine taxa are likely to possess a similar pattern of population structure, an effective DNA-based identification system can be developed with modest effort.
Hoshino, Tatsuhiko; Inagaki, Fumio
2017-01-01
Next-generation sequencing (NGS) is a powerful tool for analyzing environmental DNA and provides a comprehensive molecular view of microbial communities. For obtaining the copy number of particular sequences in the NGS library, however, additional quantitative analysis such as quantitative PCR (qPCR) or digital PCR (dPCR) is required. Furthermore, the number of sequences in a sequence library does not always reflect the original copy number of a target gene because of biases caused by PCR amplification, making it difficult to convert the proportion of particular sequences in the NGS library to the copy number using the mass of input DNA. To address this issue, we applied a stochastic labeling approach with random-tag sequences and developed an NGS-based quantification protocol, which enables simultaneous sequencing and quantification of the targeted DNA. This quantitative sequencing (qSeq) is initiated from single-primer extension (SPE) using a primer with a random tag adjacent to the 5' end of the target-specific sequence. During SPE, each DNA molecule is stochastically labeled with the random tag. Subsequently, first-round PCR is conducted, specifically targeting the SPE product, followed by second-round PCR to index for NGS. The number of random tags is determined only during the SPE step and is therefore not affected by the two rounds of PCR that may introduce amplification biases. In the case of 16S rRNA genes, after NGS sequencing and taxonomic classification, the absolute number of 16S rRNA gene copies of target phylotypes can be estimated by Poisson statistics by counting the random tags incorporated at the end of the sequence. To test the feasibility of this approach, the 16S rRNA gene of Sulfolobus tokodaii was subjected to qSeq, which resulted in accurate quantification of 5.0 × 10^3 to 5.0 × 10^4 copies of the 16S rRNA gene. 
Furthermore, qSeq was applied to mock microbial communities and environmental samples, and the results were comparable to those obtained using digital PCR and relative abundance based on a standard sequence library. We demonstrated that the qSeq protocol proposed here is advantageous for providing less-biased absolute copy numbers of each target DNA with NGS sequencing at one time. With this new experimental scheme in microbial ecology, microbial community compositions can be explored in a more quantitative manner, thus expanding our knowledge of microbial ecosystems in natural environments.
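The idea of estimating copy numbers from counted random tags can be sketched as follows. This is a generic occupancy-style Poisson correction, not necessarily the exact estimator used in qSeq: it assumes each molecule receives one of N = 4^L equally likely tags of length L, so that observing k distinct tags suggests roughly n = -N ln(1 - k/N) labeled molecules. The function name and the tag length of 8 nt used in the example are hypothetical.

```python
import math


def estimate_molecules(distinct_tags, tag_length):
    """Estimate the number of labeled molecules from distinct random tags.

    Assumptions (not taken from the abstract): tags are drawn uniformly from
    all 4**tag_length sequences, and two molecules may share a tag by chance.
    """
    tag_space = 4 ** tag_length
    if not 0 <= distinct_tags < tag_space:
        raise ValueError("distinct_tags must be in [0, 4**tag_length)")
    # Expected distinct tags for n molecules: N * (1 - exp(-n / N)).
    # Solving for n gives the estimate below.
    return -tag_space * math.log(1 - distinct_tags / tag_space)


if __name__ == "__main__":
    # Hypothetical example: 5000 distinct 8-nt tags seen for one phylotype.
    print(round(estimate_molecules(5000, 8)))
```

When the number of distinct tags is small relative to the tag space, the estimate is close to the raw tag count; the correction matters only as the tag space starts to saturate.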
BAsE-Seq: a method for obtaining long viral haplotypes from short sequence reads.
Hong, Lewis Z; Hong, Shuzhen; Wong, Han Teng; Aw, Pauline P K; Cheng, Yan; Wilm, Andreas; de Sessions, Paola F; Lim, Seng Gee; Nagarajan, Niranjan; Hibberd, Martin L; Quake, Stephen R; Burkholder, William F
2014-01-01
We present a method for obtaining long haplotypes, of over 3 kb in length, using a short-read sequencer, Barcode-directed Assembly for Extra-long Sequences (BAsE-Seq). BAsE-Seq relies on transposing a template-specific barcode onto random segments of the template molecule and assembling the barcoded short reads into complete haplotypes. We applied BAsE-Seq on mixed clones of hepatitis B virus and accurately identified haplotypes occurring at frequencies greater than or equal to 0.4%, with >99.9% specificity. Applying BAsE-Seq to a clinical sample, we obtained over 9,000 viral haplotypes, which provided an unprecedented view of hepatitis B virus population structure during chronic infection. BAsE-Seq is readily applicable for monitoring quasispecies evolution in viral diseases.
Four years of DNA barcoding: current advances and prospects.
Frézal, Lise; Leblois, Raphael
2008-09-01
Research using cytochrome c oxidase barcoding techniques on zoological specimens was initiated by Hebert et al. [Hebert, P.D.N., Ratnasingham, S., deWaard, J.R., 2003. Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc. R. Soc. Lond. B 270, S96-S99]. By March 2004, the Consortium for the Barcode of Life started to promote the use of a standardized DNA barcoding approach, consisting of identifying a specimen as belonging to a certain animal species based on a single universal marker: the DNA barcode sequence. Over the last 4 years, this approach has become increasingly popular, and advances as well as limitations have clearly emerged as increasing numbers of organisms have been studied. Our purpose is to briefly present the DNA Barcode of Life principles, pros and cons, relevance and universality. The initially proposed Barcode of Life framework has greatly evolved, giving rise to a flexible description of DNA barcoding and a larger range of applications.
Song, Chao; Wang, Qian; Zhang, Ruilei; Sun, Bingjiao; Wang, Xinhua
2016-02-16
In this study, we tested the utility of the mitochondrial gene cytochrome c oxidase subunit 1 (CO1) as the barcode region to deal with taxonomical problems of Polypedilum (Tripodura) non-biting midges (Diptera: Chironomidae). The 114 DNA barcodes representing 27 morphospecies are divided into 33 well separated clusters based on both Neighbor Joining and Maximum Likelihood methods. DNA barcodes revealed an 82% success rate in matching with morphospecies. The selected DNA barcode data support 37-64 operational taxonomic units (OTUs) based on the methods of Automatic Barcode Gap Discovery (ABGD) and Poisson Tree Process (PTP). Furthermore, a priori species based on consistent phenotypic variations were attested by molecular analysis, and a taxonomical misidentification of barcode sequences from GenBank was found. We could not observe a distinct barcode gap but an overlap ranged from 9-12%. Our results supported DNA barcoding as an ideal method to detect cryptic species, delimit sibling species, and associate different life stages in non-biting midges.
Spelda, Jörg; Reip, Hans S.; Oliveira–Biener, Ulla; Melzer, Roland R.
2011-01-01
Abstract We give a first account of our ongoing barcoding activities on Bavarian myriapods in the framework of the Barcoding Fauna Bavarica project and IBOL, the International Barcode of Life. Having analyzed 126 taxa (including 122 species) belonging to all major German chilopod and diplopod lineages, often using four or more specimens each, at the moment our species stock includes 82% of the diplopods and 65% of the chilopods found in Bavaria, southern Germany. The partial COI sequences allow correct identification of more than 95% of the current set of Bavarian species. Moreover, most of the myriapod orders and families appear as distinct clades in neighbour-joining trees, although the phylogenetic relationships between them are not always depicted correctly. We give examples of (1) high interspecific sequence variability among closely related species; (2) low interspecific variability in some chordeumatidan genera, indicating that recent speciations cannot be resolved with certainty using COI DNA barcodes; (3) high intraspecific variation in some genera, suggesting the existence of cryptic lineages; and (4) the possible polyphyly of some taxa, i.e. the chordeumatidan genus Ochogona. This shows that, in addition to species identification, our data may be useful in various ways in the context of species delimitations, taxonomic revisions and analyses of ongoing speciation processes. PMID:22303099
Delineating Species with DNA Barcodes: A Case of Taxon Dependent Method Performance in Moths
Kekkonen, Mari; Mutanen, Marko; Kaila, Lauri; Nieminen, Marko; Hebert, Paul D. N.
2015-01-01
The accelerating loss of biodiversity has created a need for more effective ways to discover species. Novel algorithmic approaches for analyzing sequence data combined with rapidly expanding DNA barcode libraries provide a potential solution. While several analytical methods are available for the delineation of operational taxonomic units (OTUs), few studies have compared their performance. This study compares the performance of one morphology-based and four DNA-based (BIN, parsimony networks, ABGD, GMYC) methods on two groups of gelechioid moths. It examines 92 species of Finnish Gelechiinae and 103 species of Australian Elachistinae which were delineated by traditional taxonomy. The results reveal a striking difference in performance between the two taxa with all four DNA-based methods. OTU counts in the Elachistinae showed a wider range and a relatively low (ca. 65%) OTU match with reference species while OTU counts were more congruent and performance was higher (ca. 90%) in the Gelechiinae. Performance rose when only monophyletic species were compared, but the taxon-dependence remained. None of the DNA-based methods produced a correct match with non-monophyletic species, but singletons were handled well. A simulated test of morphospecies-grouping performed very poorly in revealing taxon diversity in these small, dull-colored moths. Despite the strong performance of analyses based on DNA barcodes, species delineated using single-locus mtDNA data are best viewed as OTUs that require validation by subsequent integrative taxonomic work. PMID:25849083
Chen, Weitao; Ma, Xiuhui; Shen, Yanjun; Mao, Yuntao; He, Shunping
2015-11-30
Nujiang River (NR), an essential component of the biodiversity hotspot of the Mountains of Southwest China, possesses a characteristic fish fauna and contains endemic species. Although previous studies on fish diversity in the NR have primarily consisted of listings of the fish species observed during field collections, in our study, we DNA-barcoded 1139 specimens belonging to 46 morphologically distinct fish species distributed throughout the NR basin by employing multiple analytical approaches. According to our analyses, DNA barcoding is an efficient method for the identification of fish by the presence of barcode gaps. However, three invasive species are characterized by deep conspecific divergences, generating multiple lineages and Operational Taxonomic Units (OTUs), implying the possibility of cryptic species. At the other end of the spectrum, ten species (from three genera) that are characterized by an overlap between their intra- and interspecific genetic distances form a single genetic cluster and share haplotypes. The neighbor-joining phenogram, Barcode Index Numbers (BINs) and Automatic Barcode Gap Discovery (ABGD) identified 43 putative species, while the General Mixed Yule-coalescence (GMYC) identified five more OTUs. Thus, our study established a reliable DNA barcode reference library for the fish in the NR and sheds new light on the local fish diversity. PMID:26616046
Use of DNA barcodes to identify flowering plants
Kress, W. John; Wurdack, Kenneth J.; Zimmer, Elizabeth A.; Weigt, Lee A.; Janzen, Daniel H.
2005-01-01
Methods for identifying species by using short orthologous DNA sequences, known as “DNA barcodes,” have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We therefore propose the nuclear internal transcribed spacer region and the plastid trnH-psbA intergenic spacer as potentially usable DNA regions for applying barcoding to flowering plants. The internal transcribed spacer is the most commonly sequenced locus used in plant phylogenetic investigations at the species level and shows high levels of interspecific divergence. The trnH-psbA spacer, although short (≈450-bp), is the most variable plastid region in angiosperms and is easily amplified across a broad range of land plants. Comparison of the total plastid genomes of tobacco and deadly nightshade enhanced with trials on widely divergent angiosperm taxa, including closely related species in seven plant families and a group of species sampled from a local flora encompassing 50 plant families (for a total of 99 species, 80 genera, and 53 families), suggest that the sequences in this pair of loci have the potential to discriminate among the largest number of plant species for barcoding purposes. PMID:15928076
NASA Astrophysics Data System (ADS)
Closek, C. J.; Langevin, S.; Burge, C. A.; Crosson, L.; White, S.; Friedman, C. S.
2016-02-01
Withering syndrome (WS), caused by the bacterium Candidatus Xenohaliotis californiensis, a Rickettsia-like organism (RLO), infects many species of abalone. Black abalone (Haliotis cracherodii), one of two endangered species of abalone, has experienced high population losses along the California coast due to WS. Recently, we observed reduced pathogenicity and mortality events in RLO-infected abalone when a novel bacteriophage (phage) was also present. To better understand phage-bacterium dynamics and develop more informative diagnostic tools, we sequenced the genome of the novel phage associated with the RLO responsible for WS. Metagenomic sequencing libraries were prepared with extracted genomic DNA from two experimentally infected H. cracherodii and phage sequences were enriched using hydroxyapatite chromatography normalization. Normalized libraries were individually barcoded and sequenced with Illumina MiSeq. Raw sequence reads were processed using VIrominer and de novo assembly produced one single phage-like contig (35.7Kb) from the experimentally infected abalone. This highly divergent genome had closest homology with a virus associated with abalone shriveling syndrome (SS). Of the 34 predicted ORFs, overlapping homology with the SS virus ranged from 20-72%, demonstrating the phage sequenced is genetically distinct from any known phage. The phage-like sequences represented a significant portion of the total reads sequenced (2 million of the 12 million paired-end reads; 17%) and we obtained 94,000X coverage across the novel phage genome. Beyond characterization of this novel phage, which appears to reduce pathogenicity of the RLO, the genome enabled us to develop quantitative PCR and in situ hybridization assays as diagnostic tools. These tools allow us to detect and quantify this phage in the endangered H. cracherodii.
Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing
2012-01-01
Background Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions). Results We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent ‘read-backmapping’ to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per Hiseq2000 exome sample and detected ~5% more SNPs than the conventional whole genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach. 
Conclusions We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results. PMID:22913592
High-throughput, image-based screening of pooled genetic variant libraries
Emanuel, George; Moffitt, Jeffrey R.; Zhuang, Xiaowei
2018-01-01
Image-based, high-throughput screening of genetic perturbations will advance both biology and biotechnology. We report a high-throughput screening method that allows diverse genotypes and corresponding phenotypes to be imaged in numerous individual cells. We achieve genotyping by introducing barcoded genetic variants into cells and using massively multiplexed FISH to measure the barcodes. We demonstrated this method by screening mutants of the fluorescent protein YFAST, yielding brighter and more photostable YFAST variants. PMID:29083401
Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A
2011-01-01
Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach. PMID:22002916
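The step of sorting raw reads by 10-bp multiplexing identifier and writing them out as FASTA can be sketched in a few lines. This is not BarcodeCrucher's implementation, which the abstract does not describe; it is a minimal illustration that assumes the identifier sits at the very start of each read and must match exactly. All names (demultiplex, to_fasta, the sample labels) are hypothetical.

```python
from collections import defaultdict

MID_LENGTH = 10  # length of the multiplexing identifier, per the abstract


def demultiplex(reads, mid_to_sample):
    """Group (name, sequence) reads by the sample their leading MID encodes.

    Assumption: exact matching only; reads with an unknown MID are kept
    untrimmed under the key "unassigned".
    """
    by_sample = defaultdict(list)
    for name, seq in reads:
        mid, insert = seq[:MID_LENGTH].upper(), seq[MID_LENGTH:]
        sample = mid_to_sample.get(mid)
        if sample is None:
            by_sample["unassigned"].append((name, seq))
        else:
            by_sample[sample].append((name, insert))  # MID trimmed off
    return dict(by_sample)


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

A real pipeline would additionally apply quality filtering and sort reads by gene (for example by primer sequence), as the abstract indicates; tolerance for sequencing errors in the identifier is another design choice left out here.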
Polseela, Raxsina; Jaturas, Narong; Thanwisai, Aunchalee; Sing, Kong-Wah; Wilson, John-James
2016-09-01
Sandflies vary in their distributions and role in pathogen transmission. Attempts to record distributions of sandflies in Thailand have faced difficulties due to their high abundance and diversity. We aim to provide an insight into the diversity of sandflies in Thailand by (i) conducting a literature review, and (ii) DNA barcoding sandflies collected from Wihan Cave where eight morphologically characterized species were recorded. DNA barcodes generated for 193 sandflies fell into 13 distinct species clusters under four genera (Chinius, Idiophlebotomus, Phlebotomus and Sergentomyia). Five of these species could be assigned Linnaean species names unambiguously and two others corresponded to characterized morphospecies. Two species represented a complex under the name Sergentomyia barraudi while the remaining four had not been recognized before in any form. The resulting species checklist and DNA barcode library contribute to a growing set of records for sandflies which is useful for monitoring and vector control.
[Identification of Tibetan medicine "Dida" of Gentianaceae using DNA barcoding].
Liu, Chuan; Zhang, Yu-Xin; Liu, Yue; Chen, Yi-Long; Fan, Gang; Xiang, Li; Xu, Jiang; Zhang, Yi
2016-02-01
The ITS2 barcode was used to identify the Tibetan medicine "Dida" and to ensure its quality and safety in medication. A total of 151 experimental samples representing 13 species were collected from the Tibetan Plateau, covering the Gentianaceae genera Swertia, Halenia, Gentianopsis, Comastoma and Lomatogonium. ITS2 sequences were amplified, and purified PCR products were sequenced. Sequence assembly and consensus sequence generation were performed using CodonCode Aligner V3.7.1. The Kimura 2-Parameter (K2P) distances were calculated using MEGA 6.0, and neighbor-joining (NJ) phylogenetic trees were constructed. The aligned ITS2 sequences spanned 231 bp and comprised 31 haplotypes, with an average G+C content of 61.40%. The NJ tree strongly supported the clustering of each species into its own clade, giving a high identification success rate, except that Swertia bifolia and Swertia wolfangiana could not be distinguished from each other based on sequence divergence. DNA barcoding could be used as a fast and accurate method to identify the Tibetan medicine "Dida" and ensure its safe use. Copyright© by the Chinese Pharmaceutical Association.
Applications of three DNA barcodes in assorting intertidal red macroalgal flora in Qingdao, China
NASA Astrophysics Data System (ADS)
Zhao, Xiaobo; Pang, Shaojun; Shan, Tifeng; Liu, Feng
2013-03-01
This study is part of the endeavor to construct a comprehensive DNA barcoding database for common seaweeds in China. Identification of red seaweeds, which have simple morphology and anatomy, is sometimes difficult when based solely on morphological characteristics. In recent years, the DNA barcode technique has become an increasingly effective tool to help solve some of these taxonomic difficulties. Some DNA markers such as COI (cytochrome oxidase subunit I) have been proposed as standardized DNA barcodes for all seaweed species. In this study, COI, UPA (universal plastid amplicon, domain V of 23S rRNA), and ITS (nuclear internal transcribed spacer) were employed to analyze common species of intertidal red seaweeds in Qingdao (119.3°-121°E, 35.35°-37.09°N). The applicability of using one or a few combined barcodes to identify red seaweed species was tested. The results indicated that COI is a sensitive marker at the species level. However, not all the tested species gave PCR amplification products, owing to the lack of universal primers. The second barcode, UPA, had effective universal primers but still needs to be tested for its effectiveness in resolving closely related species. More than one ITS sequence type was found in some species in this investigation, which might lead to confusion in further analysis. Therefore, the ITS sequence is not recommended as a universal barcode for seaweed identification.
The D1-D2 region of the large subunit ribosomal DNA as barcode for ciliates.
Stoeck, T; Przybos, E; Dunthorn, M
2014-05-01
Ciliates are a major evolutionary lineage within the alveolates, which are distributed in nearly all habitats on our planet and are an essential component for ecosystem function, processes and stability. Accurate identification of these unicellular eukaryotes through, for example, microscopy or mating type reactions is reserved to few specialists. To satisfy the demand for a DNA barcode for ciliates, which meets the standard criteria for DNA barcodes defined by the Consortium for the Barcode of Life (CBOL), we here evaluated the D1-D2 region of the ribosomal DNA large subunit (LSU-rDNA). Primer universality for the phylum Ciliophora was tested in silico with available database sequences as well as in the laboratory with 73 ciliate species, which represented nine of 12 ciliate classes. Primers tested in this study were successful for all tested classes. To test the ability of the D1-D2 region to resolve conspecific and congeneric sequence divergence, 63 Paramecium strains were sampled from 24 mating species. The average conspecific D1-D2 variation was 0.18%, whereas congeneric sequence divergence averaged 4.83%. In pairwise genetic distance analyses, we identified a D1-D2 sequence divergence of <0.6% as an ideal threshold to discriminate Paramecium species. Using this definition, only 3.8% of all conspecific and 3.9% of all congeneric sequence comparisons had the potential of false assignments. Neighbour-joining analyses inferred monophyly for all taxa but for two Paramecium octaurelia strains. Here, we present a protocol for easy DNA amplification of single cells and voucher deposition. In conclusion, the presented data pinpoint the D1-D2 region as an excellent candidate for an official CBOL barcode for ciliated protists. © 2013 John Wiley & Sons Ltd.
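The threshold rule reported above (pairs of D1-D2 sequences diverging by less than 0.6% treated as conspecific) can be illustrated with a small sketch. The abstract does not state which distance measure was used, so the uncorrected percentage of differing sites is assumed here, and the function names are hypothetical.

```python
THRESHOLD_PERCENT = 0.6  # divergence threshold reported in the abstract


def percent_divergence(seq1, seq2):
    """Uncorrected pairwise divergence (%) between two aligned sequences.

    Assumption: gaps and ambiguous bases are excluded from the comparison.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    return 100.0 * sum(a != b for a, b in pairs) / len(pairs)


def same_species(seq1, seq2, threshold=THRESHOLD_PERCENT):
    """Apply the threshold rule: below the threshold counts as conspecific."""
    return percent_divergence(seq1, seq2) < threshold
```

As the abstract notes, a small fraction of conspecific and congeneric comparisons fall on the wrong side of any such cutoff, so a rule like this is a screening aid and does not replace tree-based analysis.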
Authentication of Botanical Origin in Herbal Teas by Plastid Noncoding DNA Length Polymorphisms.
Uncu, Ali Tevfik; Uncu, Ayse Ozgur; Frary, Anne; Doganlar, Sami
2015-07-01
The aim of this study was to develop a DNA barcode assay to authenticate the botanical origin of herbal teas. To reach this aim, we tested the efficiency of a PCR-capillary electrophoresis (PCR-CE) approach on commercial herbal tea samples using two noncoding plastid barcodes, the trnL intron and the intergenic spacer between trnL and trnF. Barcode DNA length polymorphisms proved successful in authenticating the species origin of herbal teas. We verified the validity of our approach by sequencing species-specific barcode amplicons from herbal tea samples. Moreover, we displayed the utility of PCR-CE assays coupled with sequencing to identify the origin of undeclared plant material in herbal tea samples. The PCR-CE assays proposed in this work can be applied as routine tests for the verification of botanical origin in herbal teas and can be extended to authenticate all types of herbal foodstuffs.
Randrianjatovo-Gbalou, Irina; Rosario, Sandrine; Sismeiro, Odile; Varet, Hugo; Legendre, Rachel; Coppée, Jean-Yves; Huteau, Valérie; Pochet, Sylvie; Delarue, Marc
2018-05-21
Nucleic acid aptamers, especially RNA, exhibit valuable advantages compared to protein therapeutics in terms of size, affinity and specificity. However, the synthesis of libraries of large random RNAs is still difficult and expensive. The engineering of polymerases able to directly generate these libraries has the potential to replace the chemical synthesis approach. Here, we start with a DNA polymerase that already displays a significant template-free nucleotidyltransferase activity, human DNA polymerase theta, and we mutate it based on the knowledge of its three-dimensional structure as well as previous mutational studies on members of the same polA family. One mutant exhibited a high tolerance towards ribonucleotides (NTPs) and displayed an efficient ribonucleotidyltransferase activity that resulted in the assembly of long RNA polymers. HPLC analysis and RNA sequencing of the products were used to quantify the incorporation of the four NTPs as a function of initial NTP concentrations and established the randomness of each generated nucleic acid sequence. The same mutant revealed a propensity to accept other modified nucleotides and to extend them in long fragments. Hence, this mutant can deliver random natural and modified RNA polymers libraries ready to use for SELEX, with custom lengths and balanced or unbalanced ratios.
Pomerantz, Aaron; Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan
2018-04-01
Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism.
DNA barcoding detects contamination and substitution in North American herbal products
2013-01-01
Background: Herbal products available to consumers in the marketplace may be contaminated or substituted with alternative plant species and fillers that are not listed on the labels. According to the World Health Organization, the adulteration of herbal products is a threat to consumer safety. Our research aimed to investigate herbal product integrity and authenticity with the goal of protecting consumers from health risks associated with product substitution and contamination. Methods: We used DNA barcoding to conduct a blind test of the authenticity for (i) 44 herbal products representing 12 companies and 30 different species of herbs, and (ii) 50 leaf samples collected from 42 herbal species. Our laboratory also assembled the first standard reference material (SRM) herbal barcode library from 100 herbal species of known provenance that were used to identify the unknown herbal products and leaf samples. Results: We recovered DNA barcodes from most herbal products (91%) and all leaf samples (100%), with 95% species resolution using a tiered approach (rbcL + ITS2). Most (59%) of the products tested contained DNA barcodes from plant species not listed on the labels. Although we were able to authenticate almost half (48%) of the products, one-third of these also contained contaminants and/or fillers not listed on the label. Product substitution occurred in 30/44 of the products tested and only 2/12 companies had products without any substitution, contamination or fillers. Some of the contaminants we found pose serious health risks to consumers. Conclusions: Most of the herbal products tested were of poor quality, including considerable product substitution, contamination and use of fillers. These activities dilute the effectiveness of otherwise useful remedies, lowering the perceived value of all related products because of a lack of consumer confidence in them.
We suggest that the herbal industry should embrace DNA barcoding for authenticating herbal products through testing of raw materials used in manufacturing products. The use of an SRM DNA herbal barcode library for testing bulk materials could provide a method for 'best practices' in the manufacturing of herbal products. This would provide consumers with safe, high quality herbal products. PMID:24120035
Falade, Mofolusho O.; Opene, Anthony J.; Benson, Otarigho
2016-01-01
DNA barcoding has been adopted as a gold standard rapid, precise and unifying identification system for animal species and provides a database of genetic sequences that can be used as a tool for universal species identification. In this study, we employed mitochondrial genes 16S rRNA (16S) and cytochrome oxidase subunit I (COI) for the identification of some Nigerian freshwater catfish and Tilapia species. Approximately 655 bp were amplified from the 5′ region of the mitochondrial cytochrome C oxidase subunit I (COI) gene whereas 570 bp were amplified for the 16S rRNA gene. Nucleotide divergences among sequences were estimated based on Kimura 2-parameter distances and the genetic relationships were assessed by constructing phylogenetic trees using the neighbour-joining (NJ) and maximum likelihood (ML) methods. Analyses of consensus barcode sequences for each species, and alignment of individual sequences from within a given species revealed highly consistent barcodes (99% similarity on average), which could be compared with deposited sequences in public databases. The nucleotide distance between species belonging to different genera based on COI ranged from 0.17% between Sarotherodon melanotheron and Coptodon zillii to 0.49% between Clarias gariepinus and C. zillii, indicating that S. melanotheron and C. zillii are closely related. Based on the data obtained, the utility of COI gene was confirmed in accurate identification of three fish species from Southwest Nigeria. PMID:27990256
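The abstract above estimates divergences with Kimura 2-parameter (K2P) distances. As a hedged illustration of what that calculation involves, the sketch below applies the standard K2P formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` is hypothetical, the input is assumed to be a pair of already aligned sequences, and this is not the software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        # Ignore gaps and ambiguity codes.
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError for saturated pairs where an argument is <= 0.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

For closely related sequences the K2P value is only slightly larger than the raw proportion of differing sites; the correction grows as divergence increases.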
Identification of processed Chinese medicinal materials using DNA mini-barcoding.
Song, Ming; Dong, Gang-Qiang; Zhang, Ya-Qin; Liu, Xia; Sun, Wei
2017-07-01
Most Chinese medicinal herbs are subjected to traditional processing procedures, including stir-frying, charring, steaming, boiling, and calcining, before they are released into dispensaries. The marketing and identification of processed medicinal materials is a growing issue in the marketplace. However, conventional methods of identification have limitations, while DNA mini-barcoding, based on the sequencing of a short standardized region, has received considerable attention as a new potential means to identify processed medicinal materials. In the present study, six DNA barcode loci, including ITS2, psbA-trnH, rbcL, matK, the trnL (UAA) intron and its P6 loop, were employed for the authentication of 45 processed samples belonging to 15 species. We evaluated the amplification efficiency of each locus. We also examined the identification accuracy of the potential mini-barcode locus, the trnL (UAA) intron P6 loop. Our results showed that the five primary barcode loci were successfully amplified in only 8.89%-20% of the processed samples, while the amplification rate of the trnL (UAA) intron P6 loop was higher, at 75.56%. We compared the mini-barcode sequences with GenBank using the BLAST program. The analysis showed that 45.23% of samples could be identified to genus level, while only one sample could be identified to the species level. We conclude that the trnL (UAA) P6 loop is a candidate mini-barcode that has shown its potential and may become a universal mini-barcode, complementary to other barcodes, for authenticity testing, and that it will play an important role in the control of medicinal materials. Copyright © 2017 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Stein, Eric D.; White, Bryan P.; Mazor, Raphael D.; Miller, Peter E.; Pilgrim, Erik M.
2013-01-01
Molecular methods, such as DNA barcoding, have the potential to enhance biomonitoring programs worldwide. Altering routinely used sample preservation methods to protect DNA from degradation may pose a potential impediment to application of DNA barcoding and metagenomics for biomonitoring using benthic macroinvertebrates. Using higher volumes or concentrations of ethanol, requirements for shorter holding times, or the need to include additional filtering may increase cost and logistical constraints to existing biomonitoring programs. To address this issue we evaluated the efficacy of various ethanol-based sample preservation methods at maintaining DNA integrity. We evaluated a series of methods that were minimally modified from typical field protocols in order to identify an approach that can be readily incorporated into existing monitoring programs. Benthic macroinvertebrates were collected from a minimally disturbed stream in southern California, USA and subjected to one of six preservation treatments. Ten individuals from five taxa were selected from each treatment and processed to produce DNA barcodes from the mitochondrial gene cytochrome c oxidase I (COI). On average, we obtained successful COI sequences (i.e. either full or partial barcodes) for between 93–99% of all specimens across all six treatments. As long as samples were initially preserved in 95% ethanol, successful sequencing of COI barcodes was not affected by a low dilution ratio of 2∶1, transfer to 70% ethanol, presence of abundant organic matter, or holding times of up to six months. Barcoding success varied by taxa, with Leptohyphidae (Ephemeroptera) producing the lowest barcode success rate, most likely due to poor PCR primer efficiency. Differential barcoding success rates have the potential to introduce spurious results. However, routine preservation methods can largely be used without adverse effects on DNA integrity. PMID:23308097
DNA Barcoding the Native Flowering Plants and Conifers of Wales
de Vere, Natasha; Rich, Tim C. G.; Ford, Col R.; Trinder, Sarah A.; Long, Charlotte; Moore, Chris W.; Satterthwaite, Danielle; Davies, Helena; Allainguillaume, Joel; Ronca, Sandra; Tatarinova, Tatiana; Garbett, Hannah; Walker, Kevin; Wilkinson, Mike J.
2012-01-01
We present the first national DNA barcode resource that covers the native flowering plants and conifers for the nation of Wales (1143 species). Using the plant DNA barcode markers rbcL and matK, we have assembled 97.7% coverage for rbcL, 90.2% for matK, and a dual-locus barcode for 89.7% of the native Welsh flora. We have sampled multiple individuals for each species, resulting in 3304 rbcL and 2419 matK sequences. The majority of our samples (85%) are from DNA extracted from herbarium specimens. Recoverability of DNA barcodes is lower using herbarium specimens, compared to freshly collected material, mostly due to lower amplification success, but this is balanced by the increased efficiency of sampling species that have already been collected, identified, and verified by taxonomic experts. The effectiveness of the DNA barcodes for identification (level of discrimination) is assessed using four approaches: the presence of a barcode gap (using pairwise and multiple alignments), formation of monophyletic groups using Neighbour-Joining trees, and sequence similarity in BLASTn searches. These approaches yield similar results, providing relative discrimination levels of 69.4 to 74.9% of all species and 98.6 to 99.8% of genera using both markers. Species discrimination can be further improved using spatially explicit sampling. Mean species discrimination using barcode gap analysis (with a multiple alignment) is 81.6% within 10×10 km squares and 93.3% for 2×2 km squares. Our database of DNA barcodes for Welsh native flowering plants and conifers represents the most complete coverage of any national flora, and offers a valuable platform for a wide range of applications that require accurate species identification. PMID:22701588
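One of the approaches named in the abstract above is the presence of a barcode gap. A minimal sketch of one common reading of that idea follows: a species shows a gap when its smallest distance to any other species exceeds its largest within-species distance. The function names, the input layout (a list of species and aligned sequence pairs) and the use of uncorrected pairwise distance are all assumptions made for illustration, not the authors' pipeline.

```python
from itertools import combinations


def pairwise_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Map each species to True if min interspecific > max intraspecific distance.

    records: list of (species_name, aligned_sequence) tuples.
    Species with a single specimen have a within-species maximum of 0.0 here,
    so they pass whenever they differ from every other species at all.
    """
    max_intra = {}
    min_inter = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = pairwise_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, float("inf")), d)
    species = {sp for sp, _ in records}
    return {sp: min_inter.get(sp, float("inf")) > max_intra.get(sp, 0.0)
            for sp in species}
```

The share of species for which this check returns True is one way to express a discrimination level of the kind the abstract reports.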
Defining operational taxonomic units using DNA barcode data.
Blaxter, Mark; Mann, Jenna; Chapman, Tom; Thomas, Fran; Whitton, Claire; Floyd, Robin; Abebe, Eyualem
2005-10-29
The scale of diversity of life on this planet is a significant challenge for any scientific programme hoping to produce a complete catalogue, whatever means is used. For DNA barcoding studies, this difficulty is compounded by the realization that any chosen barcode sequence is not the gene 'for' speciation and that taxa have evolutionary histories. How are we to disentangle the confounding effects of reticulate population genetic processes? Using the DNA barcode data from meiofaunal surveys, here we discuss the benefits of treating the taxa defined by barcodes without reference to their correspondence to 'species', and suggest that using this non-idealist approach facilitates access to taxon groups that are not accessible to other methods of enumeration and classification. Major issues remain, in particular the methodologies for taxon discrimination in DNA barcode data.
Kher, Chandni P; Doerder, F Paul; Cooper, Jason; Ikonomi, Pranvera; Achilles-Day, Undine; Küpper, Frithjof C; Lynn, Denis H
2011-01-01
DNA barcoding using the mitochondrial cytochrome c oxidase subunit I (cox-1) gene has recently gained popularity as a tool for species identification of a variety of taxa. The primary objective of our research was to explore the efficacy of using cox-1 barcoding for species identification within the genus Tetrahymena. We first increased intraspecific sampling for Tetrahymena canadensis, Tetrahymena hegewischi, Tetrahymena pyriformis, Tetrahymena rostrata, Tetrahymena thermophila, and Tetrahymena tropicalis. Increased sampling efforts show that intraspecific sequence divergence is typically less than 1%, though it may be more in some species. The barcoding also showed that some strains might be misidentified or mislabeled. We also used cox-1 barcodes to provide species identifications for 51 unidentified environmental isolates, with a success rate of 98%. Thus, cox-1 barcoding is an invaluable tool for protistologists, especially when used in conjunction with morphological studies. © 2010 Elsevier GmbH. All rights reserved.
Two Future Ready Librarians Explore Advocacy in and outside of the Library
ERIC Educational Resources Information Center
Miller, Shannon McClintock; Ray, Mark
2018-01-01
As part of the national Future Ready Librarians initiative at the Alliance for Excellent Education, Mark Ray and Shannon McClintock Miller serve as national advocates for school library programs and librarians. Mark and Shannon began their library advocacy careers in school libraries. For eight years, Shannon was the district librarian in Van…
Deciphering amphibian diversity through DNA barcoding: chances and challenges.
Vences, Miguel; Thomas, Meike; Bonett, Ronald M; Vieites, David R
2005-10-29
Amphibians globally are in decline, yet there is still a tremendous amount of unrecognized diversity, calling for an acceleration of taxonomic exploration. This process will be greatly facilitated by a DNA barcoding system; however, the mitochondrial population structure of many amphibian species presents numerous challenges to such a standardized, single locus, approach. Here we analyse intra- and interspecific patterns of mitochondrial variation in two distantly related groups of amphibians, mantellid frogs and salamanders, to determine the promise of DNA barcoding with cytochrome oxidase subunit I (cox1) sequences in this taxon. High intraspecific cox1 divergences of 7-14% were observed (18% in one case) within the whole set of amphibian sequences analysed. These high values are not caused by particularly high substitution rates of this gene but by generally deep mitochondrial divergences within and among amphibian species. Despite these high divergences, cox1 sequences were able to correctly identify species including disparate geographic variants. The main problems with cox1 barcoding of amphibians are (i) the high variability of priming sites that hinder the application of universal primers to all species and (ii) the observed distinct overlap of intraspecific and interspecific divergence values, which implies difficulties in the definition of threshold values to identify candidate species. Common discordances between geographical signatures of mitochondrial and nuclear markers in amphibians indicate that a single-locus approach can be problematic when high accuracy of DNA barcoding is required. We suggest that a number of mitochondrial and nuclear genes may be used as DNA barcoding markers to complement cox1.
Lobo, Jorge; Ferreira, Maria S; Antunes, Ilisa C; Teixeira, Marcos A L; Borges, Luisa M S; Sousa, Ronaldo; Gomes, Pedro A; Costa, Maria Helena; Cunha, Marina R; Costa, Filipe O
2017-02-01
In this study we compared DNA barcode-suggested species boundaries with morphology-based species identifications in the amphipod fauna of the southern European Atlantic coast. DNA sequences of the cytochrome c oxidase subunit I barcode region (COI-5P) were generated for 43 morphospecies (178 specimens) collected along the Portuguese coast which, together with publicly available COI-5P sequences, produced a final dataset comprising 68 morphospecies and 295 sequences. Seventy-five BINs (Barcode Index Numbers) were assigned to these morphospecies, of which 48 were concordant (i.e., 1 BIN = 1 species), 8 were taxonomically discordant, and 19 were singletons. Twelve species had matching sequences (<2% distance) with conspecifics from distant locations (e.g., North Sea). Seven morphospecies were assigned to multiple, and highly divergent, BINs, including specimens of Corophium multisetosum (18% divergence) and Dexamine spiniventris (16% divergence), which originated from sampling locations on the west coast of Portugal (only about 36 and 250 km apart, respectively). We also found deep divergence (4%-22%) among specimens of seven species from Portugal compared to those from the North Sea and Italy. The detection of evolutionarily meaningful divergence among populations of several amphipod species from southern Europe reinforces the need for a comprehensive re-assessment of the diversity of this faunal group.
Best, Katharine; Oakes, Theres; Heather, James M.; Shawe-Taylor, John; Chain, Benny
2015-01-01
The polymerase chain reaction (PCR) is one of the most widely used techniques in molecular biology. In combination with High Throughput Sequencing (HTS), PCR is widely used to quantify transcript abundance for RNA-seq, and in the context of analysis of T and B cell receptor repertoires. In this study, we combine DNA barcoding with HTS to quantify PCR output from individual target molecules. We develop computational tools that simulate both the PCR branching process itself, and the subsequent subsampling which typically occurs during HTS sequencing. We explore the influence of different types of heterogeneity on sequencing output, and compare them to experimental results where the efficiency of amplification is measured by barcodes uniquely identifying each molecule of starting template. Our results demonstrate that the PCR process introduces substantial amplification heterogeneity, independent of primer sequence and bulk experimental conditions. This heterogeneity can be attributed both to inherited differences between different template DNA molecules, and the inherent stochasticity of the PCR process. The results demonstrate that PCR heterogeneity arises even when reaction and substrate conditions are kept as constant as possible, and therefore single molecule barcoding is essential in order to derive reproducible quantitative results from any protocol combining PCR with HTS. PMID:26459131
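The two processes the abstract above describes simulating, a PCR branching process followed by subsampling at the sequencing step, can be illustrated with a small stochastic sketch. It is deliberately simplified: every molecule is assumed to be copied with the same fixed probability `efficiency` in each cycle, so it models only the inherent stochasticity and not the inherited template-to-template differences the authors also report. The names `amplify` and `subsample` are hypothetical, and this is not the authors' tool.

```python
import random


def amplify(n_templates: int, cycles: int, efficiency: float,
            rng: random.Random) -> list:
    """Simulate PCR as a branching process, one lineage per barcoded template.

    In each cycle every existing copy is duplicated with probability
    `efficiency`. Returns the final copy number of each lineage.
    """
    counts = [1] * n_templates
    for _ in range(cycles):
        counts = [c + sum(rng.random() < efficiency for _ in range(c))
                  for c in counts]
    return counts


def subsample(counts: list, n_reads: int, rng: random.Random) -> list:
    """Draw reads without replacement from the pooled amplified molecules."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    reads = rng.sample(pool, min(n_reads, len(pool)))
    observed = [0] * len(counts)
    for i in reads:
        observed[i] += 1
    return observed


if __name__ == "__main__":
    rng = random.Random(0)
    copies = amplify(n_templates=20, cycles=8, efficiency=0.8, rng=rng)
    print("copies per barcode:", copies)
    print("reads per barcode: ", subsample(copies, 200, rng))
```

Even with identical starting conditions, copy numbers per barcode differ between lineages because early-cycle failures are inherited by all descendants, and the sequencing subsample adds further noise on top.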
DNA Barcoding analysis of seafood accuracy in Washington, D.C. restaurants
Stern, David B.; Castro Nallar, Eduardo; Rathod, Jason
2017-01-01
In Washington D.C., recent legislation authorizes citizens to test if products are properly represented and, if they are not, to bring a lawsuit for the benefit of the general public. Recent studies revealing the widespread phenomenon of seafood substitution across the United States make it a fertile area for consumer protection testing. DNA barcoding provides an accurate and cost-effective way to perform these tests, especially when tissue alone is available making species identification based on morphology impossible. In this study, we sequenced the 5′ barcoding region of the Cytochrome Oxidase I gene for 12 samples of vertebrate and invertebrate food items across six restaurants in Washington, D.C. and used multiple analytical methods to make identifications. These samples included several ambiguous menu listings, sequences with little genetic variation among closely related species and one sequence with no available reference sequence. Despite these challenges, we were able to make identifications for all samples and found that 33% were potentially mislabeled. While we found a high degree of mislabeling, the errors involved closely related species and we did not identify egregious substitutions as have been found in other cities. This study highlights the efficacy of DNA barcoding and robust analyses in identifying seafood items for consumer protection. PMID:28462038
Nuclear genomes distinguish cryptic species suggested by their DNA barcodes and ecology
Janzen, Daniel H.; Burns, John M.; Cong, Qian; Hallwachs, Winnie; Dapkey, Tanya; Manjunath, Ramya; Hajibabaei, Mehrdad; Hebert, Paul D. N.; Grishin, Nick V.
2017-01-01
DNA sequencing brings another dimension to exploration of biodiversity, and large-scale mitochondrial DNA cytochrome oxidase I barcoding has exposed many potential new cryptic species. Here, we add complete nuclear genome sequencing to DNA barcoding, ecological distribution, natural history, and subtleties of adult color pattern and size to show that a widespread neotropical skipper butterfly known as Udranomia kikkawai (Weeks) comprises three different species in Costa Rica. Full-length barcodes obtained from all three century-old Venezuelan syntypes of U. kikkawai show that it is a rainforest species occurring from Costa Rica to Brazil. The two new species are Udranomia sallydaleyae Burns, a dry forest denizen occurring from Costa Rica to Mexico, and Udranomia tomdaleyi Burns, which occupies the junction between the rainforest and dry forest and currently is known only from Costa Rica. Whereas the three species are cryptic, differing but slightly in appearance, their complete nuclear genomes totaling 15 million aligned positions reveal significant differences consistent with their 0.00065-Mbp (million base pair) mitochondrial barcodes and their ecological diversification. DNA barcoding of tropical insects reared by a massive inventory suggests that the presence of cryptic species is a widespread phenomenon and that further studies will substantially increase current estimates of insect species richness. PMID:28716927
González-Vaquero, Rocío Ana; Roig-Alsina, Arturo; Packer, Laurence
2016-10-01
Special care is needed in the delimitation and identification of halictid bee species, which are renowned for being morphologically monotonous. Corynura Spinola and Halictillus Moure (Halictidae: Augochlorini) contain species that are key elements in southern South American ecosystems. These bees are very difficult to identify due to close morphological similarity among species and high sexual dimorphism. We analyzed 170 barcode-compliant COI sequences from 19 species. DNA barcodes were useful to confirm gender associations and to detect two new cryptic species. Interspecific distances were significantly higher than those reported for other bees. Maximum intraspecific divergence was less than 1% in 14 species. Barcode index numbers (BINs) were useful to identify putative species that need further study. More than one BIN was assigned to five species. The name Corynura patagonica (Cockerell) probably refers to two cryptic species. The results suggest that Corynura and Halictillus species can be identified using DNA barcodes. The sequences of the species included in this study can be used as a reference to assess the identification of unknown specimens. This study provides additional support for the use of DNA barcodes in bee taxonomy and the identification of specimens, which is particularly relevant in insects of ecological importance such as pollinators.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes
Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar
2012-01-01
DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP on the Linux operating platform to (a) store and manage the acquired records, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS will be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. Availability: The database is available for free at http://mail.nbfgr.res.in/fbis/ PMID:22715304
DNA barcoding the floras of biodiversity hotspots
Lahaye, Renaud; van der Bank, Michelle; Bogarin, Diego; Warner, Jorge; Pupulin, Franco; Gigot, Guillaume; Maurin, Olivier; Duthoit, Sylvie; Barraclough, Timothy G.; Savolainen, Vincent
2008-01-01
DNA barcoding is a technique in which species identification is performed by using DNA sequences from a small fragment of the genome, with the aim of contributing to a wide range of ecological and conservation studies in which traditional taxonomic identification is not practical. DNA barcoding is well established in animals, but there is not yet any universally accepted barcode for plants. Here, we undertook intensive field collections in two biodiversity hotspots (Mesoamerica and southern Africa). Using >1,600 samples, we compared eight potential barcodes. Going beyond previous plant studies, we assessed to what extent a “DNA barcoding gap” is present between intra- and interspecific variations, using multiple accessions per species. Given its adequate rate of variation, easy amplification, and alignment, we identified a portion of the plastid matK gene as a universal DNA barcode for flowering plants. Critically, we further demonstrate the applicability of DNA barcoding for biodiversity inventories. In addition, analyzing >1,000 species of Mesoamerican orchids, DNA barcoding with matK alone reveals cryptic species and proves useful in identifying species listed in Convention on International Trade of Endangered Species (CITES) appendixes. PMID:18258745
DNA barcode reference data for the Korean herpetofauna and their applications.
Jeong, Tae Jin; Jun, Jumin; Han, Sanghoon; Kim, Hyun Tae; Oh, Kyunghee; Kwak, Myounghai
2013-11-01
Recently, amphibians and reptiles have drawn attention because of declines in species and populations caused mainly by habitat loss, overexploitation and climate change. This study constructed a DNA barcode database for the Korean herpetofauna, including all the recorded amphibians and 68% of the recorded reptiles, to provide a useful, standardized tool for species identification in monitoring and management. A total of 103 individuals from 18 amphibian and 17 reptile species were used to generate barcode sequences using partial sequences of the mitochondrial cytochrome c oxidase subunit I (COI) gene and to compare it with other suggested barcode loci. Comparing 16S rRNA, cytochrome b (Cytb) and COI for amphibians and 12S rRNA, Cytb and COI for reptiles, our results revealed that COI is better than the other markers in terms of a high level of sequence variation without length variation and moderate amplification success. Although the COI marker had no clear barcoding gap because of the high level of intraspecific variation, all of the analysed individuals from the same species clustered together in a neighbour-joining tree. High intraspecific variation suggests the possibility of cryptic species. Finally, using this database, confiscated snakes were identified as Elaphe schrenckii, designated as endangered in Korea and a food contaminant was identified as the lizard Takydromus amurensis. © 2013 John Wiley & Sons Ltd.
Barcoding Neotropical birds: assessing the impact of nonmonophyly in a highly diverse group.
Chaves, Bárbara R N; Chaves, Anderson V; Nascimento, Augusto C A; Chevitarese, Juliana; Vasconcelos, Marcelo F; Santos, Fabrício R
2015-07-01
In this study, we verified the power of DNA barcodes to discriminate Neotropical birds using Bayesian tree reconstructions of a total of 7404 COI sequences from 1521 species, including 55 Brazilian species with no previous barcode data. We found that 10.4% of species were nonmonophyletic, most likely due to inaccurate taxonomy, incomplete lineage sorting or hybridization. At least 0.5% of the sequences (2.5% of the sampled species) retrieved from GenBank were associated with database errors (poor-quality sequences, NuMTs, misidentification or unnoticed hybridization). Paraphyletic species (5.8% of the total) can be related to rapid speciation events leading to nonreciprocal monophyly between recently diverged sister species, or to absence of synapomorphies in the small COI region analysed. We also performed two series of genetic distance calculations under the K2P model for intraspecific and interspecific comparisons: the first included all COI sequences, and the second included only monophyletic taxa observed in the Bayesian trees. As expected, the mean and median pairwise distances were smaller for intraspecific than for interspecific comparisons. However, there was no precise 'barcode gap', which was shown to be larger in the monophyletic taxon data set than for the data from all species, as expected. Our results indicated that although database errors may explain some of the difficulties in the species discrimination of Neotropical birds, distance-based barcode assignment may also be compromised because of the high diversity of bird species and more complex speciation events in the Neotropics. © 2014 John Wiley & Sons Ltd.
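The K2P (Kimura two-parameter) distance used above has a standard closed form: with P the proportion of transitions and Q the proportion of transversions between two aligned sequences, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). A minimal sketch follows; the function name is hypothetical, and skipping sites with gaps or ambiguity codes is an assumption of this sketch, not a detail from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


# One transition in ten sites: P = 0.1, Q = 0.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

For small divergences the corrected distance is close to the raw proportion of differing sites; the correction grows as substitutions accumulate.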
Ashfaq, Muhammad; Asif, Muhammad; Anjum, Zahid Iqbal; Zafar, Yusuf
2013-07-01
Although two plastid regions have been adopted as the standard markers for plant DNA barcoding, their limited resolution has provoked the consideration of other gene regions, especially in taxonomically diverse genera. The genus Gossypium (cotton) includes eight diploid genome groups (A-G, and K) and five allotetraploid species which are difficult to discriminate morphologically. In this study, we tested the effectiveness of three widely used markers (matK, rbcL, and ITS2) in the discrimination of 20 diploid and five tetraploid species of cotton. Sequences were analysed locus-wise and in combinations to determine the most effective strategy for species identification. Sequence recovery was high, ranging from 92% to 100% with mean pairwise interspecific distance highest for ITS2 (3.68%) and lowest for rbcL (0.43%). At a 0.5% threshold, the combination of matK+ITS2 produced the greatest number of species clusters. Based on 'best match' analysis, the combination of matK+ITS2 was best, while based on 'all species barcodes' analysis, ITS2 gave the highest percentage of correct species identifications (98.93%). The combination of sequences for all three markers produced the best resolved tree. The disparity index test based on matK+rbcL+ITS2 was significant (P < 0.05) for a higher number of species pairs than the individual gene sequences. Although all three barcodes separated the species with respect to their genome type, no single combination of barcodes could differentiate all the Gossypium species, and tetraploid species were particularly difficult. © 2013 John Wiley & Sons Ltd.
Molecular Identification of Commercialized Medicinal Plants in Southern Morocco
Krüger, Åsa; Rydberg, Anders; Abbad, Abdelaziz; Björk, Lars; Martin, Gary
2012-01-01
Background Medicinal plant trade is important for local livelihoods. However, many medicinal plants are difficult to identify when they are sold as roots, powders or bark. DNA barcoding involves using a short, agreed-upon region of a genome as a unique identifier for species, ideally as a global standard. Research Question What is the functionality, efficacy and accuracy of the use of barcoding for identifying root material, using medicinal plant roots sold by herbalists in Marrakech, Morocco, as a test dataset? Methodology In total, 111 root samples were sequenced for four proposed barcode regions rpoC1, psbA-trnH, matK and ITS. Sequences were searched against a tailored reference database of Moroccan medicinal plants and their closest relatives using BLAST and Blastclust, and through inference of RAxML phylograms of the aligned market and reference samples. Principal Findings Sequencing success was high for rpoC1, psbA-trnH, and ITS, but low for matK. Searches using rpoC1 alone resulted in a number of ambiguous identifications, indicating insufficient DNA variation for accurate species-level identification. Combining rpoC1, psbA-trnH and ITS allowed the majority of the market samples to be identified to genus level. For a minority of the market samples, the barcoding identification differed significantly from previous hypotheses based on the vernacular names. Conclusions/Significance Endemic plant species are commercialized in Marrakech. Adulteration is common and this may indicate that the products are becoming locally endangered. Nevertheless the majority of the traded roots belong to species that are common and not known to be endangered. A significant conclusion from our results is that unknown samples are more difficult to identify than earlier suggested, especially if the reference sequences were obtained from different populations. 
A global barcoding database should therefore contain sequences from different populations of the same species to assure the reference sequences characterize the species throughout its distributional range. PMID:22761800
Barcoding Fauna Bavarica: 78% of the Neuropterida Fauna Barcoded!
Morinière, Jérome; Hendrich, Lars; Hausmann, Axel; Hebert, Paul; Haszprunar, Gerhard; Gruppe, Axel
2014-01-01
This publication provides the first comprehensive DNA barcode data set for the Neuropterida of Central Europe, including 80 of the 102 species (78%) recorded from Bavaria (Germany) and three other species from nearby regions (Austria, France and the UK). Although the 286 specimens analyzed had a heterogeneous conservation history (60% dried; 30% in 80% EtOH; 10% fresh specimens in 95% EtOH), 237 (83%) generated a DNA barcode. Eleven species (13%) shared a BIN, but three of these taxa could be discriminated through barcodes. Four pairs of closely allied species shared barcodes including Chrysoperla pallida Henry et al., 2002 and C. lucasina Lacroix, 1912; Wesmaelius concinnus (Stephens, 1836) and W. quadrifasciatus (Reuter, 1894); Hemerobius handschini Tjeder, 1957 and H. nitidulus Fabricius, 1777; and H. atrifrons McLachlan, 1868 and H. contumax Tjeder, 1932. Further studies are needed to test the possible synonymy of these species pairs or to determine if other genetic markers permit their discrimination. Our data highlight five cases of potential cryptic diversity within Bavarian Neuropterida: Nineta flava (Scopoli, 1763), Sympherobius pygmaeus (Rambur, 1842), Sisyra nigra (Retzius, 1783), Semidalis aleyrodiformis (Stephens, 1836) and Coniopteryx pygmaea Enderlein, 1906 are each split into two or three BINs. The present DNA barcode library not only allows the identification of adult and larval stages, but also provides valuable information for alpha-taxonomy, and for ecological and evolutionary research. PMID:25286434
Makarova, Olga; Contaldo, Nicoletta; Paltrinieri, Samanta; Kawube, Geofrey; Bertaccini, Assunta; Nicolaisen, Mogens
2012-01-01
Background Phytoplasmas are bacterial phytopathogens responsible for significant losses in agricultural production worldwide. Several molecular markers are available for identification of groups or strains of phytoplasmas. However, they often cannot be used for identification of phytoplasmas from different groups simultaneously or are too long for routine diagnostics. DNA barcoding recently emerged as a convenient tool for species identification. Here, the development of a universal DNA barcode based on the elongation factor Tu (tuf) gene for phytoplasma identification is reported. Methodology/Principal Findings We designed a new set of primers and amplified a 420–444 bp fragment of tuf from all 91 phytoplasma strains tested (16S rRNA groups -I through -VII, -IX through -XII, -XV, and -XX). Comparison of NJ trees constructed from the tuf barcode and a 1.2 kbp fragment of the 16S ribosomal gene revealed that the tuf tree is highly congruent with the 16S rRNA tree and had higher inter- and intra-group sequence divergence. Mean K2P inter-/intra-group divergences of the tuf barcode did not overlap and had approximately one order of magnitude difference for most groups, suggesting the presence of a DNA barcoding gap. The use of the tuf barcode allowed separation of main ribosomal groups and most of their subgroups. Phytoplasma tuf barcodes were deposited in the NCBI GenBank and Q-bank databases. Conclusions/Significance This study demonstrates that DNA barcoding principles can be applied for identification of phytoplasmas. Our findings suggest that the tuf barcode performs as well as or better than a 1.2 kbp fragment of the 16S rRNA gene and thus provides an easy procedure for phytoplasma identification. The obtained sequences were used to create a publicly available reference database that can be used by plant health services and researchers for online phytoplasma identification. PMID:23272216
Noise reduction in single time frame optical DNA maps
Müller, Vilhelm; Westerlund, Fredrik
2017-01-01
In optical DNA mapping technologies sequence-specific intensity variations (DNA barcodes) along stretched and stained DNA molecules are produced. These “fingerprints” of the underlying DNA sequence have a resolution on the order of one kilobase pair, and the stretching of the DNA molecules is performed by surface adsorption or nano-channel setups. A post-processing challenge for nano-channel based methods, due to local and global random movement of the DNA molecule during imaging, is how to align different time frames in order to produce reproducible time-averaged DNA barcodes. The current solutions to this challenge are computationally rather slow. With high-throughput applications in mind, we here introduce a parameter-free method for filtering a single time frame noisy barcode (snap-shot optical map), measured in a fraction of a second. By using only a single time frame barcode we circumvent the need for post-processing alignment. We demonstrate that our method is successful at providing filtered barcodes which are less noisy and more similar to time averaged barcodes. The method is based on the application of a low-pass filter on a single noisy barcode using the width of the Point Spread Function of the system as a unique, and known, filtering parameter. We find that after applying our method, the Pearson correlation coefficient (a real number in the range from -1 to 1) between the single time-frame barcode and the time average of the aligned kymograph increases significantly, roughly by 0.2 on average. By comparing to a database of more than 3000 theoretical plasmid barcodes we show that the capability to identify plasmids is improved by filtering single time-frame barcodes compared to the unfiltered analogues. Since snap-shot experiments and computational time using our method both take less than a second, this study opens the way to high-throughput optical DNA mapping with improved reproducibility. PMID:28640821
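The filtering idea described above (low-pass filtering a single noisy barcode, then comparing it with a reference by Pearson correlation) can be sketched as follows. This is only an illustration under stated assumptions: a Gaussian kernel stands in for the low-pass filter, its width `sigma` stands in for the Point Spread Function width in pixels, and the sinusoidal "reference" and the noise level are synthetic. None of these names or values come from the paper.

```python
import math
import random


def gaussian_kernel(sigma, truncate=4.0):
    """Normalized Gaussian weights out to `truncate` standard deviations."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def low_pass(signal, sigma):
    """Smooth a 1-D signal; near the edges the kernel is truncated and renormalized."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(signal):
                acc += w * signal[j]
                norm += w
        out.append(acc / norm)
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic stand-ins (assumptions): a smooth reference barcode and one noisy frame.
rng = random.Random(0)
reference = [math.sin(2 * math.pi * i / 50) for i in range(200)]
single_frame = [v + rng.gauss(0, 0.5) for v in reference]
filtered = low_pass(single_frame, sigma=3.0)

r_raw = pearson(single_frame, reference)
r_filtered = pearson(filtered, reference)
print(r_raw, r_filtered)
```

Fixing the kernel width from a known instrument property, rather than tuning it per barcode, is what would make such a filter parameter-free in the sense the abstract describes.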
Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong
2012-01-01
Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.
Walsh, Neville G.; Cantrill, David J.; Holmes, Gareth D.; Murphy, Daniel J.
2017-01-01
In Australia, Poaceae tribe Poeae are represented by 19 genera and 99 species, including economically and environmentally important native and introduced pasture grasses [e.g. Poa (Tussock-grasses) and Lolium (Ryegrasses)]. We used this tribe, which are well characterised in regards to morphological diversity and evolutionary relationships, to test the efficacy of DNA barcoding methods. A reference library was generated that included 93.9% of species in Australia (408 individuals, x̄ = 3.7 individuals per species). Molecular data were generated for official plant barcoding markers (rbcL, matK) and the nuclear ribosomal internal transcribed spacer (ITS) region. We investigated accuracy of specimen identifications using distance- (nearest neighbour, best-close match, and threshold identification) and tree-based (maximum likelihood, Bayesian inference) methods and applied species discovery methods (automatic barcode gap discovery, Poisson tree processes) based on molecular data to assess congruence with recognised species. Across all methods, success rate for specimen identification of genera was high (87.5–99.5%) and of species was low (25.6–44.6%). Distance- and tree-based methods were equally ineffective in providing accurate identifications for specimens to species rank (26.1–44.6% and 25.6–31.3%, respectively). The ITS marker achieved the highest success rate for specimen identification at both generic and species ranks across the majority of methods. For distance-based analyses the best-close match method provided the greatest accuracy for identification of individuals with a high percentage of “correct” (97.6%) and a low percentage of “incorrect” (0.3%) generic identifications, based on the ITS marker. For tribe Poeae, and likely for other grass lineages, sequence data in the standard DNA barcode markers are not variable enough for accurate identification of specimens to species rank. 
For recently diverged grass species similar challenges are encountered in the application of genetic and morphological data to species delimitations, with taxonomic signal limited by extensive infra-specific variation and shared polymorphisms among species in both data types. PMID:29084279
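The best-close match approach named above can be sketched roughly as follows: a query is assigned to the species of its nearest reference sequence, but only if that distance falls within a threshold, and the result is flagged as ambiguous when equally close references belong to different species. This is a simplified illustration; the p-distance, the threshold value, the function names and the toy sequences are all assumptions and do not reproduce the study's analysis.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_close_match(query, references, threshold):
    """Classify a query against (species, sequence) reference pairs.

    Returns ("match", species), ("ambiguous", None) or ("no match", None).
    """
    scored = [(p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in scored)
    if best > threshold:
        return "no match", None  # nearest reference is too distant
    nearest = {species for d, species in scored if d == best}
    if len(nearest) == 1:
        return "match", nearest.pop()
    return "ambiguous", None  # equally close references from several species


# Toy reference library (hypothetical names and sequences).
references = [
    ("Poa_a", "ACGTACGTAC"),
    ("Poa_b", "ACGTACGTTT"),
    ("Poa_d", "ACGTACGTTT"),  # shares a barcode with Poa_b
    ("Lolium_c", "TTGTACCTAC"),
]

print(best_close_match("ACGTACGTAC", references, threshold=0.1))
print(best_close_match("ACGTACGTTT", references, threshold=0.1))
print(best_close_match("GGGGGGGGGG", references, threshold=0.1))
```

The ambiguous case mirrors the situation the abstract reports for recently diverged grasses, where species share barcode sequences and cannot be separated by distance alone.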
Birch, Joanne L; Walsh, Neville G; Cantrill, David J; Holmes, Gareth D; Murphy, Daniel J
2017-01-01
In Australia, Poaceae tribe Poeae are represented by 19 genera and 99 species, including economically and environmentally important native and introduced pasture grasses [e.g. Poa (Tussock-grasses) and Lolium (Ryegrasses)]. We used this tribe, which are well characterised in regards to morphological diversity and evolutionary relationships, to test the efficacy of DNA barcoding methods. A reference library was generated that included 93.9% of species in Australia (408 individuals, x̄ = 3.7 individuals per species). Molecular data were generated for official plant barcoding markers (rbcL, matK) and the nuclear ribosomal internal transcribed spacer (ITS) region. We investigated accuracy of specimen identifications using distance- (nearest neighbour, best-close match, and threshold identification) and tree-based (maximum likelihood, Bayesian inference) methods and applied species discovery methods (automatic barcode gap discovery, Poisson tree processes) based on molecular data to assess congruence with recognised species. Across all methods, success rate for specimen identification of genera was high (87.5-99.5%) and of species was low (25.6-44.6%). Distance- and tree-based methods were equally ineffective in providing accurate identifications for specimens to species rank (26.1-44.6% and 25.6-31.3%, respectively). The ITS marker achieved the highest success rate for specimen identification at both generic and species ranks across the majority of methods. For distance-based analyses the best-close match method provided the greatest accuracy for identification of individuals with a high percentage of "correct" (97.6%) and a low percentage of "incorrect" (0.3%) generic identifications, based on the ITS marker. For tribe Poeae, and likely for other grass lineages, sequence data in the standard DNA barcode markers are not variable enough for accurate identification of specimens to species rank. 
For recently diverged grass species similar challenges are encountered in the application of genetic and morphological data to species delimitations, with taxonomic signal limited by extensive infra-specific variation and shared polymorphisms among species in both data types.
Company profile: Complete Genomics Inc.
Reid, Clifford
2011-02-01
Complete Genomics Inc. is a life sciences company that focuses on complete human genome sequencing. It is taking a completely different approach to DNA sequencing than other companies in the industry. Rather than building a general-purpose platform for sequencing all organisms and all applications, it has focused on a single application - complete human genome sequencing. The company's Complete Genomics Analysis Platform (CGA™ Platform) comprises an integrated package of biochemistry, instrumentation and software that sequences human genomes at the highest quality, lowest cost and largest scale available. Complete Genomics offers a turnkey service that enables customers to outsource their human genome sequencing to the company's genome sequencing center in Mountain View, CA, USA. Customers send in their DNA samples, the company does all the library preparation, DNA sequencing, assembly and variant analysis, and customers receive research-ready data that they can use for biological discovery.
Defining operational taxonomic units using DNA barcode data
Blaxter, Mark; Mann, Jenna; Chapman, Tom; Thomas, Fran; Whitton, Claire; Floyd, Robin; Abebe, Eyualem
2005-01-01
The scale of diversity of life on this planet is a significant challenge for any scientific programme hoping to produce a complete catalogue, whatever means is used. For DNA barcoding studies, this difficulty is compounded by the realization that any chosen barcode sequence is not the gene ‘for’ speciation and that taxa have evolutionary histories. How are we to disentangle the confounding effects of reticulate population genetic processes? Using the DNA barcode data from meiofaunal surveys, here we discuss the benefits of treating the taxa defined by barcodes without reference to their correspondence to ‘species’, and suggest that using this non-idealist approach facilitates access to taxon groups that are not accessible to other methods of enumeration and classification. Major issues remain, in particular the methodologies for taxon discrimination in DNA barcode data. PMID:16214751
The unholy trinity: taxonomy, species delimitation and DNA barcoding
DeSalle, Rob; Egan, Mary G; Siddall, Mark
2005-01-01
Recent excitement over the development of an initiative to generate DNA sequences for all named species on the planet has in our opinion generated two major areas of contention as to how this ‘DNA barcoding’ initiative should proceed. It is critical that these two issues are clarified and resolved, before the use of DNA as a tool for taxonomy and species delimitation can be universalized. The first issue concerns how DNA data are to be used in the context of this initiative; this is the DNA barcode reader problem (or barcoder problem). Currently, many of the published studies under this initiative have used tree building methods and more precisely distance approaches to the construction of the trees that are used to place certain DNA sequences into a taxonomic context. The second problem involves the reaction of the taxonomic community to the directives of the ‘DNA barcoding’ initiative. This issue is extremely important in that the classical taxonomic approach and the DNA approach will need to be reconciled in order for the ‘DNA barcoding’ initiative to proceed with any kind of community acceptance. In fact, we feel that DNA barcoding is a misnomer. Our preference is for the title of the London meetings—Barcoding Life. In this paper we discuss these two concerns generated around the DNA barcoding initiative and attempt to present a phylogenetic systematic framework for an improved barcoder as well as a taxonomic framework for interweaving classical taxonomy with the goals of ‘DNA barcoding’. PMID:16214748
Zhang, Jian-Qiang; Meng, Shi-Yong; Wen, Jun; Rao, Guang-Yuan
2015-01-01
DNA barcoding, the identification of species using one or a few short standardized DNA sequences, is an important complement to traditional taxonomy. However, there are particular challenges for barcoding plants, especially for species with complex evolutionary histories. We herein evaluated the utility of five candidate sequences - rbcL, matK, trnH-psbA, trnL-F and the internal transcribed spacer (ITS) - for barcoding Rhodiola species, a group of high-altitude plants frequently used as adaptogens, hemostatics and tonics in traditional Tibetan medicine. Rhodiola was suggested to have diversified rapidly recently. The genus is thus a good model for testing DNA barcoding strategies for recently diversified medicinal plants. This study analyzed 189 accessions, representing 47 of the 55 recognized Rhodiola species in the Flora of China treatment. Based on intraspecific and interspecific divergence and degree of monophyly statistics, ITS was the best single-locus barcode, resolving 66% of the Rhodiola species. The core combination rbcL+matK resolved only 40.4% of them. Unsurprisingly, the combined use of all five loci provided the highest discrimination power, resolving 80.9% of the species. However, this is weaker than the discrimination power generally reported in barcoding studies of other plant taxa. The observed complications may be due to the recent diversification, incomplete lineage sorting and reticulate evolution of the genus. These processes are common features of numerous plant groups in the high-altitude regions of the Qinghai-Tibetan Plateau.
Ashfaq, Muhammad; Hebert, Paul D N; Mirza, M Sajjad; Khan, Arif M; Mansoor, Shahid; Shah, Ghulam S; Zafar, Yusuf
2014-01-01
Although whiteflies (Bemisia tabaci complex) are an important pest of cotton in Pakistan, their taxonomic diversity is poorly understood. As DNA barcoding is an effective tool for resolving species complexes and analyzing species distributions, we used this approach to analyze genetic diversity in the B. tabaci complex and map the distribution of B. tabaci lineages in cotton growing areas of Pakistan. Sequence diversity in the DNA barcode region (mtCOI-5') was examined in 593 whiteflies from Pakistan to determine the number of whitefly species and their distributions in the cotton-growing areas of Punjab and Sindh provinces. These new records were integrated with another 173 barcode sequences for B. tabaci, most from India, to better understand regional whitefly diversity. The Barcode Index Number (BIN) System assigned the 766 sequences to 15 BINs, including nine from Pakistan. Representative specimens of each Pakistan BIN were analyzed for mtCOI-3' to allow their assignment to one of the putative species in the B. tabaci complex recognized on the basis of sequence variation in this gene region. This analysis revealed the presence of Asia II 1, Middle East-Asia Minor 1, Asia 1, Asia II 5, Asia II 7, and a new lineage "Pakistan". The first two taxa were found in both Punjab and Sindh, but Asia 1 was only detected in Sindh, while Asia II 5, Asia II 7 and "Pakistan" were only present in Punjab. The haplotype networks showed that most haplotypes of Asia II 1, a species implicated in transmission of the cotton leaf curl virus, occurred in both India and Pakistan. DNA barcodes successfully discriminated cryptic species in the B. tabaci complex. The dominant haplotypes in the B. tabaci complex were shared by India and Pakistan. Asia II 1 was previously restricted to Punjab, but is now the dominant lineage in southern Sindh; its southward spread may have serious implications for cotton plantations in this region.
Barcoding of fresh water fishes from Pakistan.
Karim, Asma; Iqbal, Asad; Akhtar, Rehan; Rizwan, Muhammad; Amar, Ali; Qamar, Usman; Jahan, Shah
2016-07-01
DNA barcoding is a taxonomic method that uses small genetic markers in organisms' mitochondrial DNA (mtDNA) for identification of particular species. It uses sequence diversity in a 658-base pair fragment near the 5' end of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene as a tool for species identification. DNA barcoding is a more accurate and reliable method than morphological identification. It is equally useful in juvenile and adult stages of fishes. The present study was conducted to genetically identify three farmed fish species of Pakistan (Cyprinus carpio, Cirrhinus mrigala, and Ctenopharyngodon idella). All of them belong to the family Cyprinidae. The CO1 gene was amplified. PCR products were sequenced and analyzed with bioinformatics software. Conspecific, congeneric, and confamilial K2P nucleotide divergence was estimated. From these findings, it was concluded that the CO1 gene sequence may serve as a milestone for the identification of related species at the molecular level.
Dentinger, Bryn T M; Margaritescu, Simona; Moncalvo, Jean-Marc
2010-07-01
We present two methods for DNA extraction from fresh and dried mushrooms that are adaptable to high-throughput sequencing initiatives, such as DNA barcoding. Our results show that these protocols yield ∼85% sequencing success from recently collected materials. Tests with both recent (<2 years) and older (>100 years) specimens reveal that older collections have low success rates and may be an inefficient resource for populating a barcode database. However, our method of extracting DNA from herbarium samples using a small amount of tissue is reliable and could be used for important historical specimens. The application of these protocols greatly reduces the time, and therefore the cost, of generating DNA sequences from mushrooms and other fungi compared with traditional extraction methods. The efficiency of these methods illustrates that standardization and streamlining of sample processing should be shifted from the laboratory to the field. © 2009 Blackwell Publishing Ltd.
ERIC Educational Resources Information Center
Celano, Donna C.; Neuman, Susan B.
2016-01-01
Because English language learners enter kindergarten at a distinct disadvantage, Celano and Neuman examine the role public libraries can play in rallying around these young children to better prepare them for school. The authors document a new program called Every Child Ready to Read, which recently launched in 4,000 public libraries across the…
Expanding Library Support of Faculty Research: Exploring Readiness
ERIC Educational Resources Information Center
Brown, Jeanne M.; Tucker, Cory
2013-01-01
The changing research and information environment requires a reexamination of library support for research. This study considers research-related attitudes and practices to identify elements indicating readiness or resistance to expanding the library's role in research support. A survey of faculty conducted at the University of Nevada Las Vegas…
bold: The Barcode of Life Data System (http://www.barcodinglife.org)
RATNASINGHAM, SUJEEVAN; HEBERT, PAUL D N
2007-01-01
The Barcode of Life Data System (bold) is an informatics workbench aiding the acquisition, storage, analysis and publication of DNA barcode records. By assembling molecular, morphological and distributional data, it bridges a traditional bioinformatics chasm. bold is freely available to any researcher with interests in DNA barcoding. By providing specialized services, it aids the assembly of records that meet the standards needed to gain BARCODE designation in the global sequence databases. Because of its web-based delivery and flexible data security model, it is also well positioned to support projects that involve broad research alliances. This paper provides a brief introduction to the key elements of bold, discusses their functional capabilities, and concludes by examining computational resources and future prospects. PMID:18784790
Dou, Rong-kun; Bi, Zhen-fei; Bai, Rui-xue; Ren, Yao-yao; Tan, Rui; Song, Liang-ke; Li, Di-qiang; Mao, Can-quan
2015-04-01
The study aimed to ensure the quality and safety of medicinal plants by using ITS2 DNA barcode technology to identify Corydalis boweri, Meconopsis horridula and their closely related species. DNA was extracted from 13 herb samples, including C. boweri and M. horridula, from Lhasa, Tibet, and the ITS region was amplified by PCR and sequenced. The 5.8S and 28S regions were removed from the 71 ITS2 sequences, both assembled and downloaded from the web. Multiple sequence alignment was completed, the intraspecific and interspecific genetic distances were calculated with MEGA 5.0, and neighbor-joining phylogenetic trees were constructed. We also predicted the ITS2 secondary structure of C. boweri, M. horridula and their closely related species. The results showed that ITS2 as a DNA barcode was able to identify C. boweri, M. horridula and their closely related species effectively. The established ITS2 barcode-based method provides a regular and safe detection technology for identifying C. boweri, M. horridula and their closely related species, adulterants and counterfeits, in order to ensure their quality control, safe medication, and reasonable development and utilization.
Geographically widespread swordfish barcode stock identification: a case study of its application.
Pappalardo, Anna Maria; Guarino, Francesca; Reina, Simona; Messina, Angela; De Pinto, Vito
2011-01-01
The swordfish (Xiphias gladius) is a cosmopolitan large pelagic fish inhabiting temperate and tropical waters and it is a target species for fisheries all around the world. The present study investigated the ability of COI barcoding to reliably identify swordfish and particularly specific stocks of this commercially important species. We applied the classical DNA barcoding technology, upon a 682 bp segment of COI, and compared swordfish sequences from different geographical sources (Atlantic, Indian Oceans and Mediterranean Sea). The sequences of the 5' hyper-variable fragment of the control region (5'dloop) were also used to validate the efficacy of COI as a stock-specific marker. This information was successfully applied to the discrimination of unknown samples from the market, detecting in some cases mislabeled seafood products. The NJ distance-based phenogram (K2P model) obtained with COI sequences allowed us to correlate the swordfish haplotypes to the different geographical stocks. Similar results were obtained with 5'dloop. Our preliminary data in swordfish Xiphias gladius confirm that Cytochrome Oxidase I can be proposed as an efficient species-specific marker that also has the potential to assign geographical provenance. This information might speed up sample analysis in commercial applications of barcoding.
Molecular Barcoding of Aquatic Oligochaetes: Implications for Biomonitoring
Vivien, Régis; Wyler, Sofia; Lafont, Michel; Pawlowski, Jan
2015-01-01
Aquatic oligochaetes are well-recognized bioindicators of the quality of sediments and water in watercourses and lakes. However, the difficult taxonomic determination based on morphological features compromises their more common use in eco-diagnostic analyses. To overcome this limitation, we investigated molecular barcodes as an identification tool for a broad range of taxa of aquatic oligochaetes. We report 185 COI and 52 ITS2 rDNA sequences for specimens collected in Switzerland and belonging to the families Naididae, Lumbriculidae, Enchytraeidae and Lumbricidae. Phylogenetic analyses allowed distinguishing 41 lineages separated by more than 10 % divergence in COI sequences. The lineage distinction was confirmed by the Automatic Barcode Gap Discovery (ABGD) method and by ITS2 data. Our results showed that morphological identification underestimates the oligochaete diversity. Only 26 of the lineages could be assigned to morphospecies, of which seven were sequenced for the first time. Several cryptic species were detected within common morphospecies. Many juvenile specimens that could not be assigned morphologically were placed after genetic analysis. Our study showed that COI barcodes performed very well as species identifiers in aquatic oligochaetes. Their easy amplification and good taxonomic resolution might help promote aquatic oligochaetes as bioindicators for next generation environmental DNA biomonitoring of aquatic ecosystems. PMID:25856230
Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space
Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.
2013-01-01
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
Combinatorial pooling enables selective sequencing of the barley gene space.
Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J
2013-04-01
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
NASA Astrophysics Data System (ADS)
McFadden, C. S.; Brown, A. S.; Brayton, C.; Hunt, C. B.; van Ofwegen, L. P.
2014-06-01
The application of DNA barcoding to anthozoan cnidarians has been hindered by their slow rates of mitochondrial gene evolution and the failure to identify alternative molecular markers that distinguish species reliably. Among octocorals, however, multilocus barcodes can distinguish up to 70 % of morphospecies, thereby facilitating the identification of species that are ecologically important but still very poorly known taxonomically. We tested the ability of these imperfect DNA barcodes to estimate species richness in a biodiversity survey of the shallow-water octocoral fauna of Palau using multilocus (COI, mtMutS, 28S rDNA) sequences obtained from 305 specimens representing 38 genera of octocorals. Numbers and identities of species were estimated independently (1) by a taxonomic expert using morphological criteria and (2) by assigning sequences to molecular operational taxonomic units (MOTUs) using predefined genetic distance thresholds. Estimated numbers of MOTUs ranged from 73 to 128 depending on the barcode and distance threshold applied, bracketing the estimated number of 118 morphospecies. Concordance between morphospecies identifications and MOTUs ranged from 71 to 75 % and differed little among barcodes. For the speciose and ecologically dominant genus Sinularia, however, we were able to identify 95 % of specimens correctly simply by comparing mtMutS sequences and in situ photographs of colonies to an existing vouchered database. Because we lack a clear understanding of species boundaries in most of these taxa, numbers of morphospecies and MOTUs are both estimates of the true species diversity, and we cannot currently determine which is more accurate. Our results suggest, however, that the two methods provide comparable estimates of species richness for shallow-water Indo-Pacific octocorals. Use of molecular barcodes in biodiversity surveys will facilitate comparisons of species richness and composition among localities and over time, data that do not currently exist for any octocoral community.
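Assigning sequences to MOTUs with a predefined distance threshold, as described in the abstract above, can be sketched as follows. The abstract does not state the clustering rule or the distance measure, so this example assumes single-linkage grouping on uncorrected p-distance; the names `p_distance` and `assign_motus` and the toy sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def assign_motus(seqs, threshold):
    """Group specimens into MOTUs by single linkage.

    seqs: dict mapping specimen name -> aligned sequence.
    Two specimens share a MOTU if they are joined by a chain of
    pairwise distances that are all <= threshold.
    """
    names = list(seqs)
    parent = {n: n for n in names}  # union-find forest

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    toy = {
        "s1": "AAAAAAAAAA",
        "s2": "AAAAAAAAAT",  # 10 % from s1
        "s3": "CCCCCCCCCC",  # 100 % from s1
    }
    print(assign_motus(toy, threshold=0.10))
    print(assign_motus(toy, threshold=0.00))
```

With this rule a looser threshold can only merge groups, never split them, which is consistent with the estimated MOTU count depending on the threshold applied.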
Zahiri, Reza; Lafontaine, J. Donald; Schmidt, B. Christian; deWaard, Jeremy R.; Zakharov, Evgeny V.; Hebert, Paul D. N.
2014-01-01
This study provides a first, comprehensive, diagnostic use of DNA barcodes for the Canadian fauna of noctuoids or “owlet” moths (Lepidoptera: Noctuoidea) based on vouchered records for 1,541 species (99.1% species coverage), and more than 30,000 sequences. When viewed from a Canada-wide perspective, DNA barcodes unambiguously discriminate 90% of the noctuoid species recognized through prior taxonomic study, and resolution reaches 95.6% when considered at a provincial scale. Barcode sharing is concentrated in certain lineages with 54% of the cases involving 1.8% of the genera. Deep intraspecific divergence exists in 7.7% of the species, but further studies are required to clarify whether these cases reflect an overlooked species complex or phylogeographic variation in a single species. Non-native species possess higher Nearest-Neighbour (NN) distances than native taxa, whereas generalist feeders have lower NN distances than those with more specialized feeding habits. We found high concordance between taxonomic names and sequence clusters delineated by the Barcode Index Number (BIN) system with 1,082 species (70%) assigned to a unique BIN. The cases of discordance involve both BIN mergers and BIN splits with 38 species falling into both categories, most likely reflecting bidirectional introgression. One fifth of the species are involved in a BIN merger reflecting the presence of 158 species sharing their barcode sequence with at least one other taxon, and 189 species with low, but diagnostic COI divergence. A very few cases (13) involved species whose members fell into both categories. Most of the remaining 140 species show a split into two or three BINs per species, while Virbia ferruginosa was divided into 16. The overall results confirm that DNA barcodes are effective for the identification of Canadian noctuoids. This study also affirms that BINs are a strong proxy for species, providing a pathway for a rapid, accurate estimation of animal diversity. PMID:24667847
Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding.
Lan, Freeman; Demaree, Benjamin; Ahmed, Noorsher; Abate, Adam R
2017-07-01
The application of single-cell genome sequencing to large cell populations has been hindered by technical challenges in isolating single cells during genome preparation. Here we present single-cell genomic sequencing (SiC-seq), which uses droplet microfluidics to isolate, fragment, and barcode the genomes of single cells, followed by Illumina sequencing of pooled DNA. We demonstrate ultra-high-throughput sequencing of >50,000 cells per run in a synthetic community of Gram-negative and Gram-positive bacteria and fungi. The sequenced genomes can be sorted in silico based on characteristic sequences. We use this approach to analyze the distributions of antibiotic-resistance genes, virulence factors, and phage sequences in microbial communities from an environmental sample. The ability to routinely sequence large populations of single cells will enable the de-convolution of genetic heterogeneity in diverse cell populations.
DNA Barcoding of Sigmodontine Rodents: Identifying Wildlife Reservoirs of Zoonoses
Müller, Lívia; Gonçalves, Gislene L.; Cordeiro-Estrela, Pedro; Marinho, Jorge R.; Althoff, Sérgio L.; Testoni, André. F.; González, Enrique M.; Freitas, Thales R. O.
2013-01-01
Species identification through DNA barcoding is a tool to be added to taxonomic procedures, once it has been validated. Applying barcoding techniques in public health would aid in the identification and correct delimitation of the distribution of rodents from the subfamily Sigmodontinae. These rodents are reservoirs of etiological agents of zoonoses including arenaviruses, hantaviruses, Chagas disease and leishmaniasis. In this study we compared distance-based and probabilistic phylogenetic inference methods to evaluate the performance of cytochrome c oxidase subunit I (COI) in sigmodontine identification. A total of 130 sequences from 21 field-trapped species (13 genera), mainly from southern Brazil, were generated and analyzed, together with 58 GenBank sequences (24 species; 10 genera). Preliminary analysis revealed a 9.5% rate of misidentifications in the field, mainly of juveniles, which were reclassified after examination of external morphological characters and chromosome numbers. Distance and model-based methods of tree reconstruction retrieved similar topologies and monophyly for most species. Kernel density estimation of the distance distribution showed a clear barcoding gap with overlapping of intraspecific and interspecific densities < 1% and 21 species with mean intraspecific distance < 2%. Five species that are reservoirs of hantaviruses could be identified through DNA barcodes. Additionally, we provide information for the description of a putative new species, as well as the first COI sequence of the recently described genus Drymoreomys. The data also indicated an expansion of the distribution of Calomys tener. We emphasize that DNA barcoding should be used in combination with other taxonomic and systematic procedures in an integrative framework and based on properly identified museum collections, to improve identification procedures, especially in epidemiological surveillance and ecological assessments. PMID:24244670
Jiang, F; Jin, Q; Liang, L; Zhang, A B; Li, Z H
2014-11-01
Fruit flies in the family Tephritidae are economically important pests that include many species complexes. DNA barcoding has gradually been verified as an effective tool for identifying species in a wide range of taxonomic groups, and there are several publications on rapid and accurate identification of fruit flies based on this technique; however, comprehensive analyses of large and new taxa for the effectiveness of DNA barcoding for fruit fly identification have been rare. In this study, we evaluated the COI barcode sequences for the diagnosis of fruit flies using 1426 sequences for 73 species of Bactrocera distributed worldwide. Tree-based [neighbour-joining (NJ)]; distance-based, such as Best Match (BM), Best Close Match (BCM) and Minimum Distance (MD); and character-based methods were used to evaluate the barcoding success rates obtained with maintaining the species complex in the data set, treating a species complex as a single taxon unit, and removing the species complex. Our results indicate that the average divergence between species was 14.04% (0.00-25.16%), whereas within a species this was 0.81% (0.00-9.71%). The existence of species complexes largely reduced the barcoding success for Tephritidae: relatively low success rates (74.4% based on BM and BCM and 84.8% based on MD) were obtained when the sequences from species complexes were included in the analysis, whereas significantly higher success rates were achieved if the species complexes were treated as a single taxon or removed from the data set: BM (98.9%), BCM (98.5%) and MD (97.5%), or BM (98.1%), BCM (97.4%) and MD (98.2%), respectively. © 2014 John Wiley & Sons Ltd.
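The distance-based criteria named above can be illustrated with a small sketch. It assumes the usual reading of these criteria: Best Match assigns a query to the species of its closest reference barcode, and Best Close Match does the same only when that closest distance is within a threshold. The abstract gives no implementation details, so the function name, the use of uncorrected p-distance, and the toy reference library are all hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def best_close_match(query, reference, threshold=None):
    """Identify a query barcode against a reference library.

    reference: list of (species_name, aligned_sequence) pairs.
    threshold: maximum accepted distance; None gives plain Best Match.
    Returns a species name, "ambiguous" (closest hits belong to more
    than one species) or "no match" (closest hit beyond the threshold).
    """
    if not reference:
        raise ValueError("reference library is empty")

    scored = [(p_distance(query, seq), species) for species, seq in reference]
    best = min(d for d, _ in scored)

    if threshold is not None and best > threshold:
        return "no match"

    closest_species = {species for d, species in scored if d == best}
    if len(closest_species) == 1:
        return closest_species.pop()
    return "ambiguous"


if __name__ == "__main__":
    library = [
        ("species A", "AAAAAAAAAA"),
        ("species A", "AAAAAAAAAC"),
        ("species B", "CCCCCCCCCC"),
    ]
    print(best_close_match("AAAAAAAAAT", library, threshold=0.2))
    print(best_close_match("GGGGGGGGGG", library, threshold=0.2))
    print(best_close_match("GGGGGGGGGG", library))
```

In a species complex, members often sit at equal or near-equal distances from a query, so the "ambiguous" outcome becomes more frequent, which fits the lower success rates reported when complexes are kept in the data set.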
Development of a single nucleotide polymorphism barcode to genotype Plasmodium vivax infections.
Baniecki, Mary Lynn; Faust, Aubrey L; Schaffner, Stephen F; Park, Daniel J; Galinsky, Kevin; Daniels, Rachel F; Hamilton, Elizabeth; Ferreira, Marcelo U; Karunaweera, Nadira D; Serre, David; Zimmerman, Peter A; Sá, Juliana M; Wellems, Thomas E; Musset, Lise; Legrand, Eric; Melnikov, Alexandre; Neafsey, Daniel E; Volkman, Sarah K; Wirth, Dyann F; Sabeti, Pardis C
2015-03-01
Plasmodium vivax, one of the five species of Plasmodium parasites that cause human malaria, is responsible for 25-40% of malaria cases worldwide. Malaria global elimination efforts will benefit from accurate and effective genotyping tools that will provide insight into the population genetics and diversity of this parasite. The recent sequencing of P. vivax isolates from South America, Africa, and Asia presents a new opportunity by uncovering thousands of novel single nucleotide polymorphisms (SNPs). Genotyping a selection of these SNPs provides a robust, low-cost method of identifying parasite infections through their unique genetic signature or barcode. Based on our experience in generating a SNP barcode for P. falciparum using High Resolution Melting (HRM), we have developed a similar tool for P. vivax. We selected globally polymorphic SNPs from available P. vivax genome sequence data that were located in putatively selectively neutral sites (i.e., intergenic, intronic, or 4-fold degenerate coding). From these candidate SNPs we defined a barcode consisting of 42 SNPs. We analyzed the performance of the 42-SNP barcode on 87 P. vivax clinical samples from parasite populations in South America (Brazil, French Guiana), Africa (Ethiopia) and Asia (Sri Lanka). We found that the P. vivax barcode is robust, as it requires only a small quantity of DNA (limit of detection 0.3 ng/μl) to yield reproducible genotype calls, and detects polymorphic genotypes with high sensitivity. The markers are informative across all clinical samples evaluated (average minor allele frequency > 0.1). Population genetic and statistical analyses show the barcode captures high degrees of population diversity and differentiates geographically distinct populations. Our 42-SNP barcode provides a robust, informative, and standardized genetic marker set that accurately identifies a genomic signature for P. vivax infections.
Development of a Single Nucleotide Polymorphism Barcode to Genotype Plasmodium vivax Infections
Baniecki, Mary Lynn; Faust, Aubrey L.; Schaffner, Stephen F.; Park, Daniel J.; Galinsky, Kevin; Daniels, Rachel F.; Hamilton, Elizabeth; Ferreira, Marcelo U.; Karunaweera, Nadira D.; Serre, David; Zimmerman, Peter A.; Sá, Juliana M.; Wellems, Thomas E.; Musset, Lise; Legrand, Eric; Melnikov, Alexandre; Neafsey, Daniel E.; Volkman, Sarah K.; Wirth, Dyann F.; Sabeti, Pardis C.
2015-01-01
Plasmodium vivax, one of the five species of Plasmodium parasites that cause human malaria, is responsible for 25–40% of malaria cases worldwide. Malaria global elimination efforts will benefit from accurate and effective genotyping tools that will provide insight into the population genetics and diversity of this parasite. The recent sequencing of P. vivax isolates from South America, Africa, and Asia presents a new opportunity by uncovering thousands of novel single nucleotide polymorphisms (SNPs). Genotyping a selection of these SNPs provides a robust, low-cost method of identifying parasite infections through their unique genetic signature or barcode. Based on our experience in generating a SNP barcode for P. falciparum using High Resolution Melting (HRM), we have developed a similar tool for P. vivax. We selected globally polymorphic SNPs from available P. vivax genome sequence data that were located in putatively selectively neutral sites (i.e., intergenic, intronic, or 4-fold degenerate coding). From these candidate SNPs we defined a barcode consisting of 42 SNPs. We analyzed the performance of the 42-SNP barcode on 87 P. vivax clinical samples from parasite populations in South America (Brazil, French Guiana), Africa (Ethiopia) and Asia (Sri Lanka). We found that the P. vivax barcode is robust, as it requires only a small quantity of DNA (limit of detection 0.3 ng/μl) to yield reproducible genotype calls, and detects polymorphic genotypes with high sensitivity. The markers are informative across all clinical samples evaluated (average minor allele frequency > 0.1). Population genetic and statistical analyses show the barcode captures high degrees of population diversity and differentiates geographically distinct populations. Our 42-SNP barcode provides a robust, informative, and standardized genetic marker set that accurately identifies a genomic signature for P. vivax infections. PMID:25781890
Applying plant DNA barcodes to identify species of Parnassia (Parnassiaceae).
Yang, Jun-Bo; Wang, Yi-Ping; Möller, Michael; Gao, Lian-Ming; Wu, Ding
2012-03-01
DNA barcoding is a technique to identify species by using standardized DNA sequences. In this study, a total of 105 samples, representing 30 Parnassia species, were collected to test the effectiveness of four proposed DNA barcodes (rbcL, matK, trnH-psbA and ITS) for species identification. Our results demonstrated that all four candidate DNA markers have a maximum level of primer universality and sequencing success. As a single DNA marker, the ITS region provided the highest species resolution with 86.7%, followed by trnH-psbA with 73.3%. The combination of the core barcode regions, matK+rbcL, gave the lowest species identification success (63.3%) among any combination of multiple markers and was found unsuitable as a DNA barcode for Parnassia. The combination of ITS+trnH-psbA achieved the highest species discrimination with 90.0% resolution (27 of 30 sampled species), equal to the four-marker combination and higher than any two- or three-marker combination including rbcL or matK. Therefore, matK and rbcL should not be used as DNA barcodes for the species identification of Parnassia. Based on the overall performance, the combination of ITS+trnH-psbA is proposed as the most suitable DNA barcode for identifying Parnassia species. DNA barcoding is a useful technique and provides a reliable and effective means for the discrimination of Parnassia species, and in combination with morphology-based taxonomy, will be a robust approach for tackling taxonomically complex groups. In the light of our findings, we found among the three species not identified a possible cryptic speciation event in Parnassia. © 2011 Blackwell Publishing Ltd.
Patterns of DNA barcode variation in Canadian marine molluscs.
Layton, Kara K S; Martel, André L; Hebert, Paul D N
2014-01-01
Molluscs are the most diverse marine phylum and this high diversity has resulted in considerable taxonomic problems. Because the number of species in Canadian oceans remains uncertain, there is a need to incorporate molecular methods into species identifications. A 648 base pair segment of the cytochrome c oxidase subunit I gene has proven useful for the identification and discovery of species in many animal lineages. While the utility of DNA barcoding in molluscs has been demonstrated in other studies, this is the first effort to construct a DNA barcode registry for marine molluscs across such a large geographic area. This study examines patterns of DNA barcode variation in 227 species of Canadian marine molluscs. Intraspecific sequence divergences ranged from 0-26.4% and a barcode gap existed for most taxa. Eleven cases of relatively deep (>2%) intraspecific divergence were detected, suggesting the possible presence of overlooked species. Structural variation was detected in COI with indels found in 37 species, mostly bivalves. Some indels were present in divergent lineages, primarily in the region of the first external loop, suggesting certain areas are hotspots for change. Lastly, mean GC content varied substantially among orders (24.5%-46.5%), and showed a significant positive correlation with nearest neighbour distances. DNA barcoding is an effective tool for the identification of Canadian marine molluscs and for revealing possible cases of overlooked species. Some species with deep intraspecific divergence showed a biogeographic partition between lineages on the Atlantic, Arctic and Pacific coasts, suggesting the role of Pleistocene glaciations in the subdivision of their populations. Indels were prevalent in the barcode region of the COI gene in bivalves and gastropods. This study highlights the efficacy of DNA barcoding for providing insights into sequence variation across a broad taxonomic group on a large geographic scale.
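The GC content figures quoted above are simple base-composition percentages. A minimal sketch of the calculation for a single barcode sequence follows; it counts only unambiguous A, C, G and T bases, which is an assumption of this example, and the function name `gc_content` is hypothetical.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    print(gc_content("ATGCATGC"))
```

A mean GC content per order would be the average of this value over the barcode sequences assigned to that order.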
Weger-Lucarelli, James; Garcia, Selene M; Rückert, Claudia; Byas, Alex; O'Connor, Shelby L; Aliota, Matthew T; Friedrich, Thomas C; O'Connor, David H; Ebel, Gregory D
2018-06-20
Arboviruses such as Zika virus (ZIKV, Flaviviridae; Flavivirus) must replicate in both mammalian and insect hosts possessing strong immune defenses. Accordingly, transmission between and replication within hosts involves genetic bottlenecks, during which viral population size and genetic diversity may be significantly reduced. To help quantify these bottlenecks and their effects, we constructed 4 "barcoded" ZIKV populations that theoretically contain thousands of barcodes each. After identifying the most diverse barcoded virus, we passaged this virus 3 times in 2 mammalian and mosquito cell lines and characterized the population using deep sequencing of the barcoded region of the genome. C6/36 cells maintained higher barcode diversity than Vero cells, even after 3 passages. Additionally, field-caught mosquitoes were exposed to the virus to assess bottlenecks in a natural host. A progressive reduction in barcode diversity occurred throughout systemic infection of these mosquitoes. Differences in bottlenecks during systemic spread were observed between different populations of Aedes aegypti. Copyright © 2018. Published by Elsevier Inc.
2009-01-01
Background: Parthenium argentatum (guayule) is an industrial crop that produces latex, which was recently commercialized as a source of latex rubber safe for people with Type I latex allergy. The complete plastid genome of P. argentatum was sequenced. The sequence provides important information useful for genetic engineering strategies. Comparison to the sequences of plastid genomes from three other members of the Asteraceae, Lactuca sativa, Guizotia abyssinica and Helianthus annuus, revealed details of the evolution of the four genomes. Chloroplast-specific DNA barcodes were developed for identification of Parthenium species and lines. Results: The complete plastid genome of P. argentatum is 152,803 bp. Based on the overall comparison of individual protein coding genes with those in L. sativa, G. abyssinica and H. annuus, we demonstrate that the P. argentatum chloroplast genome sequence is most closely related to that of H. annuus. Similar to chloroplast genomes in G. abyssinica, L. sativa and H. annuus, the plastid genome of P. argentatum has a large 23 kb inversion with a smaller 3.4 kb inversion within the large inversion. Using the matK and psbA-trnH spacer chloroplast DNA barcodes, three of the four Parthenium species tested, P. tomentosum, P. hysterophorus and P. schottii, can be differentiated from P. argentatum. In addition, we identified lines within P. argentatum. Conclusion: The genome sequence of the P. argentatum chloroplast will enrich the sequence resources of plastid genomes in commercial crops. The availability of the complete plastid genome sequence may facilitate transformation efficiency by using the precise sequence of endogenous flanking sequences and regulatory elements in chloroplast transformation vectors. The DNA barcoding study forms the foundation for genetic identification of commercially significant lines of P. argentatum that are important for producing latex. PMID:19917140
Forensic identification of CITES protected slimming cactus (Hoodia) using DNA barcoding.
Gathier, Gerard; van der Niet, Timotheus; Peelen, Tamara; van Vugt, Rogier R; Eurlings, Marcel C M; Gravendeel, Barbara
2013-11-01
Slimming cactus (Hoodia), found only in southwestern Africa, is a well-known herbal product for losing weight. Consequently, Hoodia extracts are sought-after worldwide despite a CITES Appendix II status. The failure to eradicate illegal trade is due to problems with detecting and identifying Hoodia using morphological and chemical characters. Our aim was to evaluate the potential of molecular identification of Hoodia based on DNA barcoding. Screening of nrITS1 and psbA-trnH DNA sequences from 26 accessions of Ceropegieae resulted in successful identification, while conventional chemical profiling using DLI-MS led to inaccurate detection and identification of Hoodia. The presence of Hoodia in herbal products was also successfully established using DNA sequences. A validation procedure of our DNA barcoding protocol demonstrated its robustness to changes in PCR conditions. We conclude that DNA barcoding is an effective tool for Hoodia detection and identification which can contribute to preventing illegal trade. © 2013 American Academy of Forensic Sciences.
Makarchenko, Eugenyi A; Makarchenko, Marina A; Semenchenko, Alexander A
2015-08-14
Illustrated descriptions of the adult male, pupa and fourth instar larva of Hydrobaenus majus sp. nov., together with its DNA barcode, are provided in comparison with the closely related species H. sikhotealinensis Makarchenko et Makarchenko from the Russian Far East. The species-specificity of H. majus sp. nov. COI sequences is analyzed, and the sequences are presented as diagnostic characters (molecular markers) of H. majus and H. sikhotealinensis.
Morise, Hisashi; Miyazaki, Erika; Yoshimitsu, Shoko; Eki, Toshihiko
2012-01-01
Soil nematodes play crucial roles in the soil food web and are a suitable indicator for assessing soil environments and ecosystems. Previous nematode community analyses based on nematode morphology classification have been shown to be useful for assessing various soil environments. Here we have conducted DNA barcode analysis for soil nematode community analyses in Japanese soils. We isolated nematodes from two different environmental soils of an unmanaged flowerbed and an agricultural field using the improved flotation-sieving method. Small subunit (SSU) rDNA fragments were directly amplified from each of 68 (flowerbed samples) and 48 (field samples) isolated nematodes to determine the nucleotide sequence. Sixteen and thirteen operational taxonomic units (OTUs) were obtained by multiple sequence alignment from the flowerbed and agricultural field nematodes, respectively. All 29 SSU rDNA-derived OTUs (rOTUs) were further mapped onto a phylogenetic tree with 107 known nematode species. Interestingly, the two nematode communities examined were clearly distinct from each other in terms of trophic groups: Animal predators and plant feeders were markedly abundant in the flowerbed soils, in contrast, bacterial feeders were dominantly observed in the agricultural field soils. The data from the flowerbed nematodes suggests a possible food web among two different trophic nematode groups and plants (weeds) in the closed soil environment. Finally, DNA sequences derived from the mitochondrial cytochrome oxidase c subunit 1 (COI) gene were determined as a DNA barcode from 43 agricultural field soil nematodes. These nematodes were assigned to 13 rDNA-derived OTUs, but in the COI gene analysis were assigned to 23 COI gene-derived OTUs (cOTUs), indicating that COI gene-based barcoding may provide higher taxonomic resolution than conventional SSU rDNA-barcoding in soil nematode community analysis. PMID:23284767
Hussain, Fatma; Ahmed, Nisar; Ghorbani, Abdolbaset
2018-01-01
In pursuit of developing fast and accurate species-level molecular identification methods, we tested six DNA barcodes, namely ITS2, matK, rbcLa, ITS2+matK, ITS2+rbcLa, matK+rbcLa and ITS2+matK+rbcLa, for their capacity to identify frequently consumed but geographically isolated medicinal species of Fabaceae and Poaceae indigenous to the desert of Cholistan. Data were analysed by BLASTn sequence similarity, pairwise sequence divergence in TAXONDNA, and phylogenetic (neighbour-joining and maximum-likelihood trees) methods. Comparison of six barcode regions showed that ITS2 has the highest number of variable sites (209/360) for tested Fabaceae and (106/365) Poaceae species, the highest species-level identification (40%) in BLASTn procedure, distinct DNA barcoding gap, 100% correct species identification in BM and BCM functions of TAXONDNA, and clear cladding pattern with high nodal support in phylogenetic trees in both families. ITS2+matK+rbcLa followed ITS2 in its species-level identification capacity. The study was concluded with advocating the DNA barcoding as an effective tool for species identification and ITS2 as the best barcode region in identifying medicinal species of Fabaceae and Poaceae. Current research has practical implementation potential in the fields of pharmaco-vigilance, trade of medicinal plants and biodiversity conservation. PMID:29576968
The LAM-PCR Method to Sequence LV Integration Sites.
Wang, Wei; Bartholomae, Cynthia C; Gabriel, Richard; Deichmann, Annette; Schmidt, Manfred
2016-01-01
Integrating viral gene transfer vectors are commonly used gene delivery tools in clinical gene therapy trials providing stable integration and continuous gene expression of the transgene in the treated host cell. However, integration of the reverse-transcribed vector DNA into the host genome is a potentially mutagenic event that may directly contribute to unwanted side effects. A comprehensive and accurate analysis of the integration site (IS) repertoire is indispensable to study clonality in transduced cells obtained from patients undergoing gene therapy and to identify potential in vivo selection of affected cell clones. To date, next-generation sequencing (NGS) of vector-genome junctions allows sophisticated studies on the integration repertoire in vitro and in vivo. We have explored the use of the Illumina MiSeq Personal Sequencer platform to sequence vector ISs amplified by non-restrictive linear amplification-mediated PCR (nrLAM-PCR) and LAM-PCR. MiSeq-based high-quality IS sequence retrieval is accomplished by the introduction of a double-barcode strategy that substantially minimizes the frequency of IS sequence collisions compared to the conventionally used single-barcode protocol. Here, we present an updated protocol of (nr)LAM-PCR for the analysis of lentiviral IS using a double-barcode system and followed by deep sequencing using the MiSeq device.
DNA Barcoding Green Microalgae Isolated from Neotropical Inland Waters
Hadi, Sámed I. I. A.; Santana, Hugo; Brunale, Patrícia P. M.; Gomes, Taísa G.; Oliveira, Márcia D.; Matthiensen, Alexandre; Oliveira, Marcos E. C.; Silva, Flávia C. P.; Brasil, Bruno S. A. F.
2016-01-01
This study evaluated the feasibility of using the Ribulose Bisphosphate Carboxylase Large subunit gene (rbcL) and the Internal Transcribed Spacers 1 and 2 of the nuclear rDNA (nuITS1 and nuITS2) markers for identifying a very diverse, albeit poorly known, group of green microalgae from neotropical inland waters. Fifty-one freshwater green microalgae strains isolated from Brazil, the largest biodiversity reservoir in the neotropics, were submitted to DNA barcoding. Currently available universal primers for ITS1-5.8S-ITS2 region amplification were sufficient to successfully amplify and sequence 47 (92%) of the samples. On the other hand, new sets of primers had to be designed for rbcL, which allowed 96% of the samples to be sequenced. Thirty-five percent of the strains could be unambiguously identified to the species level based on either nuITS1 or nuITS2 sequences using barcode gap calculations. nuITS2 Compensatory Base Change (CBC) and ITS1-5.8S-ITS2 region phylogenetic analysis, together with morphological inspection, confirmed the identification accuracy. In contrast, only 6% of the strains could be assigned to the correct species based solely on rbcL sequences. In conclusion, the data presented here indicate that either nuITS1 or nuITS2 is a useful marker for DNA barcoding of freshwater green microalgae, with an advantage for nuITS2 due to the greater availability of analytical tools and reference barcodes deposited in databases for this marker. PMID:26900844
Evaluation of the DNA Barcodes in Dendrobium (Orchidaceae) from Mainland Asia
Xu, Songzhi; Li, Dezhu; Li, Jianwu; Xiang, Xiaoguo; Jin, Weitao; Huang, Weichang; Jin, Xiaohua; Huang, Luqi
2015-01-01
DNA barcoding has been proposed to be one of the most promising tools for accurate and rapid identification of taxa. However, few publications have evaluated the efficiency of DNA barcoding for the large genera of flowering plants. Dendrobium, one of the largest genera of flowering plants, contains many species that are important in horticulture, medicine and biodiversity conservation. Besides, Dendrobium is a notoriously difficult group to identify. DNA barcoding was expected to be a supplementary means for species identification, conservation and future studies in Dendrobium. We assessed the power of 11 candidate barcodes on the basis of 1,698 accessions of 184 Dendrobium species obtained primarily from mainland Asia. Our results indicated that five single barcodes, i.e., ITS, ITS2, matK, rbcL and trnH-psbA, can be easily amplified and sequenced with the currently established primers. Four barcodes, ITS, ITS2, ITS+matK, and ITS2+matK, have distinct barcoding gaps. ITS+matK was the optimal barcode based on all evaluation methods. Furthermore, the efficiency of ITS+matK was verified in four other large genera including Ficus, Lysimachia, Paphiopedilum, and Pedicularis in this study. Therefore, we tentatively recommend the combination of ITS+matK as a core DNA barcode for large flowering plant genera. PMID:25602282
Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E
2013-08-15
Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.
Hanner, Robert; Becker, Sven; Ivanova, Natalia V; Steinke, Dirk
2011-10-01
The Fish Barcode of Life campaign involves a broad international collaboration among scientists working to advance the identification of fishes using DNA barcodes. With over 25% of the world's known ichthyofauna currently profiled, forensic identification of seafood products is now feasible and is becoming routine. Driven by growing consumer interest in the food supply, investigative reporters from five different media establishments procured seafood samples (n = 254) from numerous retail establishments located among five Canadian metropolitan areas between 2008 and 2010. The specimens were sent to the Canadian Centre for DNA Barcoding for analysis. By integrating the results from these individual case studies in a summary analysis, we provide a broad perspective on seafood substitution across Canada. Barcodes were recovered from 93% of the samples (n = 236), and identified using the Barcode of Life Data Systems "species identification" engine ( www.barcodinglife.org ). A 99% sequence similarity threshold was employed as a conservative matching criterion for specimen identification to the species level. Comparing these results against the Canadian Food Inspection Agency's "Fish List" a guideline to interpreting "false, misleading or deceptive" names (as per s 27 of the Fish Inspection regulations) demonstrated that 41% of the samples were mislabeled. Most samples were readily identified; however, this was not true in all cases because some samples had no close match. Others were ambiguous due to limited barcode resolution (or imperfect taxonomy) observed within a few closely related species complexes. The latter cases did not significantly impact the results because even the partial resolution achieved was sufficient to demonstrate mislabeling. This work highlights the functional utility of barcoding for the identification of diverse market samples. 
It also demonstrates how barcoding serves as a bridge linking scientific nomenclature with approved market names, potentially empowering regulatory bodies to enforce labeling standards. By synchronizing taxonomic effort with sequencing effort and database curation, barcoding provides a molecular identification resource of service to applied forensics.
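The 99% similarity criterion described in this abstract can be illustrated with a minimal sketch. This is not the Barcode of Life Data Systems identification engine; it assumes the query and reference barcodes have already been aligned to equal length, and the function names `percent_identity` and `species_match` are hypothetical.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percent identity over aligned columns, skipping any column with a gap."""
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for q, r in zip(query.upper(), reference.upper()):
        if q == "-" or r == "-":
            continue  # gap columns are not counted (an assumption of this sketch)
        compared += 1
        if q == r:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def species_match(query: str, references: dict, threshold: float = 99.0):
    """Return (name, identity) for the best reference at or above the
    threshold, or None when no reference is close enough."""
    best = None
    for name, seq in references.items():
        identity = percent_identity(query, seq)
        if best is None or identity > best[1]:
            best = (name, identity)
    if best is None or best[1] < threshold:
        return None
    return best
```

A query with no reference at or above the threshold returns None, which corresponds to the "no close match" cases the abstract mentions.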
Elansary, Hosam O; Ashfaq, Muhammad; Ali, Hayssam M; Yessoufou, Kowiyou
2017-01-01
DNA barcoding relies on short and standardized gene regions to identify species. The agricultural and horticultural applications of barcoding such as for marketplace regulation and copyright protection remain poorly explored. This study examines the effectiveness of the standard plant barcode markers (matK and rbcL) for the identification of plant species in private and public nurseries in northern Egypt. These two markers were sequenced from 225 specimens of 161 species and 62 plant families of horticultural importance. The sequence recovery was similar for rbcL (96.4%) and matK (84%), but the number of specimens assigned correctly to the respective genera and species was lower for rbcL (75% and 29%) than matK (85% and 40%). The combination of rbcL and matK brought the number of correct generic and species assignments to 83.4% and 40%, respectively. Individually, the efficiency of both markers varied among different plant families; for example, all palm specimens (Arecaceae) were correctly assigned to species while only one individual of Asteraceae was correctly assigned to species. Further, barcodes reliably assigned ornamental horticultural and medicinal plants correctly to genus while they showed a lower or no success in assigning these plants to species and cultivars. For the future, we recommend the combination of a complementary barcode (e.g. ITS or trnH-psbA) with rbcL + matK to increase the performance of taxa identification. By aiding species identification of horticultural crops and ornamental palms, the analysis of the barcode regions will have a large impact on the horticultural industry. PMID:28199378
Shen, Yanjun; Guan, Lihong; Wang, Dengqiang; Gan, Xiaoni
2016-05-01
The Yangtze River is the longest river in China and is divided into upstream and mid-downstream regions by the Three Gorges (the natural barriers of the Yangtze River), resulting in a complex distribution of fish. Dramatic changes to habitat environments may ultimately threaten fish survival; thus, it is necessary to evaluate the genetic diversity and propose protective measures. Species identification is the most significant task in many fields of biological research and in conservation efforts. DNA barcoding, which constitutes the analysis of a short fragment of the mitochondrial cytochrome c oxidase subunit I (COI) sequence, has been widely used for species identification. In this study, we collected 561 COI barcode sequences from 35 fish from the midstream of the Yangtze River. The intraspecific distances of all species were below 2% (with the exception of Acheilognathus macropterus and Hemibarbus maculatus). Nevertheless, all species could be unambiguously identified from the trees, barcoding gaps and taxonomic resolution ratio values. Furthermore, the COI barcode diversity was found to be low (≤0.5%), with the exception of H. maculatus (0.87%), A. macropterus (2.02%) and Saurogobio dabryi (0.82%). No or few shared haplotypes were detected between the upstream and downstream populations for ten species with overall nucleotide diversities greater than 0.00%, which indicated the likelihood of significant population genetic structuring. Our analyses indicated that DNA barcoding is an effective tool for the identification of cyprinidae fish in the midstream of the Yangtze River. It is vital that some protective measures be taken immediately because of the low COI barcode diversity.
Zhang, Jian-Qiang; Meng, Shi-Yong; Wen, Jun; Rao, Guang-Yuan
2015-01-01
DNA barcoding, the identification of species using one or a few short standardized DNA sequences, is an important complement to traditional taxonomy. However, there are particular challenges for barcoding plants, especially for species with complex evolutionary histories. We herein evaluated the utility of five candidate sequences — rbcL, matK, trnH-psbA, trnL-F and the internal transcribed spacer (ITS) — for barcoding Rhodiola species, a group of high-altitude plants frequently used as adaptogens, hemostatics and tonics in traditional Tibetan medicine. Rhodiola was suggested to have diversified rapidly recently. The genus is thus a good model for testing DNA barcoding strategies for recently diversified medicinal plants. This study analyzed 189 accessions, representing 47 of the 55 recognized Rhodiola species in the Flora of China treatment. Based on intraspecific and interspecific divergence and degree of monophyly statistics, ITS was the best single-locus barcode, resolving 66% of the Rhodiola species. The core combination rbcL+matK resolved only 40.4% of them. Unsurprisingly, the combined use of all five loci provided the highest discrimination power, resolving 80.9% of the species. However, this is weaker than the discrimination power generally reported in barcoding studies of other plant taxa. The observed complications may be due to the recent diversification, incomplete lineage sorting and reticulate evolution of the genus. These processes are common features of numerous plant groups in the high-altitude regions of the Qinghai-Tibetan Plateau. PMID:25774915
Marsic, Damien; Méndez-Gómez, Héctor R; Zolotukhin, Sergei
2015-01-01
Biodistribution analysis is a key step in the evaluation of adeno-associated virus (AAV) capsid variants, whether natural isolates or produced by rational design or directed evolution. Indeed, when screening candidate vectors, accurate knowledge about which tissues are infected and how efficiently is essential. We describe the design, validation, and application of a new vector, pTR-UF50-BC, encoding a bioluminescent protein, a fluorescent protein and a DNA barcode, which can be used to visualize localization of transduction at the organism, organ, tissue, or cellular levels. In addition, by linking capsid variants to different barcoded versions of the vector and amplifying the barcode region from various tissue samples using barcoded primers, biodistribution of viral genomes can be analyzed with high accuracy and efficiency.
Peñafiel, Nicolás; Arteaga, Alejandro; Bustamante, Lucas; Pichardo, Frank; Coloma, Luis A; Barrio-Amorós, César L; Salazar-Valenzuela, David; Prost, Stefan
2018-01-01
Abstract Background Advancements in portable scientific instruments provide promising avenues to expedite field work in order to understand the diverse array of organisms that inhabit our planet. Here, we tested the feasibility for in situ molecular analyses of endemic fauna using a portable laboratory fitting within a single backpack in one of the world's most imperiled biodiversity hotspots, the Ecuadorian Chocó rainforest. We used portable equipment, including the MinION nanopore sequencer (Oxford Nanopore Technologies) and the miniPCR (miniPCR), to perform DNA extraction, polymerase chain reaction amplification, and real-time DNA barcoding of reptile specimens in the field. Findings We demonstrate that nanopore sequencing can be implemented in a remote tropical forest to quickly and accurately identify species using DNA barcoding, as we generated consensus sequences for species resolution with an accuracy of >99% in less than 24 hours after collecting specimens. The flexibility of our mobile laboratory further allowed us to generate sequence information at the Universidad Tecnológica Indoamérica in Quito for rare, endangered, and undescribed species. This includes the recently rediscovered Jambato toad, which was thought to be extinct for 28 years. Sequences generated on the MinION required as few as 30 reads to achieve high accuracy relative to Sanger sequencing, and with further multiplexing of samples, nanopore sequencing can become a cost-effective approach for rapid and portable DNA barcoding. Conclusions Overall, we establish how mobile laboratories and nanopore sequencing can help to accelerate species identification in remote areas to aid in conservation efforts and be applied to research facilities in developing countries. 
This opens up possibilities for biodiversity studies by promoting local research capacity building, teaching nonspecialists and students about the environment, tackling wildlife crime, and promoting conservation via research-focused ecotourism. PMID:29617771
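The idea that a few dozen noisy reads can yield an accurate barcode can be illustrated with a naive column-wise majority vote. This is only a sketch and not the consensus pipeline used in the study; it assumes the reads have already been aligned to equal length with "-" marking gaps, and the function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads: list) -> str:
    """Call the most frequent symbol in each alignment column.

    Columns where a gap ("-") is the most frequent symbol are dropped
    from the consensus.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to equal length")
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != "-":
            consensus.append(base)
    return "".join(consensus)
```

Because sequencing errors tend to fall at different positions in different reads, each column's correct base usually outvotes the errors once enough reads are stacked.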
Hawlitschek, Oliver; Nagy, Zoltán T.; Berger, Johannes; Glaw, Frank
2013-01-01
In the past decade, DNA barcoding became increasingly common as a method for species identification in biodiversity inventories and related studies. However, mainly due to technical obstacles, squamate reptiles have been the target of few barcoding studies. In this article, we present the results of a DNA barcoding study of squamates of the Comoros archipelago, a poorly studied group of oceanic islands close to and mostly colonized from Madagascar. The barcoding dataset presented here includes 27 of the 29 currently recognized squamate species of the Comoros, including 17 of the 18 endemic species. Some species considered endemic to the Comoros according to current taxonomy were found to cluster with non-Comoran lineages, probably due to poorly resolved taxonomy. All other species for which more than one barcode was obtained corresponded to distinct clusters useful for species identification by barcoding. In most species, even island populations could be distinguished using barcoding. Two cryptic species were identified using the DNA barcoding approach. The obtained barcoding topology, a Bayesian tree based on COI sequences of 5 genera, was compared with available multigene topologies, and in 3 cases, major incongruences between the two topologies became evident. Three of the multigene studies were initiated after initial screening of a preliminary version of the barcoding dataset presented here. We conclude that in the case of the squamates of the Comoros Islands, DNA barcoding has proven a very useful and efficient way of detecting isolated populations and promising starting points for subsequent research. PMID:24069192
DNA barcode-based molecular identification system for fish species.
Kim, Sungmin; Eo, Hae-Seok; Koo, Hyeyoung; Choi, Jun-Kil; Kim, Won
2010-12-01
In this study, we applied DNA barcoding to identify species using short DNA sequence analysis. We examined the utility of DNA barcoding by identifying 53 Korean freshwater fish species, 233 other freshwater fish species, and 1339 saltwater fish species. We successfully developed a web-based molecular identification system for fish (MISF) using a profile hidden Markov model. MISF facilitates efficient and reliable species identification, overcoming the limitations of conventional taxonomic approaches. MISF is freely accessible at http://bioinfosys.snu.ac.kr:8080/MISF/misf.jsp .
Chemical genomic profiling via barcode sequencing to predict compound mode of action
Piotrowski, Jeff S.; Simpkins, Scott W.; Li, Sheena C.; Deshpande, Raamesh; McIlwain, Sean; Ong, Irene; Myers, Chad L.; Boone, Charlie; Andersen, Raymond J.
2015-01-01
Summary Chemical genomics is an unbiased, whole-cell approach to characterizing novel compounds to determine mode of action and cellular target. Our version of this technique is built upon barcoded deletion mutants of Saccharomyces cerevisiae and has been adapted to a high-throughput methodology using next-generation sequencing. Here we describe the steps to generate a chemical genomic profile from a compound of interest, and how to use this information to predict molecular mechanism and targets of bioactive compounds. PMID:25618354
DNA Barcoding in Fragaria L. (Strawberry) Species
USDA-ARS's Scientific Manuscript database
DNA barcoding for species identification using a short DNA sequence has been successful in animals due to rapid mutation rates of the mitochondrial genome, where the animal DNA barcode, the cytochrome c oxidase 1 gene, is located. The chloroplast PsbA-trnH spacer and the nuclear ribosomal internal transc...
DNA Barcode for Identifying Folium Artemisiae Argyi from Counterfeits.
Mei, Quanxi; Chen, Xiaolu; Xiang, Li; Liu, Yue; Su, Yanyan; Gao, Yuqiao; Dai, Weibo; Dong, Pengpeng; Chen, Shilin
2016-01-01
Folium Artemisiae Argyi is an important herb in traditional Chinese medicine. It is commonly used in moxibustion, medicine, etc. However, identifying Artemisia argyi is difficult because this herb exhibits similar morphological characteristics to closely related species and counterfeits. To verify the applicability of DNA barcoding, ITS2 and psbA-trnH were used to identify A. argyi from 15 closely related species and counterfeits. Results indicated that total DNA was easily extracted from all the samples and that both ITS2 and psbA-trnH fragments can be easily amplified. ITS2 was a more ideal barcode than psbA-trnH and ITS2+psbA-trnH to identify A. argyi from closely related species and counterfeits on the basis of sequence character, genetic distance, and tree methods. The sequence length was 225 bp for the 56 ITS2 sequences of A. argyi, and no variable site was detected. For the ITS2 sequences, A. capillaris, A. anomala, A. annua, A. igniaria, A. maximowicziana, A. princeps, Dendranthema vestitum, and D. indicum had single nucleotide polymorphisms (SNPs). The intraspecific Kimura 2-Parameter distance was zero, which is lower than the minimum interspecific distance (0.005). A. argyi, the closely related species, and counterfeits, except for Artemisia maximowicziana and Artemisia sieversiana, were separated into pairs of divergent clusters by using the neighbor joining, maximum parsimony, and maximum likelihood tree methods. Thus, the ITS2 sequence was an ideal barcode to identify A. argyi from closely related species and counterfeits to ensure the safe use of this plant.
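The Kimura 2-Parameter (K2P) distance and the barcoding-gap comparison mentioned in this abstract can be sketched as follows. The K2P formula is the standard one, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The code is an illustration only, not the software used in the study; it assumes pre-aligned sequences, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-Parameter distance between two aligned DNA sequences.

    Columns containing anything other than A, C, G or T are skipped.
    math.log raises ValueError if the sequences are too divergent for
    the formula to be defined.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    sites = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable positions")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def has_barcode_gap(max_intraspecific: float, min_interspecific: float) -> bool:
    """A barcoding gap exists when the largest within-species distance is
    smaller than the smallest between-species distance."""
    return max_intraspecific < min_interspecific
```

With the values reported above for A. argyi (intraspecific distance 0, minimum interspecific distance 0.005), `has_barcode_gap(0.0, 0.005)` is true.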
Shapcott, Alison; Forster, Paul I.; Guymer, Gordon P.; McDonald, William J. F.; Faith, Daniel P.; Erickson, David; Kress, W. John
2015-01-01
Australian rainforests have been fragmented due to past climatic changes and more recently landscape change as a result of clearing for agriculture and urban spread. The subtropical rainforests of South Eastern Queensland are significantly more fragmented than the tropical World Heritage listed northern rainforests and are subject to much greater human population pressures. The Australian rainforest flora is relatively taxonomically rich at the family level, but less so at the species level. Current methods to assess biodiversity based on species numbers fail to adequately capture this richness at higher taxonomic levels. We developed a DNA barcode library for the SE Queensland rainforest flora to support a methodology for biodiversity assessment that incorporates both taxonomic diversity and phylogenetic relationships. We placed our SE Queensland phylogeny based on a three marker DNA barcode within a larger international rainforest barcode library and used this to calculate phylogenetic diversity (PD). We compared phylodiversity measures, species composition and richness and ecosystem diversity of the SE Queensland rainforest estate to identify which bio subregions contain the greatest rainforest biodiversity, subregion relationships and their level of protection. We identified areas of highest conservation priority. Diversity was not correlated with rainforest area in SE Queensland subregions but PD was correlated with both the percent of the subregion occupied by rainforest and the diversity of regional ecosystems (RE) present. The patterns of species diversity and phylogenetic diversity suggest a strong influence of historical biogeography. Some subregions contain significantly more PD than expected by chance, consistent with the concept of refugia, while others were significantly phylogenetically clustered, consistent with recent range expansions. PMID:25803607
Advances in DNA metabarcoding for food and wildlife forensic species identification.
Staats, Martijn; Arulandhu, Alfred J; Gravendeel, Barbara; Holst-Jensen, Arne; Scholtens, Ingrid; Peelen, Tamara; Prins, Theo W; Kok, Esther
2016-07-01
Species identification using DNA barcodes has been widely adopted by forensic scientists as an effective molecular tool for tracking adulterations in food and for analysing samples from alleged wildlife crime incidents. DNA barcoding is an approach that involves sequencing of short DNA sequences from standardized regions and comparison to a reference database as a molecular diagnostic tool in species identification. In recent years, remarkable progress has been made towards developing DNA metabarcoding strategies, which involve next-generation sequencing of DNA barcodes for the simultaneous detection of multiple species in complex samples. Metabarcoding strategies can be used in processed materials containing highly degraded DNA, e.g. for the identification of endangered and hazardous species in traditional medicine. This review aims to provide insight into advances of plant and animal DNA barcoding and highlights current practices and recent developments for DNA metabarcoding of food and wildlife forensic samples from a practical point of view. Special emphasis is placed on new developments for identifying species listed in the Convention on International Trade in Endangered Species (CITES) appendices for which reliable methods for species identification may signal and/or prevent illegal trade. Current technological developments and challenges of DNA metabarcoding for forensic scientists will be assessed in the light of stakeholders' needs.
DNA barcode variability and host plant usage of fruit flies (Diptera: Tephritidae) in Thailand.
Kunprom, Chonticha; Pramual, Pairot
2016-10-01
The objectives of this study were to examine the genetic variation in fruit flies (Diptera: Tephritidae) in Thailand and to test the efficiency of the mitochondrial cytochrome c oxidase subunit I (COI) barcoding region for species-level identification. Twelve fruit fly species were collected from 24 host plant species of 13 families. The number of host plant species for each fruit fly species ranged between 1 and 11, with Bactrocera correcta found in the most diverse host plants. A total of 123 COI sequences were obtained from these fruit fly species. Sequences from the NCBI database were also included, for a total of 17 species analyzed. DNA barcoding identification analysis based on the best close match method revealed a good performance, with 94.4% of specimens correctly identified. However, many specimens (3.6%) had ambiguous identification, mostly due to intra- and interspecific overlap between members of the B. dorsalis complex. A phylogenetic tree based on the mitochondrial barcode sequences indicated that all species, except for the members of the B. dorsalis complex, were monophyletic with strong support. Our work supports recent calls for synonymization of these species. Divergent lineages were observed within B. correcta and B. tuberculata, and this suggested that these species need further taxonomic reexamination.
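The "best close match" criterion named in this abstract can be illustrated with a short sketch: a query is assigned to the species of its nearest reference barcode, provided that the nearest distance falls within a threshold and all nearest references belong to one species. This is an illustration under stated assumptions, not the authors' analysis. The function names, the uncorrected p-distance and the 3% default threshold are hypothetical choices, since the abstract does not state the distance measure or threshold used.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_close_match(query, references, threshold=0.03):
    """Classify a query barcode against (species, sequence) reference pairs.

    Returns ("identified", species), ("ambiguous", None) or ("no match", None).
    The 3% default threshold is an assumption made for this sketch.
    """
    distances = [(p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in distances)
    if best > threshold:
        return ("no match", None)
    # Several species tied at the smallest distance give an ambiguous result.
    species_at_best = {species for d, species in distances if d == best}
    if len(species_at_best) > 1:
        return ("ambiguous", None)
    return ("identified", species_at_best.pop())


if __name__ == "__main__":
    refs = [("sp1", "ACGTACGTAC"), ("sp2", "ACGTACGTTT")]
    print(best_close_match("ACGTACGTAC", refs))
```

An ambiguous outcome of the kind reported for the B. dorsalis complex corresponds to the middle branch: equally good matches to barcodes that carry different species names.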
Syromyatnikov, Mikhail Y.; Golub, Victor B.; Kokina, Anastasia V.; Soboleva, Victoria A.; Popov, Vasily N.
2017-01-01
The genus Eurygaster Laporte, 1833 includes ten species, five of which inhabit the European part of Russia. The harmful species of the genus is E. integriceps. Eurygaster species identification based on morphological traits is very difficult, while identification at the egg or larval stages is extremely difficult or impossible. Eurygaster integriceps, E. maura, and E. testudinaria differ only slightly from each other morphologically, E. maura and E. testudinaria being almost indiscernible. DNA barcoding based on COI sequences has shown that E. integriceps differs significantly from these closely related species, which enables its rapid and accurate identification. Based on COI nucleotide sequences, three species of Sunn pests, E. maura, E. testudinaria, and E. dilaticollis, could not be differentiated from each other through DNA barcoding. The difference in the DNA sequences between the COI gene of E. integriceps and the COI genes of E. maura and E. testudinaria was more than 4%. In the present study DNA barcoding of two Eurygaster species was performed for the first time on E. integriceps, the most dangerous pest in the genus, and E. dilaticollis, which only inhabits natural ecosystems. The PCR-RFLP method was developed in this work for the rapid identification of E. integriceps. PMID:29118620
A checklist of the bats of Peninsular Malaysia and progress towards a DNA barcode reference library.
Lim, Voon-Ching; Ramli, Rosli; Bhassu, Subha; Wilson, John-James
2017-01-01
Several published checklists of bat species have covered Peninsular Malaysia as part of a broader region and/or in combination with other mammal groups. Other researchers have produced comprehensive checklists for specific localities within the peninsula. To our knowledge, a comprehensive checklist of bats specifically for the entire geopolitical region of Peninsular Malaysia has never been published, yet knowing which species are present in Peninsular Malaysia and their distributions across the region is crucial in developing suitable conservation plans. Our literature search revealed that 110 bat species have been documented in Peninsular Malaysia; 105 species have precise locality records while five species lack recent and/or precise locality records. We retrieved 18 species from records dated before the year 2000 and seven species have only ever been recorded once. Our search of the Barcode of Life Data Systems (BOLD) found that 86 (of the 110) species have public records of which 48 species have public DNA barcodes available from bats sampled in Peninsular Malaysia. Based on Neighbour-Joining tree analyses and the allocation of DNA barcodes to the Barcode Index Number system (BINs) by BOLD, several DNA barcodes recorded under the same species name are likely to represent distinct taxa. We discuss these cases in detail and highlight the importance of further surveys to determine the occurrences and resolve the taxonomy of particular bat species in Peninsular Malaysia, with implications for conservation priorities.
Use of a rep-PCR system to predict species in the Aspergillus section Nigri.
Palencia, Edwin R; Klich, Maren A; Glenn, Anthony E; Bacon, Charles W
2009-10-01
The Aspergillus niger aggregate within the A. section Nigri is a group of black-spored aspergilli of great agro-economic importance whose well-defined taxonomy has been elusive. Rep-PCR has become a rapid and cost-effective method for genotyping fungi and bacteria. In the present study, we evaluated the discriminatory power of a semi-automated rep-PCR barcoding system to distinguish morphotypic species and compared the results with the data obtained from ITS and partial calmodulin regions. For this purpose, 20 morphotyped black-spored Aspergillus species were used to create the A. section Nigri library in this barcoding system that served to identify 34 field isolates. A pair-wise similarity matrix was calculated using the cone-based Pearson correlation method and the dendrogram was generated by the unweighted pair group method with arithmetic mean (UPGMA), illustrating four different clustered groups: the uniseriate cluster (I), the Aspergillus carbonarius cluster (II), and the two A. niger aggregate clusters (named III.A and III.B). Rep-PCR showed higher resolution than the ITS and the partial calmodulin gene analytical procedures. The data of the 34 unknown field isolates, collected from different locations in the United States, indicated that only 12% of the field isolates were >95% similar to one of the genotypes included in the A. section Nigri library. However, 64% of the field isolates matched genotypes with the reference library (similarity values >90%). Based on these results, this barcoding procedure has the potential for use as a reproducible tool for identifying the black-spored aspergilli.
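The two analysis steps named in this abstract, a pairwise Pearson similarity matrix followed by UPGMA clustering, can be sketched in a few lines. This is a simplified illustration rather than the commercial system's algorithm. It assumes each isolate is represented by a hypothetical list of fingerprint intensities, uses the plain Pearson coefficient, and takes 1 minus the correlation as the distance. All names and values are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def upgma(profiles):
    """Cluster profiles (name -> intensities) and return a nested-tuple tree.

    The distance between two clusters is the mean of 1 - Pearson correlation
    over all pairs of members, which is the UPGMA (average linkage) rule.
    """
    clusters = {name: [name] for name in profiles}  # tree -> member names

    def cluster_distance(members_a, members_b):
        total = sum(1 - pearson(profiles[a], profiles[b])
                    for a in members_a for b in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        keys = list(clusters)
        best = None
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                d = cluster_distance(clusters[keys[i]], clusters[keys[j]])
                if best is None or d < best[0]:
                    best = (d, keys[i], keys[j])
        _, a, b = best
        # Merge the closest pair into one cluster labelled by its subtree.
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


if __name__ == "__main__":
    toy = {"iso1": [1, 2, 3, 4], "iso2": [1, 2, 3, 5], "iso3": [4, 3, 2, 1]}
    print(upgma(toy))
```

In the toy data the two isolates with nearly identical profiles are joined first and the dissimilar profile is attached last, mirroring how similarity thresholds such as >90% or >95% would be read off a dendrogram.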
Using high-throughput barcode sequencing to efficiently map connectomes
Peikon, Ian D.; Kebschull, Justus M.; Vagin, Vasily V.; Ravens, Diana I.; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R.; Bressan, Dario
2017-01-01
Abstract The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision—a ‘connectome’—is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence—an RNA ‘barcode’—which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. PMID:28449067
Cytochrome c oxidase I primers for corbiculate bees: DNA barcode and mini-barcode.
Françoso, E; Arias, M C
2013-09-01
Bees (Apidae), of which there are more than 19 900 species, are extremely important for ecosystem services and economic purposes, so taxon identity is a major concern. The goal of this study was to optimize the DNA barcode technique based on the cytochrome c oxidase I (COI) mitochondrial gene region. This approach has previously been shown to be useful in resolving taxonomic inconsistencies and for species identification when morphological data are poor. Specifically, we designed and tested new primers and standardized PCR conditions to amplify the barcode region for bees, focusing on the corbiculate Apids. In addition, primers were designed to amplify small COI amplicons and tested with pinned specimens. Short barcode sequences were easily obtained for some century-old Bombus museum specimens and shown to be useful as mini-barcodes. The new primers and PCR conditions established in this study proved to be successful for the amplification of the barcode region for all species tested, regardless of the conditions of tissue preservation. We saw no evidence of Wolbachia or numts amplification by these primers, and so we suggest that these new primers are of broad value for corbiculate bee identification through DNA barcode. © 2013 John Wiley & Sons Ltd.
Dincă, Vlad; Montagud, Sergio; Talavera, Gerard; Hernández-Roldán, Juan; Munguira, Miguel L.; García-Barros, Enrique; Hebert, Paul D. N.; Vila, Roger
2015-01-01
How common are cryptic species - those overlooked because of their morphological similarity? Despite its wide-ranging implications for biology and conservation, the answer remains open to debate. Butterflies constitute the best-studied invertebrates, playing a similar role as birds do in providing models for vertebrate biology. An accurate assessment of cryptic diversity in this emblematic group requires meticulous case-by-case assessments, but a preview to highlight cases of particular interest will help to direct future studies. We present a survey of mitochondrial genetic diversity for the butterfly fauna of the Iberian Peninsula with unprecedented resolution (3502 DNA barcodes for all 228 species), creating a reliable system for DNA-based identification and for the detection of overlooked diversity. After compiling available data for European butterflies (5782 sequences, 299 species), we applied the Generalized Mixed Yule-Coalescent model to explore potential cryptic diversity at a continental scale. The results indicate that 27.7% of these species include from two to four evolutionary significant units (ESUs), suggesting that cryptic biodiversity may be higher than expected for one of the best-studied invertebrate groups and regions. The ESUs represent important units for conservation, models for studies of evolutionary and speciation processes, and sentinels for future research to unveil hidden diversity. PMID:26205828
DNA barcoding and molecular systematics of the benthic and demersal organisms of the CEAMARC survey
NASA Astrophysics Data System (ADS)
Dettai, Agnes; Adamowicz, Sarah J.; Allcock, Louise; Arango, Claudia P.; Barnes, David K. A.; Barratt, Iain; Chenuil, Anne; Couloux, Arnaud; Cruaud, Corinne; David, Bruno; Denis, Françoise; Denys, Gael; Díaz, Angie; Eléaume, Marc; Féral, Jean-Pierre; Froger, Aurélie; Gallut, Cyril; Grant, Rachel; Griffiths, Huw J.; Held, Christoph; Hemery, Lenaïg G.; Hosie, Graham; Kuklinski, Piotr; Lecointre, Guillaume; Linse, Katrin; Lozouet, Pierre; Mah, Christopher; Monniot, Françoise; Norman, Mark D.; O'Hara, Timothy; Ozouf-Costaz, Catherine; Piedallu, Claire; Pierrat, Benjamin; Poulin, Elie; Puillandre, Nicolas; Riddle, Martin; Samadi, Sarah; Saucède, Thomas; Schubart, Christoph; Smith, Peter J.; Stevens, Darren W.; Steinke, Dirk; Strugnell, Jan M.; Tarnowska, K.; Wadley, Victoria; Ameziane, Nadia
2011-08-01
The Dumont d’Urville Sea (East Antarctic region) has been less investigated for DNA barcoding and molecular taxonomy than other parts of the Southern Ocean, such as the Ross Sea and the Antarctic Peninsula. The Collaborative East Antarctic MARine Census (CEAMARC) took place in this area during the austral summer of 2007-2008. The Australian vessel RSV Aurora Australis collected very diverse samples of demersal and benthic organisms. The specimens were sorted centrally, and then distributed to taxonomic experts for molecular and morphological taxonomy and identification, especially barcoding. The COI sequences generated from CEAMARC material provide a sizeable proportion of the Census of Antarctic Marine Life barcodes although the studies are still ongoing, and represent the only source of sequences for a number of species. Barcoding appears to be a valuable method for identification within most groups, despite low divergences and haplotype sharing in a few species, and it is also useful as a preliminary taxonomic exploration method. Several new species are being described. CEAMARC samples have already provided new material for phylogeographic and phylogenetic studies in cephalopods, pycnogonids, teleost fish, crinoids and sea urchins, helping these studies to provide a better insight in the patterns of evolution in the Southern Ocean.
Palomares-Rius, J E; Cantalapiedra-Navarrete, C; Archidona-Yuste, A; Subbotin, S A; Castillo, P
2017-09-07
The traditional identification of plant-parasitic nematode species by morphology and morphometric studies is very difficult because of high morphological variability that can lead to considerable overlap of many characteristics and their ambiguous interpretation. For this reason, it is essential to implement approaches to ensure accurate species identification. DNA barcoding aids in identification and advances species discovery. This study sought to unravel the use of the mitochondrial marker cytochrome c oxidase subunit 1 (coxI) as barcode for Longidoridae species identification, and as a phylogenetic marker. The results showed that mitochondrial and ribosomal markers could be used as barcoding markers, except for some species from the Xiphinema americanum group. The ITS1 region showed a promising role in barcoding for species identification because of the clear molecular variability among species. Some species presented important molecular variability in coxI. The analysis of the newly provided sequences and the sequences deposited in GenBank showed plausible misidentifications, and the use of voucher species and topotype specimens is a priority for this group of nematodes. The use of coxI and D2 and D3 expansion segments of the 28S rRNA gene did not clarify the phylogeny at the genus level.
Cristescu, Melania E
2014-10-01
DNA-based species identification, known as barcoding, transformed the traditional approach to the study of biodiversity science. The field is transitioning from barcoding individuals to metabarcoding communities. This revolution involves new sequencing technologies, bioinformatics pipelines, computational infrastructure, and experimental designs. In this dynamic genomics landscape, metabarcoding studies remain insular and biodiversity estimates depend on the particular methods used. In this opinion article, I discuss the need for a coordinated advancement of DNA-based species identification that integrates taxonomic and barcoding information. Such an approach would facilitate access to almost 3 centuries of taxonomic knowledge and 1 decade of building repository barcodes. Conservation projects are time sensitive, research funding is becoming restricted, and informed decisions depend on our ability to embrace integrative approaches to biodiversity science. Copyright © 2014 Elsevier Ltd. All rights reserved.
Chao, Zhi; Liao, Jing; Liang, Zhenbiao; Huang, Suhua; Zhang, Liang; Li, Junde
2014-01-01
Objective: To test the feasibility of DNA barcoding for accurate identification of Jinqian Baihua She (JBS) and its adulterants. Materials and Methods: Standard cytochrome c oxidase subunit I (COI) gene fragments were sequenced for DNA barcoding of 39 samples from 9 snake species, including Bungarus multicinctus, the origin animal officially recognized by the Chinese Pharmacopoeia, and 8 other adulterant species. The aligned sequences, 658 base pairs in length, were analyzed for divergence using the Kimura 2-parameter (K2P) distance model with MEGA5.0. Results: The mean intraspecific K2P distance was 0.0103 and the average interspecific genetic distance was 0.2178 in B. multicinctus, far greater than the minimal interspecific genetic distance of 0.027 recommended for species identification. A neighbor-joining (NJ) tree was constructed, in which each species formed a monophyletic clade with bootstrap supports of 100%. All the data were submitted to Barcode of Life Data system version 3.0 (BOLD, http://www.barcodinglife.org) under the project title “DNA barcoding Bungarus multicinctus and its adulterants”. Ten samples of commercially available crude drugs of JBS were identified using the identification engine provided by BOLD. All the samples were clearly identified at the species level, among which five were found to be adulterants and identified as Dinodon rufozonatum. Conclusion: DNA barcoding using the standard COI gene fragments provides an effective and accurate means for JBS identification and authentication. PMID:25422545
"Crown of thorns" of Daphnia: an exceptional inducible defense discovered by DNA barcoding.
Laforsch, Christian; Haas, Andreas; Jung, Nina; Schwenk, Klaus; Tollrian, Ralph; Petrusek, Adam
2009-09-01
DNA barcoding has emerged as a valuable tool to document global biodiversity. Mitochondrial cytochrome oxidase I (COI) sequences serve as genetic markers to catalogue species richness in the animal kingdom and to identify cryptic and polymorphic animal species. Furthermore, DNA barcoding data serve as a fuel for ecological studies, as they provide the opportunity to unravel species interactions among hosts and parasites, predators and prey, and among competitors in unprecedented detail. In a recent paper we described how DNA barcoding in combination with morphological and ecological data unravelled a striking predator-prey interaction of organisms from temporary aquatic habitats, the predatory notostracan Triops and its prey, cladocerans of the Daphnia atkinsoni complex.
Ashfaq, Muhammad; Hebert, Paul D. N.; Mirza, M. Sajjad; Khan, Arif M.; Mansoor, Shahid; Shah, Ghulam S.; Zafar, Yusuf
2014-01-01
Background: Although whiteflies (Bemisia tabaci complex) are an important pest of cotton in Pakistan, their taxonomic diversity is poorly understood. As DNA barcoding is an effective tool for resolving species complexes and analyzing species distributions, we used this approach to analyze genetic diversity in the B. tabaci complex and map the distribution of B. tabaci lineages in cotton growing areas of Pakistan. Methods/Principal Findings: Sequence diversity in the DNA barcode region (mtCOI-5′) was examined in 593 whiteflies from Pakistan to determine the number of whitefly species and their distributions in the cotton-growing areas of Punjab and Sindh provinces. These new records were integrated with another 173 barcode sequences for B. tabaci, most from India, to better understand regional whitefly diversity. The Barcode Index Number (BIN) System assigned the 766 sequences to 15 BINs, including nine from Pakistan. Representative specimens of each Pakistan BIN were analyzed for mtCOI-3′ to allow their assignment to one of the putative species in the B. tabaci complex recognized on the basis of sequence variation in this gene region. This analysis revealed the presence of Asia II 1, Middle East-Asia Minor 1, Asia 1, Asia II 5, Asia II 7, and a new lineage “Pakistan”. The first two taxa were found in both Punjab and Sindh, but Asia 1 was only detected in Sindh, while Asia II 5, Asia II 7 and “Pakistan” were only present in Punjab. The haplotype networks showed that most haplotypes of Asia II 1, a species implicated in transmission of the cotton leaf curl virus, occurred in both India and Pakistan. Conclusions: DNA barcodes successfully discriminated cryptic species in the B. tabaci complex. The dominant haplotypes in the B. tabaci complex were shared by India and Pakistan. Asia II 1 was previously restricted to Punjab, but is now the dominant lineage in southern Sindh; its southward spread may have serious implications for cotton plantations in this region. PMID:25099936
Feng, Shangguo; Jiang, Yan; Wang, Shang; Jiang, Mengying; Chen, Zhe; Ying, Qicai; Wang, Huizhong
2015-09-11
The over-collection and habitat destruction of natural Dendrobium populations for their commercial medicinal value has led to these plants being under severe threat of extinction. In addition, many Dendrobium plants are similarly shaped and easily confused outside the flowering stage. In the present study, we examined the application of the ITS2 region in barcoding and phylogenetic analyses of Dendrobium species (Orchidaceae). First, for barcoding, ITS2 regions of 43 samples in Dendrobium were amplified. In combination with sequences from GenBank, the sequences were aligned using Clustal W and genetic distances were computed using MEGA V5.1. The success rate of PCR amplification and sequencing was 100%. There was a significant divergence between the inter- and intra-specific genetic distances of ITS2 regions, and the presence of a barcoding gap was obvious. Based on the BLAST1, nearest distance and TaxonGAP methods, our results showed that the ITS2 regions could successfully identify the species of most Dendrobium samples examined. Second, we used ITS2 as a DNA marker to infer phylogenetic relationships of 64 Dendrobium species. The results showed that cluster analysis using the ITS2 region mainly supported the relationship between the species of Dendrobium established by traditional morphological methods and many previous molecular analyses. To sum up, the ITS2 region can not only be used as an efficient barcode to identify Dendrobium species, but also has the potential to contribute to the phylogenetic analysis of the genus Dendrobium.
Ghahramanzadeh, R; Esselink, G; Kodde, L P; Duistermaat, H; van Valkenburg, J L C H; Marashi, S H; Smulders, M J M; van de Wiel, C C M
2013-01-01
Biological invasions are regarded as threats to global biodiversity. Among invasive aliens, a number of plant species belonging to the genera Myriophyllum, Ludwigia and Cabomba, and to the Hydrocharitaceae family pose a particular ecological threat to water bodies. Therefore, one would try to prevent them from entering a country. However, many related species are commercially traded, and distinguishing invasive from non-invasive species based on morphology alone is often difficult for plants in a vegetative stage. In this regard, DNA barcoding could become a good alternative. In this study, 242 samples belonging to 26 species from 10 genera of aquatic plants were assessed using the chloroplast loci trnH-psbA, matK and rbcL. Despite testing a large number of primer sets and several PCR protocols, the matK locus could not be amplified or sequenced reliably and therefore was left out of the analysis. Using the other two loci, eight invasive species could be distinguished from their respective related species; a ninth one failed to produce sequences of sufficient quality. Based on the criteria of universal application, high sequence divergence and level of species discrimination, the trnH-psbA noncoding spacer was the best performing barcode in the aquatic plant species studied. Thus, DNA barcoding may be helpful with enforcing a ban on trade of such invasive species, such as is already in place in the Netherlands. This will be even more the case once DNA barcoding has been turned into a routine procedure operable by a nonspecialist in botany and molecular genetics. © 2012 Blackwell Publishing Ltd.
Feng, Shangguo; Jiang, Yan; Wang, Shang; Jiang, Mengying; Chen, Zhe; Ying, Qicai; Wang, Huizhong
2015-01-01
The over-collection and habitat destruction of natural Dendrobium populations for their commercial medicinal value has led to these plants being under severe threat of extinction. In addition, many Dendrobium plants are similarly shaped and easily confused outside the flowering stage. In the present study, we examined the application of the ITS2 region in barcoding and phylogenetic analyses of Dendrobium species (Orchidaceae). First, for barcoding, ITS2 regions of 43 samples in Dendrobium were amplified. In combination with sequences from GenBank, the sequences were aligned using Clustal W and genetic distances were computed using MEGA V5.1. The success rate of PCR amplification and sequencing was 100%. There was a significant divergence between the inter- and intra-specific genetic distances of ITS2 regions, and the presence of a barcoding gap was obvious. Based on the BLAST1, nearest distance and TaxonGAP methods, our results showed that the ITS2 regions could successfully identify the species of most Dendrobium samples examined. Second, we used ITS2 as a DNA marker to infer phylogenetic relationships of 64 Dendrobium species. The results showed that cluster analysis using the ITS2 region mainly supported the relationship between the species of Dendrobium established by traditional morphological methods and many previous molecular analyses. To sum up, the ITS2 region can not only be used as an efficient barcode to identify Dendrobium species, but also has the potential to contribute to the phylogenetic analysis of the genus Dendrobium. PMID:26378526
Lee, Shih-Chieh; Wang, Chia-Hsiang; Yen, Cheng-En; Chang, Chieh
2017-04-01
The major aim of made tea identification is to identify the variety and provenance of the tea plant. The present experiment used 113 tea plants [Camellia sinensis (L.) O. Kuntze] housed at the Tea Research and Extension Substation, from which 113 internal transcribed spacer 2 (ITS2) fragments, 104 trnL intron, and 98 trnL-trnF intergenic sequence region DNA sequences were successfully sequenced. The similarity of the ITS2 nucleotide sequences between tea plants housed at the Tea Research and Extension Substation was 0.379-0.994. In this polymerase chain reaction-amplified noncoding region, no varieties possessed identical sequences. Compared with the trnL intron and trnL-trnF intergenic sequence fragments of chloroplast cpDNA, the proportion of ITS2 nucleotide sequence variation was large and is more suitable for establishing a DNA barcode database to identify tea plant varieties. After establishing the database, 30 imported teas and 35 domestic made teas were used in this model system to explore the feasibility of using ITS2 sequences to identify the varieties and provenances of made teas. A phylogenetic tree was constructed using ITS2 sequences with the unweighted pair group method with arithmetic mean, which indicated that the same variety of tea plant is likely to be successfully categorized into one cluster, but contamination from other tea plants was also detected. This result provides molecular evidence that the similarity between important tea varieties in Taiwan remains high. We suggest a direct, wide collection of made tea and original samples of tea plants to establish an ITS2 sequence molecular barcode identification database to identify the varieties and provenances of tea plants. The DNA barcode comparison method can satisfy the need for a rapid, low-cost, frontline differentiation of the large amount of made teas from Taiwan and abroad, and can provide molecular evidence of their varieties and provenances. Copyright © 2016. Published by Elsevier B.V.
Magnacca, Karl N; Brown, Mark J F
2010-06-11
The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely related species with a well-established morphological taxonomy. In this study we examine both of these issues by densely sampling the Hawaiian Hylaeus bee radiation. Individuals from 21 of the 49 a priori morphologically defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms, and the resulting increase in time and labor required, makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna.
2010-01-01
Background The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely-related species with a well-established morphological taxonomy. In this study we examine both of these issues, by densely sampling the Hawaiian Hylaeus bee radiation. Results Individuals from 21 of the 49 a priori morphologically-defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Conclusions Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms, and the resulting increase in time and labor required, makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna.
PMID:20540728
Krosch, Matt N; Strutt, Francesca; Blacket, Mark J; Batovska, Jana; Starkie, Melissa; Clarke, Anthony R; Cameron, Stephen L; Schutze, Mark K
2018-06-06
Accurate species-level identifications underpin many aspects of basic and applied biology; however, identifications can be hampered by a lack of discriminating morphological characters, taxonomic expertise or time. Molecular approaches, such as DNA 'barcoding' of the cytochrome c oxidase (COI) gene, are argued to overcome these issues. However, nuclear encoding of mitochondrial genes (numts) and poor amplification success of suboptimally preserved specimens can lead to erroneous identifications. One insect group for which these molecular and morphological problems are significant is the dacine fruit flies (Diptera: Tephritidae: Dacini). We addressed these issues associated with COI barcoding in the dacines by first assessing several 'universal' COI primers against public mitochondrial genome and numt sequences for dacine taxa. We then modified a set of four primers that more closely matched true dacine COI sequence and amplified two overlapping portions of the COI barcode region. Our new primers were tested alongside universal primers on a selection of dacine species, including both freshly preserved and decades-old dry specimens. Additionally, Bactrocera tryoni mitochondrial and nuclear genomes were compared to identify putative numts. Four numt clades were identified, three of which were amplified using existing universal primers. In contrast, our new primers preferentially amplified the 'true' mitochondrial COI barcode in all dacine species tested. The new primers also successfully amplified partial barcodes from dry specimens for which full length barcodes were unobtainable. Thus we recommend these new primers be incorporated into the suites of primers used by diagnosticians and quarantine labs for the accurate identification of dacine species. This article is protected by copyright. All rights reserved.
Cervantes, Fernando A; Arcangeli, Jésica; Hortelano-Moncada, Yolanda; Borisenko, Alex V
2010-12-01
Two morphologically similar species of opossum from the genus Didelphis, Didelphis virginiana and Didelphis marsupialis, co-occur sympatrically in Mexico. High intraspecific variation complicates their morphological discrimination, under both field and museum conditions. This study aims to evaluate the utility and reliability of using DNA barcodes (short standardized genome fragments used for DNA-based identification) to distinguish these two species. Sequences of the cytochrome c oxidase subunit I (Cox1) mitochondrial gene were obtained from 12 D. marsupialis and 29 D. virginiana individuals and were compared using the neighbor-joining (NJ) algorithm with Kimura's two-parameter (K2P) model of nucleotide substitution. Average K2P distances were 1.56% within D. virginiana and 1.65% within D. marsupialis. Interspecific distances between D. virginiana and D. marsupialis varied from 7.8 to 9.3% and their barcode sequences formed distinct non-overlapping clusters on NJ trees. All sympatric specimens of both species were effectively discriminated, confirming the utility of Cox1 barcoding as a tool for taxonomic identification of these morphologically similar taxa.
Enan, M R; Ahamed, A
2014-02-14
The cultivated date palm is the most agriculturally important species of the Arecaceae family. The standard chloroplast DNA barcode for land plants recommended by the Consortium for the Barcode of Life plant working group needs to be evaluated for a wide range of plant species. Therefore, we assessed the potential of the matK and rpoC1 markers for the authentication of date cultivars. There is not one universal method to authenticate date cultivars. In this study, 11 different date cultivars were sequenced and analyzed for matK and rpoC1 genes by using bioinformatic tools to establish a cultivar-specific molecular monogram. The chloroplast matK marker was more informative than the rpoC1 chloroplast DNA markers. Phylogenetic trees were constructed on the basis of the matK and rpoC1 sequences, and the results suggested that matK alone or in combination with rpoC1 can be used for determining the levels of genetic variation and for barcoding.
Saarela, Jeffery M.; Sokoloff, Paul C.; Gillespie, Lynn J.; Consaul, Laurie L.; Bull, Roger D.
2013-01-01
Accurate identification of Arctic plant species is critical for understanding potential climate-induced changes in their diversity and distributions. To facilitate rapid identification we generated DNA barcodes for the core plastid barcode loci (rbcL and matK) for 490 vascular plant species, representing nearly half of the Canadian Arctic flora and 93% of the flora of the Canadian Arctic Archipelago. Sequence recovery was higher for rbcL than matK (93% and 81%), and rbcL was easier to recover than matK from herbarium specimens (92% and 77%). Distance-based and sequence-similarity analyses of combined rbcL + matK data discriminate 97% of genera, 56% of species, and 7% of infraspecific taxa. There is a significant negative correlation between the number of species sampled per genus and the percent species resolution per genus. We characterize barcode variation in detail in the ten largest genera sampled (Carex, Draba, Festuca, Pedicularis, Poa, Potentilla, Puccinellia, Ranunculus, Salix, and Saxifraga) in the context of their phylogenetic relationships and taxonomy. Discrimination with the core barcode loci in these genera ranges from 0% in Salix to 85% in Carex. Haplotype variation in multiple genera does not correspond to species boundaries, including Taraxacum, in which the distribution of plastid haplotypes among Arctic species is consistent with plastid variation documented in non-Arctic species. Introgression of Poa glauca plastid DNA into multiple individuals of P. hartzii is problematic for identification of these species with DNA barcodes. Of three supplementary barcode loci (psbA–trnH, psbK–psbI, atpF–atpH) collected for a subset of Poa and Puccinellia species, only atpF–atpH improved discrimination in Puccinellia, compared with rbcL and matK. Variation in matK in Vaccinium uliginosum and rbcL in Saxifraga oppositifolia corresponds to variation in other loci used to characterize the phylogeographic histories of these Arctic-alpine species. PMID:24348895
How effective are DNA barcodes in the identification of African rainforest trees?
Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W; Kenfack, David; Chuyong, George B; Cruaud, Corinne; Hardy, Olivier J
2013-01-01
DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95-100% success), but less for species identification (71-88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84-90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications.
How Effective Are DNA Barcodes in the Identification of African Rainforest Trees?
Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W.; Kenfack, David; Chuyong, George B.; Cruaud, Corinne; Hardy, Olivier J.
2013-01-01
Background DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. Methodology/Principal Findings We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95–100% success), but less for species identification (71–88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84–90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Conclusions/Significance Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications. PMID:23565134
Yang, Yichao; Ricke, Steven C.; Tellez, Guillermo; Kwon, Young Min
2017-01-01
Salmonella is an important foodborne bacterial pathogen; however, a fundamental understanding of Salmonella transmission routes within a poultry flock is still lacking. In this study, a series of barcode-tagged strains were constructed by inserting six random nucleotides into a functionally neutral region on the chromosome of S. Enteritidis as a tool for quantitative tracking of Salmonella transmission in chickens. Six distinct barcode-tagged strains were used for infection or contamination at either low dose (10^3 CFUs; three strains) or high dose (10^5 CFUs; three strains) in three independent experiments (Experiment 1 oral gavage; Experiment 2 contaminated feed; Experiment 3 contaminated water). For all chick experiments, cecal and foot-wash samples were collected from a subset of the chickens at days 7 and/or 14, from which genomic DNA was extracted and used to amplify the barcode regions. After the resulting PCR amplicons were pooled and analyzed by MiSeq sequencing, a total of approximately 1.5 million reads containing the barcode sequences were analyzed to determine the relative frequency of every barcode-tagged strain in each sample. In Experiment 1, the high dose of oral infection was correlated with greater dominance of the strains in the ceca of the respective seeder chickens and also in the contact chickens, although to a lesser degree. When chicks were exposed to contaminated feed (Experiment 2) or water (Experiment 3), there were no clear patterns of the barcode-tagged strains in relation to the dosage, except that the strains introduced at low dose required a longer time to colonize the ceca with contaminated feed. Most foot-wash samples contained only one to three strains, suggesting the potential existence of an unknown mechanism(s) for strain exclusion. These results demonstrated the proof of concept of using barcode-tagged strains to investigate transmission dynamics of Salmonella in chickens in a quantitative manner. PMID:28261587
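The per-sample relative frequency of each barcode-tagged strain, as described in this abstract, can be illustrated with a short counting routine. This is a sketch under stated assumptions and not the authors' pipeline. It assumes each read carries the six-nucleotide barcode immediately after a constant upstream flank. The flank `GACC`, the barcode sequences, and the function name `barcode_frequencies` are all hypothetical.

```python
from collections import Counter


def barcode_frequencies(reads, barcodes, upstream, barcode_len=6):
    """Return the relative frequency of each known barcode among the reads.

    A read is counted only if it contains the upstream flank and the bases
    right after the flank match one of the known barcodes exactly.
    """
    known = set(barcodes)
    counts = Counter()
    for read in reads:
        pos = read.find(upstream)
        if pos == -1:
            continue  # flank not found, read is not informative
        start = pos + len(upstream)
        tag = read[start:start + barcode_len]
        if tag in known:
            counts[tag] += 1

    total = sum(counts.values())
    return {bc: (counts[bc] / total if total else 0.0) for bc in barcodes}


# Hypothetical toy data: two known strains, five reads.
reads = [
    "GACCAACGTTAA",  # strain 1
    "GACCAACGTTCC",  # strain 1
    "GACCTTGCAAGG",  # strain 2
    "GACCGGGGGGAA",  # unknown barcode, ignored
    "AAAAAAAAAAAA",  # no flank, ignored
]
freqs = barcode_frequencies(reads, ["AACGTT", "TTGCAA"], upstream="GACC")
```

Exact matching keeps the sketch simple. A real analysis would likely also need to handle sequencing errors in the barcode or flank, which this example ignores.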
Saarela, Jeffery M; Sokoloff, Paul C; Gillespie, Lynn J; Consaul, Laurie L; Bull, Roger D
2013-01-01
Accurate identification of Arctic plant species is critical for understanding potential climate-induced changes in their diversity and distributions. To facilitate rapid identification we generated DNA barcodes for the core plastid barcode loci (rbcL and matK) for 490 vascular plant species, representing nearly half of the Canadian Arctic flora and 93% of the flora of the Canadian Arctic Archipelago. Sequence recovery was higher for rbcL than matK (93% and 81%), and rbcL was easier to recover than matK from herbarium specimens (92% and 77%). Distance-based and sequence-similarity analyses of combined rbcL + matK data discriminate 97% of genera, 56% of species, and 7% of infraspecific taxa. There is a significant negative correlation between the number of species sampled per genus and the percent species resolution per genus. We characterize barcode variation in detail in the ten largest genera sampled (Carex, Draba, Festuca, Pedicularis, Poa, Potentilla, Puccinellia, Ranunculus, Salix, and Saxifraga) in the context of their phylogenetic relationships and taxonomy. Discrimination with the core barcode loci in these genera ranges from 0% in Salix to 85% in Carex. Haplotype variation in multiple genera does not correspond to species boundaries, including Taraxacum, in which the distribution of plastid haplotypes among Arctic species is consistent with plastid variation documented in non-Arctic species. Introgression of Poa glauca plastid DNA into multiple individuals of P. hartzii is problematic for identification of these species with DNA barcodes. Of three supplementary barcode loci (psbA-trnH, psbK-psbI, atpF-atpH) collected for a subset of Poa and Puccinellia species, only atpF-atpH improved discrimination in Puccinellia, compared with rbcL and matK. Variation in matK in Vaccinium uliginosum and rbcL in Saxifraga oppositifolia corresponds to variation in other loci used to characterize the phylogeographic histories of these Arctic-alpine species.
The seven deadly sins of DNA barcoding.
Collins, R A; Cruickshank, R H
2013-11-01
Despite the broad benefits that DNA barcoding can bring to a diverse range of biological disciplines, a number of shortcomings still exist in terms of the experimental design of studies incorporating this approach. One underlying reason for this lies in the confusion that often exists between species discovery and specimen identification, and this is reflected in the way that hypotheses are generated and tested. Although these aims can be associated, they are quite distinct and require different methodological approaches, but their conflation has led to the frequently inappropriate use of commonly used analytical methods such as neighbour-joining trees, bootstrap resampling and fixed distance thresholds. Furthermore, the misidentification of voucher specimens can also have serious implications for end users of reference libraries such as the Barcode of Life Data Systems, and in this regard we advocate increased diligence in the a priori identification of specimens to be used for this purpose. This commentary provides an assessment of seven deficiencies that we identify as common in the DNA barcoding literature, and outlines some potential improvements for its adaptation and adoption towards more reliable and accurate outcomes. © 2012 John Wiley & Sons Ltd.
Nagy, Zoltán T; Sonet, Gontran; Glaw, Frank; Vences, Miguel
2012-01-01
DNA barcoding of non-avian reptiles based on the cytochrome oxidase subunit I (COI) gene is still in a very early stage, mainly due to technical problems. Using a newly developed set of reptile-specific primers for COI we present the first comprehensive study targeting the entire reptile fauna of the fourth-largest island in the world, the biodiversity hotspot of Madagascar. Representatives of the majority of Madagascan non-avian reptile species (including Squamata and Testudines) were sampled and successfully DNA barcoded. The new primer pair achieved a constantly high success rate (72.7-100%) for most squamates. More than 250 species of reptiles (out of the 393 described ones; representing around 64% of the known diversity of species) were barcoded. The average interspecific genetic distance within families ranged from a low of 13.4% in the Boidae to a high of 29.8% in the Gekkonidae. Using the average genetic divergence between sister species as a threshold, 41-48 new candidate (undescribed) species were identified. Simulations were used to evaluate the performance of DNA barcoding as a function of completeness of taxon sampling and fragment length. Compared with available multi-gene phylogenies, DNA barcoding correctly assigned most samples to species, genus and family with high confidence and the analysis of fewer taxa resulted in an increased number of well supported lineages. Shorter marker-lengths generally decreased the number of well supported nodes, but even mini-barcodes of 100 bp correctly assigned many samples to genus and family. The new protocols might help to promote DNA barcoding of reptiles and the established library of reference DNA barcodes will facilitate the molecular identification of Madagascan reptiles. Our results might be useful to easily recognize undescribed diversity (i.e. novel taxa), to resolve taxonomic problems, and to monitor the international pet trade without specialized expert knowledge.
Nagy, Zoltán T.; Sonet, Gontran; Glaw, Frank; Vences, Miguel
2012-01-01
Background DNA barcoding of non-avian reptiles based on the cytochrome oxidase subunit I (COI) gene is still in a very early stage, mainly due to technical problems. Using a newly developed set of reptile-specific primers for COI we present the first comprehensive study targeting the entire reptile fauna of the fourth-largest island in the world, the biodiversity hotspot of Madagascar. Methodology/Principal Findings Representatives of the majority of Madagascan non-avian reptile species (including Squamata and Testudines) were sampled and successfully DNA barcoded. The new primer pair achieved a constantly high success rate (72.7–100%) for most squamates. More than 250 species of reptiles (out of the 393 described ones; representing around 64% of the known diversity of species) were barcoded. The average interspecific genetic distance within families ranged from a low of 13.4% in the Boidae to a high of 29.8% in the Gekkonidae. Using the average genetic divergence between sister species as a threshold, 41–48 new candidate (undescribed) species were identified. Simulations were used to evaluate the performance of DNA barcoding as a function of completeness of taxon sampling and fragment length. Compared with available multi-gene phylogenies, DNA barcoding correctly assigned most samples to species, genus and family with high confidence and the analysis of fewer taxa resulted in an increased number of well supported lineages. Shorter marker-lengths generally decreased the number of well supported nodes, but even mini-barcodes of 100 bp correctly assigned many samples to genus and family. Conclusions/Significance The new protocols might help to promote DNA barcoding of reptiles and the established library of reference DNA barcodes will facilitate the molecular identification of Madagascan reptiles. Our results might be useful to easily recognize undescribed diversity (i.e. novel taxa), to resolve taxonomic problems, and to monitor the international pet trade without specialized expert knowledge. PMID:22479636
Mat Jaafar, Tun Nurul Aimi; Taylor, Martin I.; Mohd Nor, Siti Azizah; de Bruyn, Mark; Carvalho, Gary R.
2012-01-01
Background DNA barcodes, typically focusing on the cytochrome oxidase I gene (COI) in many animals, have been used widely as a species-identification tool. The ability of DNA barcoding to distinguish species from a range of taxa and to reveal cryptic species has been well documented. Despite the wealth of DNA barcode data for fish from many temperate regions, there are relatively few available from the Southeast Asian region. Here, we target the marine fish Family Carangidae, one of the most commercially-important families from the Indo-Malay Archipelago (IMA), to produce an initial reference DNA barcode library. Methodology/Principal Findings Here, a 652 bp region of COI was sequenced for 723 individuals from 36 putative species of Family Carangidae distributed within IMA waters. Within the newly-generated dataset, three described species exhibited conspecific divergences up to ten times greater (4.32–4.82%) than mean estimates (0.24–0.39%), indicating a discrepancy with assigned morphological taxonomic identification, and the existence of cryptic species. Variability of the mitochondrial DNA COI region was compared within and among species to evaluate the COI region's suitability for species identification. The trend in range of mean K2P distances observed was generally in accordance with expectations based on taxonomic hierarchy: 0% to 4.82% between individuals within species, 0% to 16.4% between species within genera, and 8.64% to 25.39% between genera within families. The average Kimura 2-parameter (K2P) distance between individuals, between species within genera, and between genera within family were 0.37%, 10.53% and 16.56%, respectively. All described species formed monophyletic clusters in the Neighbour-joining phylogenetic tree, although three species representing complexes of six potential cryptic species were detected in Indo-Malay Carangidae; Atule mate, Selar crumenophthalmus and Seriolina nigrofasciata. Conclusion/Significance This study confirms that COI is an effective tool for species identification of Carangidae from the IMA. There were moderate levels of cryptic diversity among putative species within the central IMA. However, to explain the hypothesis of species richness in the IMA, it is necessary to sample the whole family across their broad geographic range. Such insights are helpful not only to document mechanisms driving diversification and recruitment in Carangidae, but also to provide a scientific framework for management strategies and conservation of commercially-important fisheries resources. PMID:23209586
DNA Barcoding of Freshwater Fishes of Indo-Myanmar Biodiversity Hotspot.
Barman, Anindya Sundar; Singh, Mamta; Singh, Soibam Khogen; Saha, Himadri; Singh, Yumlembam Jackie; Laishram, Martina; Pandey, Pramod Kumar
2018-06-05
Developing an effective conservation and management strategy requires assessing the biodiversity status of an ecosystem, especially in the Indo-Myanmar biodiversity hotspot. This assessment is all the more important because the hotspot is an area of high endemism that is under continuous threat. The present study was therefore conceived as a molecular assessment of the fish fauna of the Indo-Myanmar region, covering the Indian states of Manipur, Meghalaya, Mizoram, and Nagaland. A total of 363 specimens, representing 109 species, were collected and barcoded from the different rivers and their tributaries of the region. The analyses performed in the present study, i.e. Kimura 2-Parameter genetic divergence, Neighbor-Joining, Automated Barcode Gap Discovery and Bayesian Poisson Tree Processes, suggest that DNA barcoding is an efficient and reliable tool for species identification. Most of the species were clearly delineated. However, the overlap of intra-specific and inter-specific genetic distances in a few species revealed the existence of putative cryptic species. The reliable DNA barcode reference library established in our study provides an adequate knowledge base for non-taxonomists, researchers, biodiversity managers and policy makers in sketching effective conservation measures for this ecosystem.
Péterfia, Bálint; Kalmár, Alexandra; Patai, Árpád V; Csabai, István; Bodor, András; Micsik, Tamás; Wichmann, Barnabás; Egedi, Krisztina; Hollósi, Péter; Kovalszky, Ilona; Tulassay, Zsolt; Molnár, Béla
2017-01-01
Background: To support cancer therapy, development of low cost library preparation techniques for targeted next generation sequencing (NGS) is needed. In this study we designed and tested a PCR-based library preparation panel with limited target area for sequencing the top 12 somatic mutation hot spots in colorectal cancer on the GS Junior instrument. Materials and Methods: A multiplex PCR panel was designed to amplify regions of mutation hot spots in 12 selected genes (APC, BRAF, CTNNB1, EGFR, FBXW7, KRAS, NRAS, MSH6, PIK3CA, SMAD2, SMAD4, TP53). Amplicons were sequenced on a GS Junior instrument using ligated and barcoded adaptors. Eight samples were sequenced in a single run. Colonic DNA samples (8 normal mucosa; 33 adenomas; 17 adenocarcinomas) as well as HT-29 and Caco-2 cell lines with known mutation profiles were analyzed. Variants found by the panel on APC, BRAF, KRAS and NRAS genes were validated by conventional sequencing. Results: In total, 34 kinds of mutations were detected including two novel mutations (FBXW7 c.1740:C>G and SMAD4 c.413C>G) that have not been recorded in mutation databases, and one potential germline mutation (APC). The most frequently mutated genes were APC, TP53 and KRAS with 30%, 15% and 21% frequencies in adenomas and 29%, 53% and 29% frequencies in carcinomas, respectively. In cell lines, all the expected mutations were detected except for one located in a homopolymer region. According to re-sequencing results, sensitivity and specificity were 100% and 92%, respectively. Conclusions: Our NGS-based screening panel denotes a promising step towards low cost colorectal cancer genotyping on the GS Junior instrument. Despite the relatively low coverage, we discovered two novel mutations and obtained mutation frequencies comparable to literature data. Additionally, as an advantage, this panel requires less template DNA than sequence capture colon cancer panels currently available for the GS Junior instrument.
Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael
2011-02-09
DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.
Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael
2011-01-01
Background DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. Methodology/Principal Findings The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. Conclusion/Significance In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species. PMID:21347370
DNA barcoding and the identification of tree frogs (Amphibia: Anura: Rhacophoridae).
Dang, Ning-Xin; Sun, Feng-Hui; Lv, Yun-Yun; Zhao, Bo-Han; Wang, Ji-Chao; Murphy, Robert W; Wang, Wen-Zhi; Li, Jia-Tang
2016-07-01
The DNA barcoding gene COI (cytochrome c oxidase subunit I) effectively identifies many species. Herein, we barcoded 172 individuals from 37 species belonging to nine genera in Rhacophoridae to test whether the gene serves equally well to identify species of tree frogs. Phenetic neighbor joining and phylogenetic Bayesian inference were used to construct phylogenetic trees, which resolved all nine genera as monophyletic taxa except for Rhacophorus, two new matrilines for Liuixalus, and the Polypedates leucomystax species complex. Intraspecific genetic distances ranged from 0.000 to 0.119 and interspecific genetic distances ranged from 0.015 to 0.334. Within Rhacophorus and Kurixalus, the intra- and interspecific genetic distances did not reveal an obvious barcode gap. Nevertheless, we found that COI sequences unambiguously identified rhacophorid species and helped to discover likely new cryptic species via the synthesis of genealogical relationships and divergence patterns. Our results support COI as an effective DNA barcoding marker for Rhacophoridae.
Moftah, Marie; Abdel Aziz, Sayeda H.; Elramah, Sara; Favereaux, Alexandre
2011-01-01
The identification of species constitutes the first basic step in phylogenetic studies, biodiversity monitoring and conservation. DNA barcoding, i.e. the sequencing of a short standardized region of DNA, has been proposed as a new tool for animal species identification. The present study provides an update on the composition of shark species in the Egyptian Mediterranean waters off Alexandria, since the latest study to date was performed 30 years ago. DNA barcoding was used in addition to classical taxonomical methodologies. Thus, 51 specimens were DNA barcoded for a 667 bp region of the mitochondrial COI gene. Although DNA barcoding aims at developing species identification systems, some phylogenetic signals were apparent in the data. In the neighbor-joining tree, 8 major clusters were apparent, each of them containing individuals belonging to the same species, and most with 100% bootstrap value. This study is the first to our knowledge to use DNA barcoding of the mitochondrial COI gene in order to confirm the presence of the species Squalus acanthias, Oxynotus centrina, Squatina squatina, Scyliorhinus canicula, Scyliorhinus stellaris, Mustelus mustelus, Mustelus punctulatus and Carcharhinus altimus in the Egyptian Mediterranean waters. Finally, our study is the starting point of a new barcoding database concerning shark composition in the Egyptian Mediterranean waters (Barcoding of Egyptian Mediterranean Sharks [BEMS], http://www.boldsystems.org/views/projectlist.php?Barcoding%20Fish%20%28FishBOL%29). PMID:22087242
Laskar, Boni A.; Bhattacharjee, Maloyjo J.; Dhar, Bishal; Mahadani, Pradosh; Kundu, Shantanu; Ghosh, Sankar K.
2013-01-01
Background: The taxonomic validity of Northeast Indian endemic Mahseer species, Tor progeneius and Neolissochilus hexastichus, has been argued repeatedly. This is mainly due to disagreements in recognizing the species based on morphological characters. Consequently, both the species have been concealed for many decades. DNA barcoding has become a promising and independent technique for accurate species-level identification. Therefore, utilization of such a technique in association with the traditional morphotaxonomic description can resolve the species dilemma of this important group of sport fishes. Methodology/Principal Findings: Altogether, 28 mahseer specimens including paratypes were studied from different locations in Northeast India, and 24 morphometric characters were measured invariably. The Principal Component Analysis with morphometric data revealed five distinct groups of samples that were taxonomically categorized into 4 species, viz., Tor putitora, T. progeneius, Neolissochilus hexagonolepis and N. hexastichus. Analysis with a dataset of 76 DNA barcode sequences of different mahseer species showed that the queries of T. putitora and N. hexagonolepis clustered cohesively with the respective conspecific database sequences, maintaining 0.8% maximum K2P divergence. The closest congeneric divergence was 3 times higher than the mean conspecific divergence and was considered as the barcode gap. The maximum divergence among the samples of T. progeneius and T. putitora was 0.8%, which was much below the barcode gap, indicating that they are synonymous. The query sequences of N. hexastichus invariably formed a discrete and congeneric clade with the database sequences and maintained the interspecific divergence that supported its distinct species status. Notably, N. hexastichus was encountered in a single site and seemed to be under threat. Conclusion: This study substantiated the identification of N. hexastichus as a true species, and tentatively regarded T. progeneius as a synonym of T. putitora. It would guide conservationists to initiate priority conservation of N. hexastichus and T. putitora. PMID:23341979
Bahouth, Suleiman W; Nooh, Mohammed M
2017-08-01
Proper signaling by G protein-coupled receptors (GPCR) is dependent on the specific repertoire of transducing, enzymatic and regulatory kinases and phosphatases that shape its signaling output. Activation and signaling of the GPCR through its cognate G protein is impacted by G protein-coupled receptor kinase (GRK)-imprinted "barcodes" that recruit β-arrestins to regulate subsequent desensitization, biased signaling and endocytosis of the GPCR. The outcome of agonist-internalized GPCR in endosomes is also regulated by sequence motifs or "barcodes" within the GPCR that mediate its recycling to the plasma membrane or retention and eventual degradation as well as its subsequent signaling in endosomes. Given the vast number of diverse sequences in GPCR, several trafficking mechanisms for endosomal GPCR have been described. The majority of recycling GPCR are sorted out of endosomes in a "sequence-dependent pathway" anchored around a type-1 PDZ-binding module found in their C-tails. For a subset of these GPCR, a second "barcode" imprinted onto specific GPCR serine/threonine residues by compartmentalized kinase networks was required for their efficient recycling through the "sequence-dependent pathway". Mutating the serine/threonine residues involved produced dramatic effects on GPCR trafficking, indicating that they played a major role in setting the trafficking itinerary of these GPCR. While endosomal SNX27, retromer/WASH complexes and actin were required for efficient sorting and budding of all these GPCR, additional proteins were required for GPCR sorting via the second "barcode". Here we will review recent developments in GPCR trafficking in general and the human β1-adrenergic receptor in particular across the various trafficking roadmaps. In addition, we will discuss the role of GPCR trafficking in regulating endosomal GPCR signaling, which promotes biochemical and physiological effects that are distinct from those generated by the GPCR signal transduction pathway in membranes. Copyright © 2017. Published by Elsevier Inc.
BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.
Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano
2015-07-01
Substantial advances in microbiology, molecular evolution and biodiversity have been made in recent years thanks to Metagenomics, which makes it possible to unveil the composition and functions of mixed microbial communities in any environmental niche. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred to as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. Access to proper computational systems for the large-scale bioinformatic analysis of HTS data currently represents one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities with a completely automated workflow that covers all the fundamental steps required in an appropriately designed Meta-barcoding HTS-based experiment, from raw sequence data upload and cleaning to final taxonomic identification. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented as a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills. A comparative benchmark, carried out using a simulated dataset designed to broadly represent the currently known bacterial and fungal world, showed that BioMaS outperforms QIIME and MOTHUR in terms of extent and accuracy of deep taxonomic sequence assignments.
The Trichoptera barcode initiative: a strategy for generating a species-level Tree of Life.
Zhou, Xin; Frandsen, Paul B; Holzenthal, Ralph W; Beet, Clare R; Bennett, Kristi R; Blahnik, Roger J; Bonada, Núria; Cartwright, David; Chuluunbat, Suvdtsetseg; Cocks, Graeme V; Collins, Gemma E; deWaard, Jeremy; Dean, John; Flint, Oliver S; Hausmann, Axel; Hendrich, Lars; Hess, Monika; Hogg, Ian D; Kondratieff, Boris C; Malicky, Hans; Milton, Megan A; Morinière, Jérôme; Morse, John C; Mwangi, François Ngera; Pauls, Steffen U; Gonzalez, María Razo; Rinne, Aki; Robinson, Jason L; Salokannel, Juha; Shackleton, Michael; Smith, Brian; Stamatakis, Alexandros; StClair, Ros; Thomas, Jessica A; Zamora-Muñoz, Carmen; Ziesmann, Tanja; Kjer, Karl M
2016-09-05
DNA barcoding was intended as a means to provide species-level identifications through associating DNA sequences from unknown specimens to those from curated reference specimens. Although barcodes were not designed for phylogenetics, they can be beneficial to the completion of the Tree of Life. The barcode database for Trichoptera is relatively comprehensive, with data from every family, approximately two-thirds of the genera, and one-third of the described species. Most Trichoptera, as with most of life's species, have never been subjected to any formal phylogenetic analysis. Here, we present a phylogeny with over 16 000 unique haplotypes as a working hypothesis that can be updated as our estimates improve. We suggest a strategy of implementing constrained tree searches, which allow larger datasets to dictate the backbone phylogeny, while the barcode data fill out the tips of the tree. We also discuss how this phylogeny could be used to focus taxonomic attention on ambiguous species boundaries and hidden biodiversity. We suggest that systematists continue to differentiate between 'Barcode Index Numbers' (BINs) and 'species' that have been formally described. Each has utility, but they are not synonyms. We highlight examples of integrative taxonomy, using both barcodes and morphology for species description. This article is part of the themed issue 'From DNA barcodes to biomes'. © 2016 The Authors.
Chen, Jin-Jin; Zhao, Qing-Sheng; Liu, Yi-Lan; Zha, Sheng-Hua; Zhao, Bing
2015-09-01
Maca (Lepidium meyenii) is an herbaceous plant that grows in high plateaus and has been used as both food and folk medicine for centuries because of its benefits to human health. In the present study, ITS (internal transcribed spacer) sequences of forty-three maca samples, collected from different regions or vendors, were amplified and analyzed. The ITS sequences of nineteen potential adulterants of maca were also collected and analyzed. The results indicated that the ITS sequence of maca was consistent in all samples and unique when compared with its adulterants. Therefore, this DNA-barcoding approach based on the ITS sequence can be used for the molecular identification of maca and its adulterants. Copyright © 2015 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Identification of Belgian mosquito species (Diptera: Culicidae) by DNA barcoding.
Versteirt, V; Nagy, Z T; Roelants, P; Denis, L; Breman, F C; Damiens, D; Dekoninck, W; Backeljau, T; Coosemans, M; Van Bortel, W
2015-03-01
Since its introduction in 2003, DNA barcoding has proven to be a promising method for the identification of many taxa, including mosquitoes (Diptera: Culicidae). Many mosquito species are potential vectors of pathogens, and correct identification in all life stages is essential for effective mosquito monitoring and control. To use DNA barcoding for species identification, a reliable and comprehensive reference database of verified DNA sequences is required. Hence, DNA sequence diversity of mosquitoes in Belgium was assessed using a 658 bp fragment of the mitochondrial cytochrome oxidase I (COI) gene, and a reference data set was established. Most species appeared as well-supported clusters. Intraspecific Kimura 2-parameter (K2P) distances averaged 0.7%, and the maximum observed K2P distance was 6.2% for Aedes koreicus. A small overlap between intra- and interspecific K2P distances for congeneric sequences was observed. Overall, the identification success using best match and the best close match criteria were high, that is above 98%. No clear genetic division was found between the closely related species Aedes annulipes and Aedes cantans, which can be confused using morphological identification only. The members of the Anopheles maculipennis complex, that is Anopheles maculipennis s.s. and An. messeae, were weakly supported as monophyletic taxa. This study showed that DNA barcoding offers a reliable framework for mosquito species identification in Belgium except for some closely related species. © 2014 John Wiley & Sons Ltd.
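The Kimura 2-parameter (K2P) distance reported in several of these abstracts can be illustrated with a short sketch. The following is a minimal Python example, not the method of any particular study listed here; the function name k2p_distance and the choice to skip sites containing gaps or ambiguous bases are assumptions made for illustration. It uses the standard K2P formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper: sites where either sequence has a gap or an
    ambiguous base are skipped (an assumption, not a universal rule).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

For real barcode data sets, an established package such as MEGA (mentioned elsewhere in this listing) would normally be used; the sketch only makes the arithmetic behind a reported K2P percentage explicit.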
Wang, Xu; Qiu, Yue; Wei, Cong
2016-03-02
One new species of the genus Hyalessa China, H. wangi sp. nov., from Yunnan, China is described. Partial mitochondrial COI gene (DNA barcoding) of this new species is sequenced and uploaded to GenBank. A key to all species of Hyalessa is provided.
[Identification of pyrrosiae folium and its adulterants based on psbA-trnH sequence].
Zhang, Ya-Qin; Shi, Yue; Song, Ming; Lin, Yun-Han; Ma, Xiao-Xi; Sun, Wei; Xiang, Li; Liu, Xi
2014-06-01
In this study, the psbA-trnH sequence was used as a DNA barcode to evaluate the accuracy and stability of identifying the pteridophyte medicinal material Pyrrosiae Folium and distinguishing it from its adulterants. Genomic DNA was extracted successfully from 106 samples. The Kimura 2-Parameter (K2P) distances and ML tree were calculated using the software MEGA 6.0. The intra-specific genetic distances of the 3 original plants were lower than the inter-specific genetic distances of the adulterants. The ML tree indicated that Pyrrosiae Folium can be clearly distinguished from its adulterants. Therefore, the psbA-trnH sequence, as a barcode for pteridophytes, can accurately and stably distinguish Pyrrosiae Folium from its adulterants.
A checklist of the bats of Peninsular Malaysia and progress towards a DNA barcode reference library
Ramli, Rosli; Bhassu, Subha
2017-01-01
Several published checklists of bat species have covered Peninsular Malaysia as part of a broader region and/or in combination with other mammal groups. Other researchers have produced comprehensive checklists for specific localities within the peninsula. To our knowledge, a comprehensive checklist of bats specifically for the entire geopolitical region of Peninsular Malaysia has never been published, yet knowing which species are present in Peninsular Malaysia and their distributions across the region is crucial in developing suitable conservation plans. Our literature search revealed that 110 bat species have been documented in Peninsular Malaysia; 105 species have precise locality records while five species lack recent and/or precise locality records. We retrieved 18 species from records dated before the year 2000 and seven species have only ever been recorded once. Our search of Barcode of Life Datasystems (BOLD) found that 86 (of the 110) species have public records of which 48 species have public DNA barcodes available from bats sampled in Peninsular Malaysia. Based on Neighbour-Joining tree analyses and the allocation of DNA barcodes to Barcode Index Number system (BINs) by BOLD, several DNA barcodes recorded under the same species name are likely to represent distinct taxa. We discuss these cases in detail and highlight the importance of further surveys to determine the occurrences and resolve the taxonomy of particular bat species in Peninsular Malaysia, with implications for conservation priorities. PMID:28742835
Mapping global biodiversity connections with DNA barcodes: Lepidoptera of Pakistan.
Ashfaq, Muhammad; Akhtar, Saleem; Rafi, Muhammad Athar; Mansoor, Shahid; Hebert, Paul D N
2017-01-01
Sequences from the DNA barcode region of the mitochondrial COI gene are an effective tool for specimen identification and for the discovery of new species. The Barcode of Life Data Systems (BOLD) (www.boldsystems.org) currently hosts 4.5 million records from animals which have been assigned to more than 490,000 different Barcode Index Numbers (BINs), which serve as a proxy for species. Because a fourth of these BINs derive from Lepidoptera, BOLD has a strong capability to both identify specimens in this order and to support studies of faunal overlap. DNA barcode sequences were obtained from 4503 moths from 329 sites across Pakistan, specimens that represented 981 BINs from 52 families. Among 379 species with a Linnaean name assignment, all were represented by a single BIN excepting five species that showed a BIN split. Less than half (44%) of the 981 BINs had counterparts in other countries; the remaining BINs were unique to Pakistan. Another 218 BINs of Lepidoptera from Pakistan were coupled with the 981 from this study before being compared with all 116,768 BINs for this order. As expected, faunal overlap was highest with India (21%), Sri Lanka (21%), United Arab Emirates (20%) and with other Asian nations (2.1%), but it was very low with other continents including Africa (0.6%), Europe (1.3%), Australia (0.6%), Oceania (1.0%), North America (0.1%), and South America (0.1%). This study indicates the way in which DNA barcoding facilitates measures of faunal overlap even when taxa have not been assigned to a Linnean species.
Tian, Qian; Zhao, Wenjun; Lu, Songyu; Zhu, Shuifang; Li, Shidong
2016-01-01
Genus Xanthomonas comprises many economically important plant pathogens that affect a wide range of hosts. Indeed, fourteen Xanthomonas species/pathovars have been regarded as official quarantine bacteria for imports in China. To date, however, a rapid and accurate method capable of identifying all of the quarantine species/pathovars has yet to be developed. In this study, we therefore evaluated the capacity of DNA barcoding as a digital identification method for discriminating quarantine species/pathovars of Xanthomonas. For these analyses, 327 isolates, representing 45 Xanthomonas species/pathovars, as well as five additional species/pathovars from GenBank (50 species/pathovars total), were utilized to test the efficacy of four DNA barcode candidate genes (16S rRNA gene, cpn60, gyrB, and avrBs2). Of these candidate genes, cpn60 displayed the highest rate of PCR amplification and sequencing success. The tree-building (Neighbor-joining), ‘best close match’, and barcode gap methods were subsequently employed to assess the species- and pathovar-level resolution of each gene. Notably, all isolates of each quarantine species/pathovars formed a monophyletic group in the neighbor-joining tree constructed using the cpn60 sequences. Moreover, cpn60 also demonstrated the most satisfactory results in both barcoding gap analysis and the ‘best close match’ test. Thus, compared with the other markers tested, cpn60 proved to be a powerful DNA barcode, providing a reliable and effective means for the species- and pathovar-level identification of the quarantine plant pathogen Xanthomonas. PMID:27861494
Comparing and combining distance-based and character-based approaches for barcoding turtles.
Reid, B N; LE, M; McCord, W P; Iverson, J B; Georges, A; Bergmann, T; Amato, G; Desalle, R; Naro-Maciel, E
2011-11-01
Molecular barcoding can serve as a powerful tool in wildlife forensics and may prove to be a vital aid in conserving organisms that are threatened by illegal wildlife trade, such as turtles (Order Testudines). We produced cytochrome oxidase subunit one (COI) sequences (650 bp) for 174 turtle species and combined these with publicly available sequences for 50 species to produce a data set representative of the breadth of the order. Variability within the barcode region was assessed, and the utility of both distance-based and character-based methods for species identification was evaluated. For species in which genetic material from more than one individual was available (n = 69), intraspecific divergences were 1.3% on average, although divergences greater than the customary 2% barcode threshold occurred within 15 species. High intraspecific divergences could indicate species with a high degree of internal genetic structure or possibly even cryptic species, although introgression is also probable in some of these taxa. Divergences between species of the same genus were 6.4% on average; however, 49 species were <2% divergent from congeners. Low levels of interspecific divergence could be caused by recent evolutionary radiations coupled with the low rates of mtDNA evolution previously observed in turtles. Complementing distance-based barcoding with character-based methods for identifying diagnostic sets of nucleotides provided better resolution in several cases where distance-based methods failed to distinguish species. An online identification engine was created to provide character-based identifications. This study constitutes the first comprehensive barcoding effort for this seriously threatened order. © 2011 Blackwell Publishing Ltd.
When COI barcodes deceive: complete genomes reveal introgression in hairstreaks
Shen, Jinhui; Borek, Dominika; Robbins, Robert K.; Opler, Paul A.; Otwinowski, Zbyszek; Grishin, Nick V.
2017-01-01
Two species of hairstreak butterflies from the genus Calycopis are known in the United States: C. cecrops and C. isobeon. Analysis of mitochondrial COI barcodes of Calycopis revealed cecrops-like specimens from the eastern US with atypical barcodes that were 2.6% different from either USA species, but similar to Central American Calycopis species. To address the possibility that the specimens with atypical barcodes represent an undescribed cryptic species, we sequenced complete genomes of 27 Calycopis specimens of four species: C. cecrops, C. isobeon, C. quintana and C. bactra. Some of these specimens were collected up to 60 years ago and preserved dry in museum collections, but nonetheless produced genomes as complete as fresh samples. Phylogenetic trees reconstructed using the whole mitochondrial and nuclear genomes were incongruent. While USA Calycopis with atypical barcodes grouped with Central American species C. quintana by mitochondria, nuclear genome trees placed them within typical USA C. cecrops in agreement with morphology, suggesting mitochondrial introgression. Nuclear genomes also show introgression, especially between C. cecrops and C. isobeon. About 2.3% of each C. cecrops genome has probably (p-value < 0.01, FDR < 0.1) introgressed from C. isobeon and about 3.4% of each C. isobeon genome may have come from C. cecrops. The introgressed regions are enriched in genes encoding transmembrane proteins, mitochondria-targeting proteins and components of the larval cuticle. This study provides the first example of mitochondrial introgression in Lepidoptera supported by complete genome sequencing. Our results caution about relying solely on COI barcodes and mitochondrial DNA for species identification or discovery. PMID:28179510
Mapping global biodiversity connections with DNA barcodes: Lepidoptera of Pakistan
Akhtar, Saleem; Rafi, Muhammad Athar; Mansoor, Shahid; Hebert, Paul D. N.
2017-01-01
Sequences from the DNA barcode region of the mitochondrial COI gene are an effective tool for specimen identification and for the discovery of new species. The Barcode of Life Data Systems (BOLD) (www.boldsystems.org) currently hosts 4.5 million records from animals which have been assigned to more than 490,000 different Barcode Index Numbers (BINs), which serve as a proxy for species. Because a fourth of these BINs derive from Lepidoptera, BOLD has a strong capability to both identify specimens in this order and to support studies of faunal overlap. DNA barcode sequences were obtained from 4503 moths from 329 sites across Pakistan, specimens that represented 981 BINs from 52 families. Among 379 species with a Linnaean name assignment, all were represented by a single BIN excepting five species that showed a BIN split. Less than half (44%) of the 981 BINs had counterparts in other countries; the remaining BINs were unique to Pakistan. Another 218 BINs of Lepidoptera from Pakistan were coupled with the 981 from this study before being compared with all 116,768 BINs for this order. As expected, faunal overlap was highest with India (21%), Sri Lanka (21%), United Arab Emirates (20%) and with other Asian nations (2.1%), but it was very low with other continents including Africa (0.6%), Europe (1.3%), Australia (0.6%), Oceania (1.0%), North America (0.1%), and South America (0.1%). This study indicates the way in which DNA barcoding facilitates measures of faunal overlap even when taxa have not been assigned to a Linnean species. PMID:28339501
Asgharian, Hosseinali; Sahafi, Homayoun Hosseinzadeh; Ardalan, Aria Ashja; Shekarriz, Shahrokh; Elahi, Elahe
2011-05-01
We provide cytochrome c oxidase subunit 1 (COI) barcode sequences of fishes of the Nayband National Park, Persian Gulf, Iran. Industrial activities, ecological considerations and goals of The Fish Barcode of Life campaign make it crucial that fish species residing in the park be identified. To the best of our knowledge, this is the first report of barcoding data on fishes of the Persian Gulf. We examined 187 individuals representing 76 species, 56 genera and 32 families. The data flagged potentially cryptic species of Gerres filamentosus and Plectorhinchus schotaf. 16S rDNA data on these species are provided. Exclusion of these two potential cryptic species resulted in a mean COI intraspecific distance of 0.18%, and a mean inter- to intraspecific divergence ratio of 66.7. There was no overlap between maximum Kimura 2-parameter distances among conspecifics (1.66%) and minimum distance among congeneric species (6.19%). Barcodes shared among species were not observed. Neighbour-joining analysis showed that most species formed cohesive sequence units with little variation. Finally, the comparison of 16 selected species from this study with meta-data of conspecifics from Australia, India, China and South Africa revealed high interregion divergences and potential existence of six cryptic species. Pairwise interregional comparisons were more informative than global divergence assessments with regard to detection of cryptic variation. Our analysis exemplifies optimal use of the expanding barcode data now becoming available. © 2011 Blackwell Publishing Ltd.
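The "barcode gap" comparison described above (maximum distance among conspecifics versus minimum distance between species) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name barcode_gap, the input layout (a list of species labels plus a symmetric pairwise distance matrix), and the toy numbers in the usage note are assumptions, not data or code from the study. Unlike the abstract, which restricts the between-species comparison to congeners, the sketch compares all interspecific pairs for simplicity.

```python
from itertools import combinations


def barcode_gap(labels, dist):
    """Return (max_intraspecific, min_interspecific, gap_present).

    labels: species name for each specimen (hypothetical input layout).
    dist:   symmetric matrix where dist[i][j] is the pairwise distance
            between specimens i and j (e.g. K2P distances).
    """
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            intra.append(dist[i][j])   # same species
        else:
            inter.append(dist[i][j])   # different species
    if not intra or not inter:
        raise ValueError("need at least one intra- and one interspecific pair")
    max_intra = max(intra)
    min_inter = min(inter)
    # A gap exists when no within-species distance reaches the
    # smallest between-species distance.
    return max_intra, min_inter, min_inter > max_intra
```

As a usage example with made-up values, four specimens labelled ["sp1", "sp1", "sp2", "sp2"] whose within-species distances are 0.010 and 0.0166 and whose between-species distances are all at least 0.0619 would yield a gap, since 0.0619 exceeds 0.0166.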
Li, De-Zhu; Gao, Lian-Ming; Li, Hong-Tao; Wang, Hong; Ge, Xue-Jun; Liu, Jian-Quan; Chen, Zhi-Duan; Zhou, Shi-Liang; Chen, Shi-Lin; Yang, Jun-Bo; Fu, Cheng-Xin; Zeng, Chun-Xia; Yan, Hai-Fei; Zhu, Ying-Jie; Sun, Yong-Shuai; Chen, Si-Yun; Zhao, Lei; Wang, Kun; Yang, Tuo; Duan, Guang-Wen
2011-12-06
A two-marker combination of plastid rbcL and matK has previously been recommended as the core plant barcode, to be supplemented with additional markers such as plastid trnH-psbA and nuclear ribosomal internal transcribed spacer (ITS). To assess the effectiveness and universality of these barcode markers in seed plants, we sampled 6,286 individuals representing 1,757 species in 141 genera of 75 families (42 orders) by using four different methods of data analysis. These analyses indicate that (i) the three plastid markers showed high levels of universality (87.1-92.7%), whereas ITS performed relatively well (79%) in angiosperms but not so well in gymnosperms; (ii) in taxonomic groups for which direct sequencing of the marker is possible, ITS showed the highest discriminatory power of the four markers, and a combination of ITS and any plastid DNA marker was able to discriminate 69.9-79.1% of species, compared with only 49.7% with rbcL + matK; and (iii) where multiple individuals of a single species were tested, ascriptions based on ITS and plastid DNA barcodes were incongruent in some samples for 45.2% of the sampled genera (for genera with more than one species sampled). This finding highlights the importance of both sampling multiple individuals and using markers with different modes of inheritance. In cases where it is difficult to amplify and directly sequence ITS in its entirety, just using ITS2 is a useful backup because it is easier to amplify and sequence this subset of the marker. We therefore propose that ITS/ITS2 should be incorporated into the core barcode for seed plants.
Telfer, Angela C; Young, Monica R; Quinn, Jenna; Perez, Kate; Sobel, Crystal N; Sones, Jayme E; Levesque-Beaudin, Valerie; Derbyshire, Rachael; Fernandez-Triana, Jose; Rougerie, Rodolphe; Thevanayagam, Abinah; Boskovic, Adrian; Borisenko, Alex V; Cadel, Alex; Brown, Allison; Pages, Anais; Castillo, Anibal H; Nicolai, Annegret; Glenn Mockford, Barb Mockford; Bukowski, Belén; Wilson, Bill; Trojahn, Brock; Lacroix, Carole Ann; Brimblecombe, Chris; Hay, Christoper; Ho, Christmas; Steinke, Claudia; Warne, Connor P; Garrido Cortes, Cristina; Engelking, Daniel; Wright, Danielle; Lijtmaer, Dario A; Gascoigne, David; Hernandez Martich, David; Morningstar, Derek; Neumann, Dirk; Steinke, Dirk; Marco DeBruin, Donna DeBruin; Dobias, Dylan; Sears, Elizabeth; Richard, Ellen; Damstra, Emily; Zakharov, Evgeny V; Laberge, Frederic; Collins, Gemma E; Blagoev, Gergin A; Grainge, Gerrie; Ansell, Graham; Meredith, Greg; Hogg, Ian; McKeown, Jaclyn; Topan, Janet; Bracey, Jason; Guenther, Jerry; Sills-Gilligan, Jesse; Addesi, Joseph; Persi, Joshua; Layton, Kara K S; D'Souza, Kareina; Dorji, Kencho; Grundy, Kevin; Nghidinwa, Kirsti; Ronnenberg, Kylee; Lee, Kyung Min; Xie, Linxi; Lu, Liuqiong; Penev, Lyubomir; Gonzalez, Mailyn; Rosati, Margaret E; Kekkonen, Mari; Kuzmina, Maria; Iskandar, Marianne; Mutanen, Marko; Fatahi, Maryam; Pentinsaari, Mikko; Bauman, Miriam; Nikolova, Nadya; Ivanova, Natalia V; Jones, Nathaniel; Weerasuriya, Nimalka; Monkhouse, Norman; Lavinia, Pablo D; Jannetta, Paul; Hanisch, Priscila E; McMullin, R Troy; Ojeda Flores, Rafael; Mouttet, Raphaëlle; Vender, Reid; Labbee, Renee N; Forsyth, Robert; Lauder, Rob; Dickson, Ross; Kroft, Ruth; Miller, Scott E; MacDonald, Shannon; Panthi, Sishir; Pedersen, Stephanie; Sobek-Swant, Stephanie; Naik, Suresh; Lipinskaya, Tatsiana; Eagalle, Thanushi; Decaëns, Thibaud; Kosuth, Thibault; Braukmann, Thomas; Woodcock, Tom; Roslin, Tomas; Zammit, Tony; Campbell, Victoria; Dinca, Vlad; Peneva, Vlada; Hebert, Paul D N; deWaard, Jeremy R
2015-01-01
Comprehensive biotic surveys, or 'all taxon biodiversity inventories' (ATBI), have traditionally been limited in scale or scope due to the complications surrounding specimen sorting and species identification. To circumvent these issues, several ATBI projects have successfully integrated DNA barcoding into their identification procedures and witnessed acceleration in their surveys and subsequent increase in project scope and scale. The Biodiversity Institute of Ontario partnered with the rare Charitable Research Reserve and delegates of the 6th International Barcode of Life Conference to complete its own rapid, barcode-assisted ATBI of an established land trust in Cambridge, Ontario, Canada. The existing species inventory for the rare Charitable Research Reserve was rapidly expanded by integrating a DNA barcoding workflow with two surveying strategies - a comprehensive sampling scheme over four months, followed by a one-day bioblitz involving international taxonomic experts. The two surveys resulted in 25,287 and 3,502 specimens barcoded, respectively, as well as 127 human observations. This barcoded material, all vouchered at the Biodiversity Institute of Ontario collection, covers 14 phyla, 29 classes, 117 orders, and 531 families of animals, plants, fungi, and lichens. Overall, the ATBI documented 1,102 new species records for the nature reserve, expanding the existing long-term inventory by 49%. In addition, 2,793 distinct Barcode Index Numbers (BINs) were assigned to genus or higher level taxonomy, and represent additional species that will be added once their taxonomy is resolved. For the 3,502 specimens, the collection, sequence analysis, taxonomic assignment, data release and manuscript submission by 100+ co-authors all occurred in less than one week. This demonstrates the speed at which barcode-assisted inventories can be completed and the utility that barcoding provides in minimizing and guiding valuable taxonomic specialist time. 
The final product is more than a comprehensive biotic inventory - it is also a rich dataset of fine-scale occurrence and sequence data, all archived and cross-linked in the major biodiversity data repositories. This model of rapid generation and dissemination of essential biodiversity data could be followed to conduct regional assessments of biodiversity status and change, and potentially be employed for evaluating progress towards the Aichi Targets of the Strategic Plan for Biodiversity 2011-2020.
Young, Monica R; Quinn, Jenna; Perez, Kate; Sobel, Crystal N; Sones, Jayme E; Levesque-Beaudin, Valerie; Derbyshire, Rachael; Fernandez-Triana, Jose; Rougerie, Rodolphe; Thevanayagam, Abinah; Boskovic, Adrian; Borisenko, Alex V; Cadel, Alex; Brown, Allison; Pages, Anais; Castillo, Anibal H; Nicolai, Annegret; Glenn Mockford, Barb Mockford; Bukowski, Belén; Wilson, Bill; Trojahn, Brock; Lacroix, Carole Ann; Brimblecombe, Chris; Hay, Christoper; Ho, Christmas; Steinke, Claudia; Warne, Connor P; Garrido Cortes, Cristina; Engelking, Daniel; Wright, Danielle; Lijtmaer, Dario A; Gascoigne, David; Hernandez Martich, David; Morningstar, Derek; Neumann, Dirk; Steinke, Dirk; Marco DeBruin, Donna DeBruin; Dobias, Dylan; Sears, Elizabeth; Richard, Ellen; Damstra, Emily; Zakharov, Evgeny V; Laberge, Frederic; Collins, Gemma E; Blagoev, Gergin A; Grainge, Gerrie; Ansell, Graham; Meredith, Greg; Hogg, Ian; McKeown, Jaclyn; Topan, Janet; Bracey, Jason; Guenther, Jerry; Sills-Gilligan, Jesse; Addesi, Joseph; Persi, Joshua; Layton, Kara K S; D'Souza, Kareina; Dorji, Kencho; Grundy, Kevin; Nghidinwa, Kirsti; Ronnenberg, Kylee; Lee, Kyung Min; Xie, Linxi; Lu, Liuqiong; Penev, Lyubomir; Gonzalez, Mailyn; Rosati, Margaret E; Kekkonen, Mari; Kuzmina, Maria; Iskandar, Marianne; Mutanen, Marko; Fatahi, Maryam; Pentinsaari, Mikko; Bauman, Miriam; Nikolova, Nadya; Ivanova, Natalia V; Jones, Nathaniel; Weerasuriya, Nimalka; Monkhouse, Norman; Lavinia, Pablo D; Jannetta, Paul; Hanisch, Priscila E; McMullin, R. Troy; Ojeda Flores, Rafael; Mouttet, Raphaëlle; Vender, Reid; Labbee, Renee N; Forsyth, Robert; Lauder, Rob; Dickson, Ross; Kroft, Ruth; Miller, Scott E; MacDonald, Shannon; Panthi, Sishir; Pedersen, Stephanie; Sobek-Swant, Stephanie; Naik, Suresh; Lipinskaya, Tatsiana; Eagalle, Thanushi; Decaëns, Thibaud; Kosuth, Thibault; Braukmann, Thomas; Woodcock, Tom; Roslin, Tomas; Zammit, Tony; Campbell, Victoria; Dinca, Vlad; Peneva, Vlada; Hebert, Paul D N
2015-01-01
Background: Comprehensive biotic surveys, or ‘all taxon biodiversity inventories’ (ATBI), have traditionally been limited in scale or scope due to the complications surrounding specimen sorting and species identification. To circumvent these issues, several ATBI projects have successfully integrated DNA barcoding into their identification procedures and witnessed acceleration in their surveys and subsequent increase in project scope and scale. The Biodiversity Institute of Ontario partnered with the rare Charitable Research Reserve and delegates of the 6th International Barcode of Life Conference to complete its own rapid, barcode-assisted ATBI of an established land trust in Cambridge, Ontario, Canada. New information: The existing species inventory for the rare Charitable Research Reserve was rapidly expanded by integrating a DNA barcoding workflow with two surveying strategies – a comprehensive sampling scheme over four months, followed by a one-day bioblitz involving international taxonomic experts. The two surveys resulted in 25,287 and 3,502 specimens barcoded, respectively, as well as 127 human observations. This barcoded material, all vouchered at the Biodiversity Institute of Ontario collection, covers 14 phyla, 29 classes, 117 orders, and 531 families of animals, plants, fungi, and lichens. Overall, the ATBI documented 1,102 new species records for the nature reserve, expanding the existing long-term inventory by 49%. In addition, 2,793 distinct Barcode Index Numbers (BINs) were assigned to genus or higher level taxonomy, and represent additional species that will be added once their taxonomy is resolved. For the 3,502 specimens, the collection, sequence analysis, taxonomic assignment, data release and manuscript submission by 100+ co-authors all occurred in less than one week.
This demonstrates the speed at which barcode-assisted inventories can be completed and the utility that barcoding provides in minimizing and guiding valuable taxonomic specialist time. The final product is more than a comprehensive biotic inventory – it is also a rich dataset of fine-scale occurrence and sequence data, all archived and cross-linked in the major biodiversity data repositories. This model of rapid generation and dissemination of essential biodiversity data could be followed to conduct regional assessments of biodiversity status and change, and potentially be employed for evaluating progress towards the Aichi Targets of the Strategic Plan for Biodiversity 2011–2020. PMID:26379469
Use of mitochondrial COI gene for the identification of family Salticidae and Lycosidae of spiders.
Naseem, Sajida; Tahir, Hafiz Muhammad
2018-01-01
In recent years, DNA barcoding has become popular for molecular identification of species because it is a simple, quick, and affordable method. The present study was conducted to identify spiders of the most abundant families, i.e. Salticidae and Lycosidae, from citrus orchards in Sargodha district using DNA barcoding. A total of 160 specimens were subjected to DNA barcoding, but sequences of up to 600 bp were recovered for only 156 specimens. This molecular approach helped assign the correct taxon to specimens that had been misidentified on the basis of morphological characters. We succeeded in discriminating six species of Lycosidae and nine species of Salticidae through DNA barcoding. The results revealed a clear barcode gap (a discontinuity between intra- and inter-specific divergences) for members of both families. Furthermore, the maximum intra-specific divergence was less than the nearest-neighbour (NN) distance for all species. This suggests that DNA barcoding is reliable for identifying spiders to species level. The success rate in our study was 98%. We conclude that DNA barcoding is a more reliable tool, especially for immature spiders, when morphological characters are ambiguous.
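The barcode-gap criterion mentioned in this abstract (maximum intra-specific divergence smaller than the distance to the nearest neighbour of another species) can be illustrated with a minimal sketch. The sequences, the species labels, and the helper names `p_distance` and `barcode_gap` are hypothetical, and the simple uncorrected p-distance is an assumption made for illustration; the abstract does not state which distance model the authors used.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def barcode_gap(records):
    """For each species return (max intra-specific distance, nearest-neighbour distance).

    records is a list of (species_label, aligned_sequence) pairs and must
    contain at least two species.
    """
    result = {}
    for sp in sorted({name for name, _ in records}):
        own = [seq for name, seq in records if name == sp]
        other = [seq for name, seq in records if name != sp]
        # Largest distance between two members of the same species (0.0 for singletons).
        max_intra = max((p_distance(a, b) for a, b in combinations(own, 2)), default=0.0)
        # Smallest distance from any member of this species to any other species.
        nn = min(p_distance(a, b) for a in own for b in other)
        result[sp] = (max_intra, nn)
    return result


# Toy aligned sequences, invented for illustration only.
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACGAAC"),
    ("sp_B", "ACGAACGAAC"),
]

gaps = barcode_gap(records)
# A species shows a barcode gap when its max intra-specific distance is below its NN distance.
has_gap = {sp: max_intra < nn for sp, (max_intra, nn) in gaps.items()}
```

In this toy data set both species satisfy the criterion, which is the pattern the abstract reports for all species in the study.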
DNA barcoding commercially important aquatic invertebrates of Turkey.
Keskin, Emre; Atar, Hasan Hüseyin
2013-08-01
DNA barcoding was used to identify aquatic invertebrates sampled from fisheries bycatch and discards. A total of 440 unique cytochrome c oxidase subunit I (COI) barcodes were generated for 22 species from three important phyla (Arthropoda, Cnidaria, and Mollusca). All the species were sequenced using a 654 bp-long fragment of the mitochondrial COI gene, and the sequences were submitted to the GenBank and Barcode of Life Database (BOLD) databases. Two of them (Pontastacus leptodactylus and Rapana bezoar) were first records of the species for the BOLD database, and six of them (Carcinus aestuarii, Loligo vulgaris, Melicertus kerathurus, Nephrops norvegicus, Scyllarides latus, and Scyllarus arctus) were first standard (>648 bp) COI barcode records for the GenBank database. COI barcodes were analyzed for nucleotide composition, nucleotide pair frequencies, and Kimura's two-parameter genetic distance. Mean genetic distance among species was found to increase at higher taxonomic levels. The neighbor-joining trees generated were congruent with morphometric-based taxonomic classification. The findings of this study clearly demonstrate that DNA barcodes can be used as an efficient molecular tool for identifying not only target species from fisheries but also bycatch and discard species, and so could provide leverage for better monitoring and management of fisheries and biodiversity.
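For readers unfamiliar with Kimura's two-parameter (K2P) distance named in this abstract, the following is a minimal sketch of the standard formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P is the proportion of transitions and Q the proportion of transversions between two aligned sequences. The function name `k2p_distance` and the example sequences are hypothetical; the study's own software and settings are not described in the abstract.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, or T
    (gaps, ambiguity codes) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        # Transition: purine<->purine or pyrimidine<->pyrimidine; otherwise transversion.
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log / math.sqrt raise ValueError when the sequences are too
    # divergent for the model (non-positive arguments).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A<->G) in ten sites; sequences invented for illustration.
d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

Pairwise distances of this kind are what a mean genetic distance within species, genera, or families is averaged over.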
Yu, Ning; Gu, Hong; Wei, Yulong; Zhu, Ning; Wang, Yanli; Zhang, Haiping; Zhu, Yue; Zhang, Xin; Ma, Chao; Sun, Aidong
2016-09-12
Piper kadsura is a vine-like medicinal plant which is widely used in clinical treatment. However, P. kadsura is often substituted by other materials in the markets, thereby causing health risks. In this study, 38 P. kadsura samples and eight sequences from GenBank, including a closely related species and common adulterants, were collected. This study aimed to identify an effective DNA barcode from four popular DNA loci for P. kadsura authentication. The success rates of PCR amplification, sequencing, and sequence acquisition of matK were 10.5%, 75%, and 7.9%, respectively; for rbcL they were 89.5%, 8.8%, and 7.9%, respectively; for ITS2 they were 86.8%, 3.0%, and 2.6%, respectively; for psbA-trnH they were all 100%, much higher than for the other three loci. The sequences were aligned using Muscle, genetic distances were computed using MEGA 5.2.2, and barcoding gap analysis was performed using TAXON DNA. Phylogenetic analysis showed that psbA-trnH could clearly distinguish P. kadsura from its closely related species and the common adulterant. psbA-trnH was then used to evaluate the proportion of fake P. kadsura samples. Results showed that 18.4% of P. kadsura samples were fake, indicating that adulterant species exist in the Chinese markets. Two-dimensional DNA barcoding imaging of P. kadsura was conducted, which was beneficial to the management of P. kadsura. We conclude that the psbA-trnH region is a powerful tool for P. kadsura identification and supervision in the current medicine markets.
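The three success rates reported per locus appear to be related multiplicatively. The abstract does not define them, so the sketch below rests on an assumption: the sequencing rate is taken to be conditional on successful amplification, which would make the sequence acquisition rate the product of the other two. The reported percentages are copied from the abstract; the dictionary layout and the helper name `implied_acquisition` are hypothetical.

```python
# Reported rates in percent: (PCR amplification, sequencing, sequence acquisition).
reported = {
    "matK": (10.5, 75.0, 7.9),
    "rbcL": (89.5, 8.8, 7.9),
    "ITS2": (86.8, 3.0, 2.6),
    "psbA-trnH": (100.0, 100.0, 100.0),
}


def implied_acquisition(amplification_pct, sequencing_pct):
    """Acquisition rate implied by the assumption acquisition = amplification x sequencing."""
    return round(amplification_pct * sequencing_pct / 100, 1)


# Compare the implied value with the reported value for every locus.
comparison = {
    locus: (implied_acquisition(amp, seq), acq)
    for locus, (amp, seq, acq) in reported.items()
}

for locus, (implied, stated) in comparison.items():
    print(f"{locus}: implied {implied}%, reported {stated}%")
```

Under this assumption the implied and reported acquisition rates agree to one decimal place for all four loci, which shows how the bottleneck differs: amplification for matK, sequencing for rbcL and ITS2.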