dna identification databases: Topics by Science.gov

Sample records for dna identification databases

Nucleotide Sequence Database Comparison for Routine Dermatophyte Identification by Internal Transcribed Spacer 2 Genetic Region DNA Barcoding.

PubMed

Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R

2018-05-01

Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
[Integrated DNA barcoding database for identifying Chinese animal medicine].

PubMed

Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin

2014-06-01

In order to construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators have completed extensive research on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank were analyzed at the same time. Three methods (BLAST, barcoding gap, and tree building) were used to confirm the reliability of barcode records in the database. The integrated DNA barcoding database was constructed from three parts: specimen, sequence, and literature information. It contains about 800 animal medicines together with their adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The database is important for animal species identification, conservation of rare and endangered species, and sustainable utilization of animal resources.
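As a rough illustration of the barcoding-gap check named in the abstract above, the sketch below computes uncorrected pairwise distances among pre-aligned sequences and reports whether the smallest between-species distance exceeds the largest within-species distance. The function names, the use of simple p-distance, and the toy sequences are assumptions for illustration, not the authors' procedure.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcoding_gap(records):
    """records: list of (species, aligned_sequence).

    Returns (max_intraspecific, min_interspecific, has_gap).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter) if inter else float("inf")
    return max_intra, min_inter, min_inter > max_intra

# Hypothetical toy data: two species, two sequences each.
toy = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACTTAG"),
    ("species_B", "ACGAACTTAC"),
]
print(barcoding_gap(toy))
```

A gap is reported only when every between-species pair is more distant than every within-species pair; real barcode data would normally use a corrected distance model and many more records per species.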
Geographic origin and individual assignment of Shorea platyclados (Dipterocarpaceae) for forensic identification

PubMed Central

Diway, Bibian; Khoo, Eyen

2017-01-01

The development of timber tracking methods based on genetic markers can provide scientific evidence to verify the origin of timber products and fulfill the growing requirement for sustainable forestry practices. In this study, the origin of an important Dark Red Meranti wood, Shorea platyclados, was studied by using a combination of seven chloroplast DNA markers and 15 short tandem repeat (STR) markers. A total of 27 natural populations of S. platyclados were sampled throughout Malaysia to establish population level and individual level identification databases. A haplotype map was generated from chloroplast DNA sequencing for population identification, resulting in 29 multilocus haplotypes, based on 39 informative intraspecific variable sites. Subsequently, a DNA profiling database was developed from 15 STRs allowing for individual identification in Malaysia. Cluster analysis divided the 27 populations into two genetic clusters, corresponding to the regions of Eastern and Western Malaysia. The conservativeness tests showed that the Malaysia database is conservative after removal of bias from population subdivision and sampling effects. Independent self-assignment tests correctly assigned individuals to the database in an overall 60.60-94.95% of cases for identified populations, and in 98.99-99.23% of cases for identified regions. Both the chloroplast DNA database and the STRs appear to be useful for tracking timber originating in Malaysia. Hence, this DNA-based method could serve as an effective additional tool for the existing forensic timber identification system, helping to ensure the sustainable management of this species into the future. PMID:28430826
The barley EST DNA Replication and Repair Database (bEST-DRRD) as a tool for the identification of the genes involved in DNA replication and repair.

PubMed

Gruszka, Damian; Marzec, Marek; Szarejko, Iwona

2012-06-14

The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies that had been conducted to date shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar, functional approach is particularly difficult for a species whose complete genomic sequences are still unavailable. One of the solutions is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the "Barley Genome version 0.05" database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair. 
The presented database provides information about the Arabidopsis genes involved in DNA replication and repair, their expression patterns and models of protein interactions. It was designed and established to provide an open-access tool for the identification of monocot homologs of known Arabidopsis genes that are responsible for DNA-related processes. The barley genes identified in the project are currently being analysed to validate their function.
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing

PubMed Central

Karimi, Ramin; Hajdu, Andras

2016-01-01

Comprehensive efforts toward low-cost sequencing in the past few years have led to the growth of complete genome databases. In parallel, driven by a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Because of the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is needed. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline, HTSFinder (high-throughput signature finder), with a corresponding k-mer generator, GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in arbitrarily selected target and nontarget databases. Hadoop and MapReduce are used in this pipeline as parallel and distributed computing tools on commodity hardware. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of unique and common DNA signatures detected in the target database offers opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
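The idea of k-mer signatures in the abstract above can be sketched on a single machine with plain sets. This is not HTSFinder or GkmerG, and it does not use Hadoop or MapReduce; the function names and the working definition of a unique signature (a k-mer present in every target sequence and absent from every nontarget sequence) are assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence (uppercased, no ambiguity handling)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def common_signatures(targets, k):
    """k-mers shared by every target sequence (targets must be non-empty)."""
    return set.intersection(*(kmers(s, k) for s in targets))

def unique_signatures(targets, nontargets, k):
    """Common target k-mers that never occur in any nontarget sequence."""
    background = set().union(*(kmers(s, k) for s in nontargets))
    return common_signatures(targets, k) - background

# Hypothetical toy sequences standing in for genomes.
targets = ["ACGTTGCA", "TTACGTTG"]
nontargets = ["ACGTAGCA"]
print(sorted(unique_signatures(targets, nontargets, 4)))
```

Holding every k-mer in memory is only practical for small inputs, which is the kind of limit a distributed pipeline is meant to address.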
How effective are DNA barcodes in the identification of African rainforest trees?

PubMed

Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W; Kenfack, David; Chuyong, George B; Cruaud, Corinne; Hardy, Olivier J

2013-01-01

DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95-100% success), but less for species identification (71-88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84-90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications.
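One way to see why genus-level success can exceed species-level success is a nearest-match assignment over a distance measure. The sketch below is a simplified stand-in for the GD approach described above: it assumes pre-aligned sequences, uses uncorrected p-distance, and treats ties between different species as an unresolved species call. All names and sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def best_matches(query, references):
    """Names of all reference sequences at the minimum distance from the query."""
    dists = [(p_distance(query, seq), name) for name, seq in references]
    best = min(d for d, _ in dists)
    return {name for d, name in dists if d == best}

def assign(query, references):
    """Return (genus, species); either is None when the best matches disagree."""
    names = best_matches(query, references)
    species = next(iter(names)) if len(names) == 1 else None
    genera = {name.split()[0] for name in names}
    genus = next(iter(genera)) if len(genera) == 1 else None
    return genus, species

# Hypothetical reference database: binomial name, aligned barcode.
references = [
    ("Aus alpha", "ACGTACGT"),
    ("Aus beta", "ACGTACGA"),
    ("Bus gamma", "TCGAACCA"),
]
print(assign("ACGTACGA", references))  # exact match to one species
print(assign("ACGTACGC", references))  # equidistant from two congeners
```

In the second call the query is equally close to two species of the same genus, so the genus is resolved while the species is not.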
How Effective Are DNA Barcodes in the Identification of African Rainforest Trees?

PubMed Central

Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W.; Kenfack, David; Chuyong, George B.; Cruaud, Corinne; Hardy, Olivier J.

2013-01-01

Background: DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. Methodology/Principal Findings: We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95–100% success), but less for species identification (71–88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84–90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Conclusions/Significance: Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications. PMID:23565134
Evaluation of partial 16S ribosomal DNA sequencing for identification of Nocardia species by using the MicroSeq 500 system with an expanded database.

PubMed

Cloud, Joann L; Conville, Patricia S; Croft, Ann; Harmsen, Dag; Witebsky, Frank G; Carroll, Karen C

2004-02-01

Identification of clinically significant nocardiae to the species level is important in patient diagnosis and treatment. A study was performed to evaluate Nocardia species identification obtained by partial 16S ribosomal DNA (rDNA) sequencing by the MicroSeq 500 system with an expanded database. The expanded portion of the database was developed from partial 5' 16S rDNA sequences derived from 28 reference strains (from the American Type Culture Collection and the Japanese Collection of Microorganisms). The expanded MicroSeq 500 system was compared to (i) conventional identification obtained from a combination of growth characteristics with biochemical and drug susceptibility tests; (ii) molecular techniques involving restriction enzyme analysis (REA) of portions of the 16S rRNA and 65-kDa heat shock protein genes; and (iii) when necessary, sequencing of a 999-bp fragment of the 16S rRNA gene. An unknown isolate was identified as a particular species if the sequence obtained by partial 16S rDNA sequencing by the expanded MicroSeq 500 system was 99.0% similar to that of the reference strain. Ninety-four nocardiae representing 10 separate species were isolated from patient specimens and examined by using the three different methods. Sequencing of partial 16S rDNA by the expanded MicroSeq 500 system resulted in only 72% agreement with conventional methods for species identification and 90% agreement with the alternative molecular methods. Molecular methods for identification of Nocardia species provide more accurate and rapid results than the conventional methods using biochemical and susceptibility testing. With an expanded database, the MicroSeq 500 system for partial 16S rDNA was able to correctly identify the human pathogens N. brasiliensis, N. cyriacigeorgica, N. farcinica, N. nova, N. otitidiscaviarum, and N. veterana.
Forensic timber identification: a case study of a CITES listed species, Gonystylus bancanus (Thymelaeaceae).

PubMed

Ng, Kevin Kit Siong; Lee, Soon Leong; Tnah, Lee Hong; Nurul-Farhanah, Zakaria; Ng, Chin Hong; Lee, Chai Ting; Tani, Naoki; Diway, Bibian; Lai, Pei Sing; Khoo, Eyen

2016-07-01

Illegal logging and smuggling of Gonystylus bancanus (Thymelaeaceae) poses a serious threat to this fragile valuable peat swamp timber species. Using G. bancanus as a case study, DNA markers were used to develop identification databases at the species, population and individual level. The species level database for Gonystylus comprised of an rDNA (ITS2) and two cpDNA (trnH-psbA and trnL) markers based on a 20 Gonystylus species database. When concatenated, taxonomic species recognition was achieved with a resolution of 90% (18 out of the 20 species). In addition, based on 17 natural populations of G. bancanus throughout West (Peninsular Malaysia) and East (Sabah and Sarawak) Malaysia, population and individual identification databases were developed using cpDNA and STR markers respectively. A haplotype distribution map for Malaysia was generated using six cpDNA markers, resulting in 12 unique multilocus haplotypes, from 24 informative intraspecific variable sites. These unique haplotypes suggest a clear genetic structuring of West and East regions. A simulation procedure based on the composition of the samples was used to test whether a suspected sample conformed to a given regional origin. Overall, the observed type I and II errors of the databases showed good concordance with the predicted 5% threshold which indicates that the databases were useful in revealing provenance and establishing conformity of samples from West and East Malaysia. Sixteen STRs were used to develop the DNA profiling databases for individual identification. Bayesian clustering analyses divided the 17 populations into two main genetic clusters, corresponding to the regions of West and East Malaysia. Population substructuring (K=2) was observed within each region. After removal of bias resulting from sampling effects and population subdivision, conservativeness tests showed that the West and East Malaysia databases were conservative. 
This suggests that both databases can be used independently for random match probability estimation within respective regions. The reliability of the databases was further determined by independent self-assignment tests based on the likelihood of each individual's multilocus genotype occurring in each identified population, genetic cluster and region with an average percentage of correctly assigned individuals of 54.80%, 99.60% and 100% respectively. Thus, after appropriate validation, the genetic identification databases developed for G. bancanus in this study could support forensic applications and help safeguard this valuable species into the future. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
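The self-assignment test described above rests on the likelihood of a multilocus genotype in each candidate group. A minimal sketch follows, assuming Hardy-Weinberg proportions within each group, independent loci, and a small floor for alleles not seen in a group; the floor value, the function names, and the toy allele frequencies are all assumptions, and the authors' actual software and corrections are not specified in the abstract.

```python
import math

def genotype_log_likelihood(genotype, allele_freqs, floor=0.01):
    """Log-likelihood of a diploid multilocus genotype in one population.

    genotype: {locus: (allele1, allele2)}
    allele_freqs: {locus: {allele: frequency}}
    floor: assumed minimum frequency for alleles unseen in the population.
    """
    total = 0.0
    for locus, (a, b) in genotype.items():
        freqs = allele_freqs[locus]
        p = max(freqs.get(a, 0.0), floor)
        q = max(freqs.get(b, 0.0), floor)
        # Hardy-Weinberg expectation: p^2 for homozygotes, 2pq for heterozygotes.
        total += math.log(p * p if a == b else 2 * p * q)
    return total

def assign(genotype, populations):
    """Name of the population in which the genotype is most likely."""
    return max(populations,
               key=lambda name: genotype_log_likelihood(genotype, populations[name]))

# Hypothetical allele frequencies at two STR loci for two regions.
populations = {
    "West": {"L1": {10: 0.7, 12: 0.3}, "L2": {5: 0.6, 7: 0.4}},
    "East": {"L1": {10: 0.2, 12: 0.8}, "L2": {5: 0.1, 7: 0.9}},
}
print(assign({"L1": (10, 10), "L2": (5, 7)}, populations))
print(assign({"L1": (12, 12), "L2": (7, 7)}, populations))
```

Running every sampled individual through such a function and counting the matches with its true origin gives a correct-assignment percentage of the kind reported for populations, clusters, and regions.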
Molecular Identification and Databases in Fusarium

USDA-ARS's Scientific Manuscript database

DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...
BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources.

PubMed

Lim, Jeongheui; Kim, Sang-Yoon; Kim, Sungmin; Eo, Hae-Seok; Kim, Chang-Bae; Paek, Woon Kee; Kim, Won; Bhak, Jong

2009-12-03

DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases. We have built a general DNA barcode data processing system, BioBarcode, with open source software - which is a general purpose database and server. It uses mySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary database of BioBarcode has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer which improves the performance in DNA sequence analyses. Asia has a very high degree of biodiversity and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. The BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides useful tools that will make barcoding cheaper and faster in the biodiversity community such as standardization, depository, management, and analysis of DNA barcode data. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system http://www.asianbarcode.org.
A Large Population Genetic Study of 15 Autosomal Short Tandem Repeat Loci for Establishment of Korean DNA Profile Database

PubMed Central

Yoo, Seong Yeon; Cho, Nam Soo; Park, Myung Jin; Seong, Ki Min; Hwang, Jung Ho; Song, Seok Bean; Han, Myun Soo; Lee, Won Tae; Chung, Ki Wha

2011-01-01

Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was the D2S1338 with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10^-17. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications. PMID:21597912
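The per-locus discrimination power and the combined matching probability quoted above follow from standard definitions: the matching probability at a locus is the sum of squared genotype frequencies, the discrimination power is one minus that value, and the combined figure is the product over loci. The sketch below derives genotype frequencies from allele frequencies under Hardy-Weinberg equilibrium, which is an assumption here (studies often use observed genotype counts instead); the allele frequencies are invented for illustration.

```python
from itertools import combinations_with_replacement
from math import prod

def genotype_frequencies(allele_freqs):
    """Expected genotype frequencies at one locus under Hardy-Weinberg equilibrium."""
    out = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        p, q = allele_freqs[a], allele_freqs[b]
        out[(a, b)] = p * p if a == b else 2 * p * q
    return out

def matching_probability(allele_freqs):
    """Chance that two unrelated individuals share a genotype at this locus."""
    return sum(f * f for f in genotype_frequencies(allele_freqs).values())

def discrimination_power(allele_freqs):
    return 1.0 - matching_probability(allele_freqs)

def combined_matching_probability(loci):
    """Product of per-locus matching probabilities, assuming independent loci."""
    return prod(matching_probability(freqs) for freqs in loci.values())

# Hypothetical two-locus example with two equally frequent alleles each.
loci = {
    "locus1": {"a": 0.5, "b": 0.5},
    "locus2": {"c": 0.5, "d": 0.5},
}
print(matching_probability(loci["locus1"]))
print(discrimination_power(loci["locus1"]))
print(combined_matching_probability(loci))
```

With 15 highly polymorphic loci, each per-locus matching probability is far smaller than in this toy case, which is how the product reaches the order of 10^-17.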
A large population genetic study of 15 autosomal short tandem repeat loci for establishment of Korean DNA profile database.

PubMed

Yoo, Seong Yeon; Cho, Nam Soo; Park, Myung Jin; Seong, Ki Min; Hwang, Jung Ho; Song, Seok Bean; Han, Myun Soo; Lee, Won Tae; Chung, Ki Wha

2011-07-01

Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was the D2S1338 with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10^-17. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications.
DNA barcoding of medicinal plant material for identification

USDA-ARS's Scientific Manuscript database

Because of the increasing demand for herbal remedies and for authentication of the source material, it is vital to provide a single database containing information about authentic plant materials and their potential adulterants. The database should provide DNA barcodes for data retrieval and similar...
BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources

PubMed Central

2009-01-01

Background: DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. There are many nations in Asia with many biodiversity resources that need to be mapped and registered in databases. Results: We have built a general DNA barcode data processing system, BioBarcode, with open source software - which is a general purpose database and server. It uses mySQL RDBMS 5.0, BLAST2, and Apache httpd server. An exemplary database of BioBarcode has around 11,300 specimen entries (including GenBank data) and registers the biological species to map their genetic relationships. The BioBarcode database contains a chromatogram viewer which improves the performance in DNA sequence analyses. Conclusion: Asia has a very high degree of biodiversity and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. The BioBarcode promotes the rapid acquisition of biological species DNA sequence data that meet global standards by providing specialized services, and provides useful tools that will make barcoding cheaper and faster in the biodiversity community such as standardization, depository, management, and analysis of DNA barcode data. The system can be downloaded upon request, and an exemplary server has been constructed with which to build an Asian biodiversity system http://www.asianbarcode.org. PMID:19958506
An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

USDA-ARS's Scientific Manuscript database

Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...
Identification of food and beverage spoilage yeasts from DNA sequence analyses

USDA-ARS's Scientific Manuscript database

Detection, identification, and classification of yeasts has undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of th...
DNA Barcodes for Forensically Important Fly Species in Brazil.

PubMed

Koroiva, Ricardo; de Souza, Mirian S; Roque, Fabio de Oliveira; Pepinelli, Mateus

2018-04-07

Here, we analyze 248 DNA barcode sequences of 35 fly species of forensic importance in Brazil. DNA barcoding can be effectively used for specimen identification of these species, allowing the unambiguous identification of 31 species, an overall success rate of 88%. Our results show a high rate of success for molecular identification using DNA barcoding sequences and open new perspectives for immature species identification, a subject on which limited forensic investigations exist in Tropical regions. We also address the implications of building a robust forensic DNA barcode database. A geographic bias is recognized for the COI dataset available for forensically important fly species in Brazil, with concentration of sequences from specimens collected mainly in sites located in the Cerrado, Mata Atlântica, and Pampa biomes.
Development of a multilocus-based approach for sponge (phylum Porifera) identification: refinement and limitations.

PubMed

Yang, Qi; Franco, Christopher M M; Sorokin, Shirley J; Zhang, Wei

2017-02-02

For sponges (phylum Porifera), there is no reliable molecular protocol available for species identification. To address this gap, we developed a multilocus-based Sponge Identification Protocol (SIP) validated by a sample of 37 sponge species belonging to 10 orders from South Australia. The universal barcode COI mtDNA, 28S rRNA gene (D3-D5), and the nuclear ITS1-5.8S-ITS2 region were evaluated for their suitability and capacity for sponge identification. The highest Bit Score was applied to infer the identity. The reliability of SIP was validated by phylogenetic analysis. The 28S rRNA gene and COI mtDNA performed better than the ITS region in classifying sponges at various taxonomic levels. A major limitation is that the databases are not well populated and possess low diversity, making it difficult to conduct the molecular identification protocol. The identification is also impacted by the accuracy of the morphological classification of the sponges whose sequences have been submitted to the database. Re-examination of the morphological identification further demonstrated and improved the reliability of sponge identification by SIP. Integrated with morphological identification, the multilocus-based SIP offers an improved protocol for more reliable and effective sponge identification, by coupling the accuracy of different DNA markers.
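The rule of inferring identity from the highest Bit Score can be sketched as a small parser over BLAST tabular output. The sketch assumes the default 12-column layout of BLAST+ `-outfmt 6`, in which the bit score is the last column; the query and subject identifiers and the scores are hypothetical, and nothing here reproduces the authors' own workflow.

```python
import csv
import io

def best_hits(tabular_text):
    """Return {query_id: (subject_id, bit_score)}, keeping the top-scoring hit per query.

    Expects tab-separated rows in the default BLAST+ outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bits = row[0], row[1], float(row[11])
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return best

# Hypothetical hits for one query against a reference set.
example = "\n".join([
    "\t".join(["sponge01", "ref_A", "98.5", "600", "9", "0", "1", "600", "1", "600", "0.0", "1050"]),
    "\t".join(["sponge01", "ref_B", "99.2", "600", "5", "0", "1", "600", "1", "600", "0.0", "1080"]),
])
print(best_hits(example))
```

Because the top hit can only be as good as the reference entries, a sparse or mislabeled database limits this rule in the way the abstract describes.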

Development of a multilocus-based approach for sponge (phylum Porifera) identification: refinement and limitations

PubMed Central

Yang, Qi; Franco, Christopher M. M.; Sorokin, Shirley J.; Zhang, Wei

2017-01-01

For sponges (phylum Porifera), there is no reliable molecular protocol available for species identification. To address this gap, we developed a multilocus-based Sponge Identification Protocol (SIP) validated by a sample of 37 sponge species belonging to 10 orders from South Australia. The universal barcode COI mtDNA, 28S rRNA gene (D3–D5), and the nuclear ITS1-5.8S-ITS2 region were evaluated for their suitability and capacity for sponge identification. The highest Bit Score was applied to infer the identity. The reliability of SIP was validated by phylogenetic analysis. The 28S rRNA gene and COI mtDNA performed better than the ITS region in classifying sponges at various taxonomic levels. A major limitation is that the databases are not well populated and possess low diversity, making it difficult to conduct the molecular identification protocol. The identification is also impacted by the accuracy of the morphological classification of the sponges whose sequences have been submitted to the database. Re-examination of the morphological identification further demonstrated and improved the reliability of sponge identification by SIP. Integrated with morphological identification, the multilocus-based SIP offers an improved protocol for more reliable and effective sponge identification, by coupling the accuracy of different DNA markers. PMID:28150727
Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

NASA Astrophysics Data System (ADS)

Chen, K.

2017-01-01

With the objective of identifying the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Because fecal coliforms were difficult to culture and isolate on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available from the Ribosomal Database Project (RDP).
Suitability of partial 16S ribosomal RNA gene sequence analysis for the identification of dangerous bacterial pathogens.

PubMed

Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F

2007-03-01

In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.
International Society of Human and Animal Mycology (ISHAM)-ITS reference DNA barcoding database--the quality controlled standard tool for routine identification of human and animal pathogenic fungi.

PubMed

Irinyi, Laszlo; Serena, Carolina; Garcia-Hermoso, Dea; Arabatzis, Michael; Desnos-Ollivier, Marie; Vu, Duong; Cardinali, Gianluigi; Arthur, Ian; Normand, Anne-Cécile; Giraldo, Alejandra; da Cunha, Keith Cassia; Sandoval-Denis, Marcelo; Hendrickx, Marijke; Nishikaku, Angela Satie; de Azevedo Melo, Analy Salles; Merseguel, Karina Bellinghausen; Khan, Aziza; Parente Rocha, Juliana Alves; Sampaio, Paula; da Silva Briones, Marcelo Ribeiro; e Ferreira, Renata Carmona; de Medeiros Muniz, Mauro; Castañón-Olivares, Laura Rosio; Estrada-Barcenas, Daniel; Cassagne, Carole; Mary, Charles; Duan, Shu Yao; Kong, Fanrong; Sun, Annie Ying; Zeng, Xianyu; Zhao, Zuotao; Gantois, Nausicaa; Botterel, Françoise; Robbertse, Barbara; Schoch, Conrad; Gams, Walter; Ellis, David; Halliday, Catriona; Chen, Sharon; Sorrell, Tania C; Piarroux, Renaud; Colombo, Arnaldo L; Pais, Célia; de Hoog, Sybren; Zancopé-Oliveira, Rosely Maria; Taylor, Maria Lucia; Toriello, Conchita; de Almeida Soares, Célia Maria; Delhaes, Laurence; Stubbe, Dirk; Dromer, Françoise; Ranque, Stéphane; Guarro, Josep; Cano-Lira, Jose F; Robert, Vincent; Velegraki, Aristea; Meyer, Wieland

2015-05-01

Human and animal fungal pathogens are a growing threat worldwide leading to emerging infections and creating new risks for established ones. There is a growing need for a rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods, such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach for species identification and are less demanding in terms of taxonomical expertise. However, its widespread use is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality-controlled ITS database under the umbrella of the ISHAM working group on "DNA barcoding of human and animal pathogenic fungi." A new database, containing 2800 ITS sequences representing 421 fungal species, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses, was established. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies level. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeast, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens. © The Author 2015. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Identification of Rays through DNA Barcoding: An Application for Ecologists

PubMed Central

Cerutti-Pereyra, Florencia; Meekan, Mark G.; Wei, Nu-Wei V.; O'Shea, Owen; Bradshaw, Corey J. A.; Austin, Chris M.

2012-01-01

DNA barcoding potentially offers scientists who are not expert taxonomists a powerful tool to support the accuracy of field studies involving taxa that are diverse and difficult to identify. The taxonomy of rays has received reasonable attention in Australia, although the fauna in remote locations such as Ningaloo Reef, Western Australia is poorly studied and the identification of some species in the field is problematic. Here, we report an application of DNA-barcoding to the identification of 16 species (from 10 genera) of tropical rays as part of an ecological study. Analysis of the dataset combined across all samples grouped sequences into clearly defined operational taxonomic units, with two conspicuous exceptions: the Neotrygon kuhlii species complex and the Aetobatus species complex. In the field, the group that presented the most difficulties for identification was the spotted whiptail rays, referred to as the ‘uarnak’ complex. Two sets of problems limited the successful application of DNA barcoding: (1) the presence of cryptic species, species complexes with unresolved taxonomic status and intra-specific geographical variation, and (2) insufficient numbers of entries in online databases that have been verified taxonomically, and the presence of lodged sequences in databases with inconsistent names. Nevertheless, we demonstrate the potential of the DNA barcoding approach to confirm field identifications and to highlight species complexes where taxonomic uncertainty might confound ecological data. PMID:22701556
Morchella MLST database

USDA-ARS's Scientific Manuscript database

Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...
Characteristics of Populations of the Russian Federation over the Panel of Fifteen Loci Used for DNA Identification and in Forensic Medical Examination

PubMed Central

Stepanov, V.A.; Balanovsky, O.P.; Melnikov, A.V.; Lash-Zavada, A.Yu.; Khar’kov, V.N.; Tyazhelova, T.V.; Akhmetova, V.L.; Zhukova, O.V.; Shneider, Yu.V.; Shil’nikova, I.N.; Borinskaya, S.A.; Marusin, A.V.; Spiridonova, M.G.; Simonova, K.V.; Khitrinskaya, I.Yu.; Radzhabov, M.O.; Romanov, A.G.; Shtygasheva, O.V.; Koshel’, S.M.; Balanovskaya, E.V.; Rybakova, A.V.; Khusnutdinova, E.K.; Puzyrev, V.P.; Yankovsky, N.K.

2011-01-01

Seventeen population groups within the Russian Federation were characterized for the first time using a panel of 15 genetic markers that are used for DNA identification and in forensic medical examinations. The degree of polymorphism and population diversity of microsatellite loci within the PowerPlex system (Promega) in Russian populations; the distribution of alleles and genotypes within the populations of six cities and 11 ethnic groups of the Russian Federation; the levels of intra- and interpopulation genetic differentiation; genetic relations between populations; and the identification and forensic medical characteristics of the system of markers under study were determined. Significant differences were revealed between the Russian populations and the U.S. reference base that was recently used in forensic medical examinations in the RF. A database of the allelic frequencies of 15 microsatellite loci that are used for DNA identification and forensic medical examination was created; the database has the potential of becoming the reference for performing forensic medical examinations in Russia. The spatial organization of genetic diversity over the panel of the STR markers that are used for DNA identification was revealed. It reflects the general patterns of geographical clustering of human populations seen across various types of genetic markers. The necessity to take into account a population’s genetic structure during forensic medical examinations and DNA identification of criminal suspects was substantiated. PMID:22649684
gene GIS: Computational Tools for Spatial Analyses of DNA Profiles with Associated Photo-Identification and Telemetry Records of Marine Mammals

DTIC Science & Technology

2011-09-30

DNA profiles. Referred to as geneGIS, the program will provide the ability to display, browse, select, filter and summarize spatial or temporal...of the SPLASH photo-identification records and available DNA profiles is underway through integration and crosschecking by Cascadia and MMI . An...Darwin Core standards where possible and can accommodate the current databases developed for telemetry data at MMI and SPLASH collection records at
Forensic DNA Profiling and Database

PubMed Central

Panneerchelvam, S.; Norazmi, M.N.

2003-01-01

The incredible power of DNA technology as an identification tool has brought a tremendous change in criminal justice. A DNA database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. This article discusses the essential steps in the compilation of the Combined DNA Index System (CODIS) on validated polymerase chain reaction-amplified STRs and their use in crime detection. PMID:23386793
Genetic identification of missing persons: DNA analysis of human remains and compromised samples.

PubMed

Alvarez-Cubero, M J; Saiz, M; Martinez-Gonzalez, L J; Alvarez, J C; Eisenberg, A J; Budowle, B; Lorente, J A

2012-01-01

Human identification has made great strides over the past 2 decades due to the advent of DNA typing. Forensic DNA typing provides genetic data from a variety of materials and individuals, and is applied to many important issues that confront society. Part of the success of DNA typing is the generation of DNA databases to help identify missing persons and to develop investigative leads to assist law enforcement. DNA databases house DNA profiles from convicted felons (and in some jurisdictions arrestees), forensic evidence, human remains, and direct and family reference samples of missing persons. These databases are essential tools, which are becoming quite large (for example the US Database contains 10 million profiles). The scientific, governmental and private communities continue to work together to standardize genetic markers for more effective worldwide data sharing, to develop and validate robust DNA typing kits that contain the reagents necessary to type core identity genetic markers, to develop technologies that facilitate a number of analytical processes and to develop policies to make human identity testing more effective. Indeed, DNA typing is integral to resolving a number of serious criminal and civil concerns, such as solving missing person cases and identifying victims of mass disasters and children who may have been victims of human trafficking, and provides information for historical studies. As more refined capabilities are still required, novel approaches are being sought, such as genetic testing by next-generation sequencing, mass spectrometry, chip arrays and pyrosequencing. Single nucleotide polymorphisms offer the potential to analyze severely compromised biological samples, to determine the facial phenotype of decomposed human remains and to predict the bioancestry of individuals, a new focus in analyzing this type of markers. Copyright © 2012 S. Karger AG, Basel.
DNA preservation in skeletal elements from the World Trade Center disaster: recommendations for mass fatality management.

PubMed

Mundorff, Amy Z; Bartelink, Eric J; Mar-Cash, Elaine

2009-07-01

The World Trade Center (WTC) victim identification effort highlights taphonomic influences on the degradation of DNA from victims of mass fatality incidents. This study uses a subset of the WTC-Human Remains Database to evaluate differential preservation of DNA by skeletal element. Recovery location, sex, and victim type (civilian, firefighter, or plane passenger) do not appear to influence DNA preservation. Results indicate that more intact elements, as well as elements encased in soft tissue, produced slightly higher identification rates than more fragmented remains. DNA identification rates by element type conform to previous findings, with higher rates generally found in denser, weight-bearing bones. However, smaller bones including patellae, metatarsals, and foot phalanges yielded rates comparable to both femora and tibiae. These elements can be easily sampled with a disposable scalpel, and thus reduce potential DNA contamination. These findings have implications for DNA sampling guidelines in future mass fatality incidents.
Advances in DNA metabarcoding for food and wildlife forensic species identification.

PubMed

Staats, Martijn; Arulandhu, Alfred J; Gravendeel, Barbara; Holst-Jensen, Arne; Scholtens, Ingrid; Peelen, Tamara; Prins, Theo W; Kok, Esther

2016-07-01

Species identification using DNA barcodes has been widely adopted by forensic scientists as an effective molecular tool for tracking adulterations in food and for analysing samples from alleged wildlife crime incidents. DNA barcoding is an approach that involves sequencing of short DNA sequences from standardized regions and comparison to a reference database as a molecular diagnostic tool in species identification. In recent years, remarkable progress has been made towards developing DNA metabarcoding strategies, which involves next-generation sequencing of DNA barcodes for the simultaneous detection of multiple species in complex samples. Metabarcoding strategies can be used in processed materials containing highly degraded DNA e.g. for the identification of endangered and hazardous species in traditional medicine. This review aims to provide insight into advances of plant and animal DNA barcoding and highlights current practices and recent developments for DNA metabarcoding of food and wildlife forensic samples from a practical point of view. Special emphasis is placed on new developments for identifying species listed in the Convention on International Trade of Endangered Species (CITES) appendices for which reliable methods for species identification may signal and/or prevent illegal trade. Current technological developments and challenges of DNA metabarcoding for forensic scientists will be assessed in the light of stakeholders' needs.
The construction of an EST database for Bombyx mori and its application

PubMed Central

Mita, Kazuei; Morimyo, Mitsuoki; Okano, Kazuhiro; Koike, Yoshiko; Nohata, Junko; Kawasaki, Hideki; Kadono-Okuda, Keiko; Yamamoto, Kimiko; Suzuki, Masataka G.; Shimada, Toru; Goldsmith, Marian R.; Maeda, Susumu

2003-01-01

To build a foundation for the complete genome analysis of Bombyx mori, we have constructed an EST database. Because gene expression patterns deeply depend on tissues as well as developmental stages, we analyzed many cDNA libraries prepared from various tissues and different developmental stages to cover the entire set of Bombyx genes. So far, the Bombyx EST database contains 35,000 ESTs from 36 cDNA libraries, which are grouped into ≈11,000 nonredundant ESTs with the average length of 1.25 kb. The comparison with FlyBase suggests that the present EST database, SilkBase, covers >55% of all genes of Bombyx. The fraction of library-specific ESTs in each cDNA library indicates that we have not yet reached saturation, showing the validity of our strategy for constructing an EST database to cover all genes. To tackle the coming saturation problem, we have checked two methods, subtraction and normalization, to increase coverage and decrease the number of housekeeping genes, resulting in a 5–11% increase of library-specific ESTs. The identification of a number of genes and comprehensive cloning of gene families have already emerged from the SilkBase search. Direct links of SilkBase with FlyBase and WormBase provide ready identification of candidate Lepidoptera-specific genes. PMID:14614147
Mitochondrial DNA control region sequences from Nairobi (Kenya): inferring phylogenetic parameters for the establishment of a forensic database.

PubMed

Brandstätter, Anita; Peterson, Christine T; Irwin, Jodi A; Mpoke, Solomon; Koech, Davy K; Parson, Walther; Parsons, Thomas J

2004-10-01

Large forensic mtDNA databases which adhere to strict guidelines for generation and maintenance, are not available for many populations outside of the United States and western Europe. We have established a high quality mtDNA control region sequence database for urban Nairobi as both a reference database for forensic investigations, and as a tool to examine the genetic variation of Kenyan sequences in the context of known African variation. The Nairobi sequences exhibited high variation and a low random match probability, indicating utility for forensic testing. Haplogroup identification and frequencies were compared with those reported from other published studies on African, or African-origin populations from Mozambique, Sierra Leone, and the United States, and suggest significant differences in the mtDNA compositions of the various populations. The quality of the sequence data in our study was investigated and supported using phylogenetic measures. Our data demonstrate the diversity and distinctiveness of African populations, and underline the importance of establishing additional forensic mtDNA databases of indigenous African populations.
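The "random match probability" cited in this abstract is commonly estimated as the sum of squared haplotype frequencies in the database. The abstract does not state which estimator the authors used, so the following is only a sketch under that assumption; the haplotype labels are invented.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies: the chance that two sequences
    drawn at random (with replacement) from the sample are identical."""
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((count / n) ** 2 for count in counts.values())


# Hypothetical sample: four individuals, three distinct haplotypes.
sample = ["H1", "H1", "H2", "H3"]
print(random_match_probability(sample))  # 0.5**2 + 0.25**2 + 0.25**2 = 0.375
```

The more evenly the sample is spread over many distinct haplotypes, the lower this value becomes, which is why high sequence variation and a low random match probability go together in the abstract.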
DNA typing in forensic medicine and in criminal investigations: a current survey.

PubMed

Benecke, M

1997-05-01

Since 1985 DNA typing of biological material has become one of the most powerful tools for personal identification in forensic medicine and in criminal investigations [1-6]. Classical DNA "fingerprinting" is increasingly being replaced by polymerase chain reaction (PCR) based technology which detects very short polymorphic stretches of DNA [7-15]. DNA loci which forensic scientists study do not code for proteins, and they are spread over the whole genome [16, 17]. These loci are neutral, and few provide any information about individuals except for their identity. Minute amounts of biological material are sufficient for DNA typing. Many European countries are beginning to establish databases to store DNA profiles of crime scenes and known offenders. A brief overview is given of past and present DNA typing and the establishment of forensic DNA databases in Europe.
DNA typing in forensic medicine and in criminal investigations: a current survey

NASA Astrophysics Data System (ADS)

Benecke, Mark

Since 1985 DNA typing of biological material has become one of the most powerful tools for personal identification in forensic medicine and in criminal investigations [1-6]. Classical DNA "fingerprinting" is increasingly being replaced by polymerase chain reaction (PCR) based technology which detects very short polymorphic stretches of DNA [7-15]. DNA loci which forensic scientists study do not code for proteins, and they are spread over the whole genome [16, 17]. These loci are neutral, and few provide any information about individuals except for their identity. Minute amounts of biological material are sufficient for DNA typing. Many European countries are beginning to establish databases to store DNA profiles of crime scenes and known offenders. A brief overview is given of past and present DNA typing and the establishment of forensic DNA databases in Europe.
DNA Barcoding for Identification of ‘Candidatus Phytoplasmas’ Using a Fragment of the Elongation Factor Tu Gene

PubMed Central

Makarova, Olga; Contaldo, Nicoletta; Paltrinieri, Samanta; Kawube, Geofrey; Bertaccini, Assunta; Nicolaisen, Mogens

2012-01-01

Background: Phytoplasmas are bacterial phytopathogens responsible for significant losses in agricultural production worldwide. Several molecular markers are available for identification of groups or strains of phytoplasmas. However, they often cannot be used for identification of phytoplasmas from different groups simultaneously or are too long for routine diagnostics. DNA barcoding recently emerged as a convenient tool for species identification. Here, the development of a universal DNA barcode based on the elongation factor Tu (tuf) gene for phytoplasma identification is reported. Methodology/Principal Findings: We designed a new set of primers and amplified a 420–444 bp fragment of tuf from all 91 phytoplasma strains tested (16S rRNA groups -I through -VII, -IX through -XII, -XV, and -XX). Comparison of NJ trees constructed from the tuf barcode and a 1.2 kbp fragment of the 16S ribosomal gene revealed that the tuf tree is highly congruent with the 16S rRNA tree and had higher inter- and intra-group sequence divergence. Mean K2P inter-/intra-group divergences of the tuf barcode did not overlap and had approximately one order of magnitude difference for most groups, suggesting the presence of a DNA barcoding gap. The use of the tuf barcode allowed separation of main ribosomal groups and most of their subgroups. Phytoplasma tuf barcodes were deposited in the NCBI GenBank and Q-bank databases. Conclusions/Significance: This study demonstrates that DNA barcoding principles can be applied for identification of phytoplasmas. Our findings suggest that the tuf barcode performs as well as or better than a 1.2 kbp fragment of the 16S rRNA gene and thus provides an easy procedure for phytoplasma identification. The obtained sequences were used to create a publicly available reference database that can be used by plant health services and researchers for online phytoplasma identification. PMID:23272216
Insight into Identification of Acinetobacter Species by Matrix-Assisted Laser Desorption/Ionization Time of Flight Mass Spectrometry (MALDI-TOF MS) in the Clinical Laboratory

NASA Astrophysics Data System (ADS)

Li, Xiuyuan; Tang, Yanyan; Lu, Xinxin

2018-04-01

Currently, the capability of identification for Acinetobacter species using MALDI-TOF MS still remains unclear in clinical laboratories due to certain elusory phenomena. Thus, we conducted this research to evaluate this technique and reveal the causes of misidentification. Briefly, a total of 788 Acinetobacter strains were collected and confirmed at the species level by 16S rDNA and rpoB sequencing, and subsequently compared to the identification by MALDI-TOF MS using direct smear and bacterial extraction pretreatments. Cluster analysis was performed based on the mass spectra and 16S rDNA to reflect the diversity among different species. Eventually, 19 Acinetobacter species were confirmed, including 6 species unavailable in Biotyper 3.0 database. Another novel species was observed, temporarily named A. corallinus. The accuracy of identification for Acinetobacter species using MALDI-TOF MS was 97.08% (765/788), regardless of which pretreatment was applied. The misidentification only occurred on 3 A. parvus strains and 20 strains of species unavailable in the database. The proportions of strains with identification score ≥ 2.000 using direct smear and bacterial extraction pretreatments were 86.04% (678/788) and 95.43% (752/788), χ² = 41.336, P < 0.001. The species similar in 16S rDNA were discriminative from the mass spectra, such as A. baumannii & A. junii, A. pittii & A. calcoaceticus, and A. nosocomialis & A. seifertii. Therefore, using MALDI-TOF MS to identify Acinetobacter strains isolated from clinical samples was deemed reliable. Misidentification occurred occasionally due to the insufficiency of the database rather than sample extraction failure. We suggest gene sequencing should be performed when the identification score is under 2.000 even when using bacterial extraction pretreatment.
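The figures quoted in this abstract can be checked with a few lines of arithmetic. The abstract does not name the test behind its statistic, so the sketch below assumes an uncorrected Pearson chi-square on the 2×2 table of strains scoring ≥ 2.000 versus < 2.000 under each pretreatment. Under that assumption the reported value is reproduced. The function and variable names are illustrative only.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


total = 788               # strains tested with each pretreatment
direct_smear_high = 678   # score >= 2.000, direct smear
extraction_high = 752     # score >= 2.000, bacterial extraction

chi2 = pearson_chi_square_2x2(
    direct_smear_high, total - direct_smear_high,
    extraction_high, total - extraction_high,
)
accuracy = 765 / total

print(f"accuracy   = {accuracy:.2%}")   # 97.08%
print(f"chi-square = {chi2:.3f}")       # 41.336
```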
Insight into Identification of Acinetobacter Species by Matrix-Assisted Laser Desorption/Ionization Time of Flight Mass Spectrometry (MALDI-TOF MS) in the Clinical Laboratory.

PubMed

Li, Xiuyuan; Tang, Yanyan; Lu, Xinxin

2018-04-09

Currently, the capability of identification for Acinetobacter species using MALDI-TOF MS still remains unclear in clinical laboratories due to certain elusory phenomena. Thus, we conducted this research to evaluate this technique and reveal the causes of misidentification. Briefly, a total of 788 Acinetobacter strains were collected and confirmed at the species level by 16S rDNA and rpoB sequencing, and subsequently compared to the identification by MALDI-TOF MS using direct smear and bacterial extraction pretreatments. Cluster analysis was performed based on the mass spectra and 16S rDNA to reflect the diversity among different species. Eventually, 19 Acinetobacter species were confirmed, including 6 species unavailable in Biotyper 3.0 database. Another novel species was observed, temporarily named A. corallinus. The accuracy of identification for Acinetobacter species using MALDI-TOF MS was 97.08% (765/788), regardless of which pretreatment was applied. The misidentification only occurred on 3 A. parvus strains and 20 strains of species unavailable in the database. The proportions of strains with identification score ≥ 2.000 using direct smear and bacterial extraction pretreatments were 86.04% (678/788) and 95.43% (752/788), χ² = 41.336, P < 0.001. The species similar in 16S rDNA were discriminative from the mass spectra, such as A. baumannii & A. junii, A. pittii & A. calcoaceticus, and A. nosocomialis & A. seifertii. Therefore, using MALDI-TOF MS to identify Acinetobacter strains isolated from clinical samples was deemed reliable. Misidentification occurred occasionally due to the insufficiency of the database rather than sample extraction failure. We suggest gene sequencing should be performed when the identification score is under 2.000 even when using bacterial extraction pretreatment.
[Analysis of genetic diversity of Russian regional populations based on common STR markers used in DNA identification].

PubMed

Pesik, V Yu; Fedunin, A A; Agdzhoyan, A T; Utevska, O M; Chukhraeva, M I; Evseeva, I V; Churnosov, M I; Lependina, I N; Bogunov, Yu V; Bogunova, A A; Ignashkin, M A; Yankovsky, N K; Balanovska, E V; Orekhov, V A; Balanovsky, O P

2014-06-01

We conducted the first genetic analysis of a wide range of rural Russian populations in European Russia with a panel of DNA markers commonly used in forensic genetic identification. We examined a total of 647 samples from indigenous ethnic Russian populations in Arkhangelsk, Belgorod, Voronezh, Kursk, Rostov, Ryazan, and Orel regions. We employed a multiplex genotyping kit, COrDIS Plus, to genotype Short Tandem Repeat (STR) loci, which included the genetic marker panel officially recommended for DNA identification in the Russian Federation, the United States, and the European Union. In the course of our study, we created a database of allelic frequencies, examined the distribution of alleles and genotypes in seven rural Russian populations, and defined the genetic relationships between these populations. We found that, although multidimensional analysis indicated a difference between the Northern gene pool and the rest of the Russian European populations, a pairwise comparison using 19 STR markers among all populations did not reveal significant differences. This is in concordance with previous studies, which examined up to 12 STR markers of urban Russian populations. Therefore, the database of allelic frequencies created in this study can be applied for forensic examinations and DNA identification among the ethnic Russian population over European Russia. We also noted a decrease in the levels of heterozygosity in the northern Russian population compared to ethnic populations in southern and central Russia, which is consistent with trends identified previously using classical gene markers and analysis of mitochondrial DNA.
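To make the quantities in this abstract concrete, the sketch below shows how allele frequencies and heterozygosity might be tabulated for a single STR locus. It is a textbook-style illustration, not the authors' workflow. The genotypes are invented, and expected heterozygosity is taken as one minus the sum of squared allele frequencies, without any small-sample correction.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Relative frequency of each allele across diploid genotypes at one locus."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(alleles)
    return {allele: count / len(alleles) for allele, count in counts.items()}


def expected_heterozygosity(frequencies):
    """1 - sum(p_i^2): probability that two randomly drawn alleles differ."""
    return 1 - sum(p * p for p in frequencies.values())


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical repeat-number genotypes for four individuals at one locus.
genotypes = [(12, 13), (12, 12), (13, 14), (12, 14)]
freqs = allele_frequencies(genotypes)
print(freqs)                               # {12: 0.5, 13: 0.25, 14: 0.25}
print(expected_heterozygosity(freqs))      # 0.625
print(observed_heterozygosity(genotypes))  # 0.75
```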

CRITICA: coding region identification tool invoking comparative analysis

NASA Technical Reports Server (NTRS)

Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

1999-01-01

Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).
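The noncomparative signal described in this abstract rests on counting hexanucleotides (dicodons) in a given reading frame. The sketch below shows only that counting step under simple assumptions. It is not CRITICA's code, and the function name and test sequence are made up for illustration.

```python
from collections import Counter


def dicodon_counts(sequence, frame=0):
    """Count hexanucleotides read codon by codon (step of 3) in one reading frame."""
    seq = sequence.upper()
    counts = Counter()
    # Each window spans two adjacent codons; consecutive windows overlap by one codon.
    for start in range(frame, len(seq) - 5, 3):
        counts[seq[start:start + 6]] += 1
    return counts


# Hypothetical 12-nt fragment: codons ATG GCC ATG GCC.
print(dicodon_counts("ATGGCCATGGCC"))
```

A coding-potential score could then compare such in-frame counts with counts from other contexts, as the abstract describes, but how CRITICA weights and iterates those counts is not specified here.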
Identification of Neoceratitis asiatica (Becker) (Diptera: Tephritidae) based on morphological characteristics and DNA barcode.

PubMed

Guo, Shaokun; He, Jia; Zhao, Zihua; Liu, Lijun; Gao, Liyuan; Wei, Shuhua; Guo, Xiaoyu; Zhang, Rong; Li, Zhihong

2017-12-12

Neoceratitis asiatica (Becker), which especially infests wolfberry (Lycium barbarum L.), can cause serious economic losses every year in China, especially to organic wolfberry production. In some important wolfberry plantings, it is difficult and time-consuming to rear the larvae or pupae to adults for morphological identification. Molecular identification based on DNA barcodes is a solution to the problem. In this study, 15 samples were collected from Ningxia, China. Among them, five adults were identified according to their morphological characteristics. The utility of the mitochondrial DNA (mtDNA) cytochrome c oxidase I (COI) gene sequence as a DNA barcode for distinguishing N. asiatica was evaluated by analysing Kimura 2-parameter distances and phylogenetic trees. There were significant differences between intra-specific and inter-specific genetic distances according to the barcoding gap analysis. The uncertain larval and pupal samples were within the same cluster as N. asiatica adults and formed a sister cluster to N. cyanescens. A combination of morphological and molecular methods enabled accurate identification of N. asiatica. This is the first study using DNA barcodes to identify N. asiatica, and the obtained DNA sequences will be added to the DNA barcode database.
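For readers unfamiliar with the Kimura 2-parameter (K2P) distance used in this abstract, the standard formula is d = -½ ln[(1 - 2P - Q)·√(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion. The sketch below applies it to two pre-aligned sequences. It is a minimal illustration with made-up sequences, and it simply skips gaps and ambiguity codes, which is an assumption rather than the authors' stated handling.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical 10-nt alignment differing by a single transition (A -> G).
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

A barcoding gap analysis then compares such distances within species against those between species.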
DNA-barcoding of forensically important blow flies (Diptera: Calliphoridae) in the Caribbean Region

PubMed Central

Agnarsson, Ingi

2017-01-01

Correct identification of forensically important insects, such as flies in the family Calliphoridae, is a crucial step for them to be used as evidence in legal investigations. Traditional identification based on morphology has been effective, but has some limitations when it comes to identifying immature stages of certain species. DNA-barcoding, using COI, has demonstrated potential for rapid and accurate identification of Calliphoridae, however, this gene does not reliably distinguish among some recently diverged species, raising questions about its use for delimitation of species of forensic importance. To facilitate DNA based identification of Calliphoridae in the Caribbean we developed a vouchered reference collection from across the region, and a DNA sequence database, and further added the nuclear ITS2 as a second marker to increase accuracy of identification through barcoding. We morphologically identified freshly collected specimens, did phylogenetic analyses and employed several species delimitation methods for a total of 468 individuals representing 19 described species. Our results show that combination of COI + ITS2 genes yields more accurate identification and diagnoses, and better agreement with morphological data, than the mitochondrial barcodes alone. All of our results from independent and concatenated trees and most of the species delimitation methods yield considerably higher diversity estimates than the distance based approach and morphology. Molecular data support at least 24 distinct clades within Calliphoridae in this study, recovering substantial geographic variation for Lucilia eximia, Lucilia retroversa, Lucilia rica and Chloroprocta idioidea, probably indicating several cryptic species. 
In sum, our study demonstrates the importance of employing a second nuclear marker for barcoding analyses and species delimitation of calliphorids, and the power of molecular data in combination with a complete reference database to enable identification of taxonomically and geographically diverse insects of forensic importance. PMID:28761780
Harnessing mtDNA variation to resolve ambiguity in ‘Redfish’ sold in Europe

PubMed Central

Moore, Lauren; Pampoulie, Christophe; Di Muri, Cristina; Vandamme, Sara; Mariani, Stefano

2017-01-01

Morphology-based identification of North Atlantic Sebastes has long been controversial and misidentification may produce misleading data, with cascading consequences that negatively affect fisheries management and seafood labelling. North Atlantic Sebastes comprises four species, commonly known as ‘redfish’, but little is known about the number, identity and labelling accuracy of redfish species sold across Europe. We used a molecular approach to identify redfish species from ‘blind’ specimens to evaluate the performance of the Barcode of Life (BOLD) and GenBank databases, as well as carrying out a market product accuracy survey from retailers across Europe. The conventional BOLD approach proved ambiguous, and phylogenetic analysis based on mtDNA control region sequences provided a higher resolution for species identification. By sampling market products from four countries, we found the presence of two species of redfish (S. norvegicus and S. mentella) and one unidentified Pacific rockfish marketed in Europe. Furthermore, public databases revealed the existence of inaccurate reference sequences, likely stemming from species misidentification from previous studies, which currently hinders the efficacy of DNA methods for the identification of Sebastes market samples. PMID:29018597
The effect of wild card designations and rare alleles in forensic DNA database searches.

PubMed

Tvedebrink, Torben; Bright, Jo-Anne; Buckleton, John S; Curran, James M; Morling, Niels

2015-05-01

Forensic DNA databases are powerful tools used for the identification of persons of interest in criminal investigations. Typically, they consist of two parts: (1) a database containing DNA profiles of known individuals and (2) a database of DNA profiles associated with crime scenes. The risk of adventitious or chance matches between crimes and innocent people increases as the number of profiles within a database grows and more data is shared between various forensic DNA databases, e.g. from different jurisdictions. The DNA profiles obtained from crime scenes are often partial because crime samples may be compromised in quantity or quality. When an individual's profile cannot be resolved from a DNA mixture, ambiguity is introduced. A wild card, F, may be used in place of an allele that has dropped out or when an ambiguous profile is resolved from a DNA mixture. Variant alleles that do not correspond to any marker in the allelic ladder or appear above or below the extent of the allelic ladder range are assigned the allele designation R for rare allele. R alleles are position specific with respect to the observed/unambiguous allele. The F and R designations are made when the exact genotype has not been determined. The F and R designations are treated as wild cards for searching, which increases the chance of adventitious matches. We investigated the probability of adventitious matches given these two types of wild cards. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
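How wild cards widen a database search can be sketched as follows. This is a simplified illustration, not the matching rule of any real forensic database: the profile layout, the function names, and the locus and allele values are hypothetical, and R is treated here as a full wild card even though the abstract notes that R alleles are position specific.

```python
WILDCARDS = {"F", "R"}  # simplification: both match any allele

def alleles_compatible(x, y):
    """Two allele designations are compatible if equal or either is a wild card."""
    return x == y or x in WILDCARDS or y in WILDCARDS

def locus_matches(genotype_1, genotype_2):
    """Compare two unordered allele pairs, trying both possible pairings."""
    (a1, a2), (b1, b2) = genotype_1, genotype_2
    return (alleles_compatible(a1, b1) and alleles_compatible(a2, b2)) or (
        alleles_compatible(a1, b2) and alleles_compatible(a2, b1)
    )

def profiles_match(profile_1, profile_2):
    """Profiles match if every locus typed in both profiles is compatible."""
    shared = profile_1.keys() & profile_2.keys()
    return bool(shared) and all(
        locus_matches(profile_1[locus], profile_2[locus]) for locus in shared
    )

# Hypothetical profiles: the crime-scene sample has one dropped-out allele (F).
crime_scene = {"D3S1358": ("15", "F"), "vWA": ("17", "18")}
person_a = {"D3S1358": ("15", "16"), "vWA": ("18", "17")}
person_b = {"D3S1358": ("15", "17"), "vWA": ("17", "18")}
person_c = {"D3S1358": ("14", "16"), "vWA": ("17", "18")}
```

In this toy example both person_a and person_b are compatible with the crime-scene profile because of the single F, which is the mechanism behind the increased chance of adventitious matches.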
DNA Commission of the International Society for Forensic Genetics: revised and extended guidelines for mitochondrial DNA typing.

PubMed

Parson, W; Gusmão, L; Hares, D R; Irwin, J A; Mayr, W R; Morling, N; Pokorak, E; Prinz, M; Salas, A; Schneider, P M; Parsons, T J

2014-11-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) regularly publishes guidelines and recommendations concerning the application of DNA polymorphisms to the question of human identification. Previous recommendations published in 2000 addressed the analysis and interpretation of mitochondrial DNA (mtDNA) in forensic casework. While the foundations set forth in the earlier recommendations still apply, new approaches to the quality control, alignment and nomenclature of mitochondrial sequences, as well as the establishment of mtDNA reference population databases, have been developed. Here, we describe these developments and discuss their application to both mtDNA casework and mtDNA reference population databasing applications. While the generation of mtDNA for forensic casework has always been guided by specific standards, it is now well-established that data of the same quality are required for the mtDNA reference population data used to assess the statistical weight of the evidence. As a result, we introduce guidelines regarding sequence generation, as well as quality control measures based on the known worldwide mtDNA phylogeny, that can be applied to ensure the highest quality population data possible. For both casework and reference population databasing applications, the alignment and nomenclature of haplotypes is revised here and the phylogenetic alignment proffered as acceptable standard. In addition, the interpretation of heteroplasmy in the forensic context is updated, and the utility of alignment-free database searches for unbiased probability estimates is highlighted. Finally, we discuss statistical issues and define minimal standards for mtDNA database searches. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Identification of Belgian mosquito species (Diptera: Culicidae) by DNA barcoding.

PubMed

Versteirt, V; Nagy, Z T; Roelants, P; Denis, L; Breman, F C; Damiens, D; Dekoninck, W; Backeljau, T; Coosemans, M; Van Bortel, W

2015-03-01

Since its introduction in 2003, DNA barcoding has proven to be a promising method for the identification of many taxa, including mosquitoes (Diptera: Culicidae). Many mosquito species are potential vectors of pathogens, and correct identification in all life stages is essential for effective mosquito monitoring and control. To use DNA barcoding for species identification, a reliable and comprehensive reference database of verified DNA sequences is required. Hence, DNA sequence diversity of mosquitoes in Belgium was assessed using a 658 bp fragment of the mitochondrial cytochrome oxidase I (COI) gene, and a reference data set was established. Most species appeared as well-supported clusters. Intraspecific Kimura 2-parameter (K2P) distances averaged 0.7%, and the maximum observed K2P distance was 6.2% for Aedes koreicus. A small overlap between intra- and interspecific K2P distances for congeneric sequences was observed. Overall, the identification success using best match and the best close match criteria were high, that is above 98%. No clear genetic division was found between the closely related species Aedes annulipes and Aedes cantans, which can be confused using morphological identification only. The members of the Anopheles maculipennis complex, that is Anopheles maculipennis s.s. and An. messeae, were weakly supported as monophyletic taxa. This study showed that DNA barcoding offers a reliable framework for mosquito species identification in Belgium except for some closely related species. © 2014 John Wiley & Sons Ltd.
Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants.

PubMed

Tanabe, Akifumi S; Toju, Hirokazu

2013-01-01

Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research.
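The 1-nearest-neighbor (1-NN) assignment described in this abstract can be sketched in a few lines. This is a toy stand-in, not the authors' software: it scores similarity as simple per-position identity between equal-length sequences instead of running BLAST, and the function names and example sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of positions at which two sequences carry the same base."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))

def one_nn(query, references):
    """Assign the taxon of the most similar reference sequence to the query.

    `references` is a list of (taxon, sequence) pairs. The method always
    returns some taxon, even when the true species is absent from the
    reference set, which is the failure mode the benchmark highlights.
    """
    best_taxon, best_seq = max(references, key=lambda ref: identity(query, ref[1]))
    return best_taxon, identity(query, best_seq)

# Made-up reference set for demonstration only.
references = [("Species A", "ACGTACGT"), ("Species B", "ACGTTTTT")]
```

Calling one_nn("TTTTTTTT", references) still returns one of the two listed species, which shows why the approach misidentifies queries whose species is missing from the database.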
Two New Computational Methods for Universal DNA Barcoding: A Benchmark Using Barcode Sequences of Bacteria, Archaea, Animals, Fungi, and Land Plants

PubMed Central

Tanabe, Akifumi S.; Toju, Hirokazu

2013-01-01

Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research. PMID:24204702
Utilization of matrix-assisted laser desorption and ionization time-of-flight mass spectrometry for identification of infantile seborrheic dermatitis-causing Malassezia and incidence of culture-based cutaneous Malassezia microbiota of 1-month-old infants.

PubMed

Yamamoto, Mikachi; Umeda, Yoshiko; Yo, Ayaka; Yamaura, Mariko; Makimura, Koichi

2014-02-01

Matrix-assisted laser desorption and ionization time-of-flight mass spectrometry (MALDI-TOF-MS) has been utilized for identification of various microorganisms. Malassezia species, including Malassezia restricta, which is associated with seborrheic dermatitis, have been difficult to identify by traditional means. This study was performed to develop a system for identification of Malassezia species with MALDI-TOF-MS and to investigate the incidence and variety of cutaneous Malassezia microbiota of 1-month-old infants using this technique. A Malassezia species-specific MALDI-TOF-MS database was developed from eight standard strains, and the availability of this system was assessed using 54 clinical strains isolated from the skin of 1-month-old infants. Clinical isolates were cultured initially on CHROMagar Malassezia growth medium, and the 28S ribosomal DNA (D1/D2) sequence was analyzed for confirmatory identification. Using this database, we detected and analyzed Malassezia species in 68% and 44% of infants with and without infantile seborrheic dermatitis, respectively. The results of MALDI-TOF-MS analysis were consistent with those of rDNA sequencing identification (100% accuracy rate). To our knowledge, this is the first report of a MALDI-TOF-MS database for major skin pathogenic Malassezia species. This system is an easy, rapid and reliable method for identification of Malassezia. © 2014 Japanese Dermatological Association.
Diagnostics of Neisseriaceae and Moraxellaceae by Ribosomal DNA Sequencing: Ribosomal Differentiation of Medical Microorganisms

PubMed Central

Harmsen, Dag; Singer, Christian; Rothgänger, Jörg; Tønjum, Tone; Sybren de Hoog, Gerrit; Shah, Haroun; Albert, Jürgen; Frosch, Matthias

2001-01-01

Fast and reliable identification of microbial isolates is a fundamental goal of clinical microbiology. However, in the case of some fastidious gram-negative bacterial species, classical phenotype identification based on either metabolic, enzymatic, or serological methods is difficult, time-consuming, and/or inadequate. 16S or 23S ribosomal DNA (rDNA) bacterial sequencing will most often result in accurate speciation of isolates. Therefore, the objective of this study was to find a hypervariable rDNA stretch, flanked by strongly conserved regions, which is suitable for molecular species identification of members of the Neisseriaceae and Moraxellaceae. The inter- and intrageneric relationships were investigated using comparative sequence analysis of PCR-amplified partial 16S and 23S rDNAs from a total of 94 strains. When compared to the type species of the genera Acinetobacter, Moraxella, and Neisseria, an average of 30 polymorphic positions was observed within the partial 16S rDNA investigated (corresponding to Escherichia coli positions 54 to 510) for each species and an average of 11 polymorphic positions was observed within the 202 nucleotides of the 23S rDNA gene (positions 1400 to 1600). Neisseria macacae and Neisseria mucosa subsp. mucosa (ATCC 19696) had identical 16S and 23S rDNA sequences. Species clusters were heterogeneous in both genes in the case of Acinetobacter lwoffii, Moraxella lacunata, and N. mucosa. Neisseria meningitidis isolates failed to cluster only in the 23S rDNA subset. Our data showed that the 16S rDNA region is more suitable than the partial 23S rDNA for the molecular diagnosis of Neisseriaceae and Moraxellaceae and that a reference database should include more than one strain of each species. All sequence chromatograms and taxonomic and disease-related information are available as part of our ribosomal differentiation of medical microorganisms (RIDOM) web-based service (http://www.ridom.hygiene.uni-wuerzburg.de/). 
Users can submit a sequence and conduct a similarity search against the RIDOM reference database for microbial identification purposes. PMID:11230407
Toward a mtDNA locus-specific mutation database using the LOVD platform.

PubMed

Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert

2012-09-01

The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.
Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform

PubMed Central

Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert

2015-01-01

The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690
DNA Identification of Skeletal Remains from World War II Mass Graves Uncovered in Slovenia

PubMed Central

Marjanović, Damir; Durmić-Pašić, Adaleta; Bakal, Narcisa; Haverić, Sanin; Kalamujić, Belma; Kovačević, Lejla; Ramić, Jasmin; Pojskić, Naris; Škaro, Vedrana; Projić, Petar; Bajrović, Kasim; Hadžiselimović, Rifat; Drobnič, Katja; Huffine, Ed; Davoren, Jon; Primorac, Dragan

2007-01-01

Aim: To present the joint effort of three institutions in the identification of human remains from World War II found in two mass graves in the area of Škofja Loka, Slovenia. Methods: The remains of 27 individuals were found in two small and closely located mass graves. The DNA was isolated from bone and teeth samples using either standard phenol/chloroform alcohol extraction or an optimized Qiagen DNA extraction procedure. Some recovered samples required additional DNA purification methods, such as n-butanol treatment. The Quantifiler™ Human DNA Quantification Kit was used for DNA quantification. The PowerPlex 16 kit was used to simultaneously amplify 15 short tandem repeat (STR) loci. Matching probabilities were estimated using the DNA View program. Results: Out of all processed samples, 15 remains were fully profiled at all 15 STR loci. The other 12 profiles were partial. The least successful profile included 13 loci. Also, 69 reference samples (buccal swabs) from potential living relatives were collected and profiled. Comparison of the victims' profiles against the reference sample database resulted in 4 strong matches. In addition, 5 other profiles were matched to certain reference samples with lower probability. Conclusion: Our results show that more than 6 decades after the end of World War II, DNA analysis may significantly contribute to the identification of the remains from that period. Additional analysis of Y-STRs and mitochondrial DNA (mtDNA) markers will be performed in the second phase of the identification project. PMID:17696306
Complete DNA barcode reference library for a country's butterfly fauna reveals high performance for temperate Europe

PubMed Central

Dincă, Vlad; Zakharov, Evgeny V.; Hebert, Paul D. N.; Vila, Roger

2011-01-01

DNA barcoding aims to accelerate species identification and discovery, but performance tests have shown marked differences in identification success. As a consequence, there remains a great need for comprehensive studies which objectively test the method in groups with a solid taxonomic framework. This study focuses on the 180 species of butterflies in Romania, accounting for about one third of the European butterfly fauna. This country includes five eco-regions, the highest of any in the European Union, and is a good representative for temperate areas. Morphology and DNA barcodes of more than 1300 specimens were carefully studied and compared. Our results indicate that 90 per cent of the species form barcode clusters allowing their reliable identification. The remaining cases involve nine closely related species pairs, some whose taxonomic status is controversial or that hybridize regularly. Interestingly, DNA barcoding was found to be the most effective identification tool, outperforming external morphology, and being slightly better than male genitalia. Romania is now the first country to have a comprehensive DNA barcode reference database for butterflies. Similar barcoding efforts based on comprehensive sampling of specific geographical regions can act as functional modules that will foster the early application of DNA barcoding while a global system is under development. PMID:20702462
Current genetic methodologies in the identification of disaster victims and in forensic analysis.

PubMed

Ziętkiewicz, Ewa; Witt, Magdalena; Daca, Patrycja; Zebracka-Gala, Jadwiga; Goniewicz, Mariusz; Jarząb, Barbara; Witt, Michał

2012-02-01

This review presents the basic problems and currently available molecular techniques used for genetic profiling in disaster victim identification (DVI). The environmental conditions of a mass disaster often result in severe fragmentation, decomposition and intermixing of the remains of victims. In such cases, traditional identification based on the anthropological and physical characteristics of the victims is frequently inconclusive. This is the reason why DNA profiling became the gold standard for victim identification in mass-casualty incidents (MCIs) or any forensic cases where human remains are highly fragmented and/or degraded beyond recognition. The review provides general information about the sources of genetic material for DNA profiling, the genetic markers routinely used during genetic profiling (STR markers, mtDNA and single-nucleotide polymorphisms [SNP]) and the basic statistical approaches used in DNA-based disaster victim identification. Automated technological platforms that allow the simultaneous analysis of a multitude of genetic markers used in genetic identification (oligonucleotide microarray techniques and next-generation sequencing) are also presented. Forensic and population databases containing information on human variability, routinely used for statistical analyses, are discussed. The final part of this review is focused on recent developments, which offer particularly promising tools for forensic applications (mRNA analysis, transcriptome variation in individuals/populations and genetic profiling of specific cells separated from mixtures).
Average probability that a "cold hit" in a DNA database search results in an erroneous attribution.

PubMed

Song, Yun S; Patil, Anand; Murphy, Erin E; Slatkin, Montgomery

2009-01-01

We consider a hypothetical series of cases in which the DNA profile of a crime-scene sample is found to match a known profile in a DNA database (i.e., a "cold hit"), resulting in the identification of a suspect based only on genetic evidence. We show that the average probability that there is another person in the population whose profile matches the crime-scene sample but who is not in the database is approximately 2(N - d)p(A), where N is the number of individuals in the population, d is the number of profiles in the database, and p(A) is the average match probability (AMP) for the population. The AMP is estimated by computing the average of the probabilities that two individuals in the population have the same profile. We show further that if a priori each individual in the population is equally likely to have left the crime-scene sample, then the average probability that the database search attributes the crime-scene sample to a wrong person is (N - d)p(A).
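The two approximations stated in this abstract, 2(N - d)p(A) and (N - d)p(A), are simple enough to evaluate directly. The sketch below only restates them as functions; the function names and the example population size, database size, and average match probability are made-up illustrative values, not figures from the paper.

```python
def prob_another_match(population, database, amp):
    """Approximate average probability, 2(N - d)p(A), that someone outside
    the database also matches the crime-scene profile."""
    if not 0 <= database <= population:
        raise ValueError("database size must be between 0 and population size")
    return 2 * (population - database) * amp

def prob_wrong_attribution(population, database, amp):
    """Approximate average probability, (N - d)p(A), that the cold hit points
    to the wrong person, assuming equal prior probability for everyone."""
    if not 0 <= database <= population:
        raise ValueError("database size must be between 0 and population size")
    return (population - database) * amp

# Hypothetical inputs: N = 1,000,000 people, d = 100,000 profiles, p(A) = 1e-9.
another = prob_another_match(1_000_000, 100_000, 1e-9)
wrong = prob_wrong_attribution(1_000_000, 100_000, 1e-9)
```

Both quantities shrink as the database covers more of the population, since the factor (N - d) counts the people who are not in the database.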
Molecular detection of fungal pathogens in clinical specimens by 18S rDNA high-throughput screening in comparison to ITS PCR and culture.

PubMed

Wagner, K; Springer, B; Pires, V P; Keller, P M

2018-05-03

The rising incidence of invasive fungal infections and the expanding spectrum of fungal pathogens make early and accurate identification of the causative pathogen a daunting task. Diagnostics using molecular markers enable rapid identification of fungi, offer new insights into infectious disease dynamics, and open new possibilities for infectious disease control and prevention. We performed a retrospective study using clinical specimens (N = 233) from patients with suspected fungal infection previously subjected to culture and/or internal transcribed spacer (ITS) PCR. We used these specimens to evaluate a high-throughput screening method for fungal detection using automated DNA extraction (QIASymphony), fungal ribosomal small subunit (18S) rDNA RT-PCR and amplicon sequencing. Fungal sequences were compared with sequences from the curated, commercially available SmartGene IDNS database for pathogen identification. Concordance between 18S rDNA RT-PCR and culture results was 91%, and congruence between 18S rDNA RT-PCR and ITS PCR results was 94%. In addition, 18S rDNA RT-PCR and Sanger sequencing detected fungal pathogens in culture negative (N = 13) and ITS PCR negative specimens (N = 12) from patients with a clinically confirmed fungal infection. Our results support the use of the 18S rDNA RT-PCR diagnostic workflow for rapid and accurate identification of fungal pathogens in clinical specimens.
Avatar DNA Nanohybrid System in Chip-on-a-Phone

NASA Astrophysics Data System (ADS)

Park, Dae-Hwan; Han, Chang Jo; Shul, Yong-Gun; Choy, Jin-Ho

2014-05-01

Long admired for informational role and recognition function in multidisciplinary science, DNA nanohybrids have been emerging as ideal materials for molecular nanotechnology and genetic information code. Here, we designed an optical machine-readable DNA icon on microarray, Avatar DNA, for automatic identification and data capture such as Quick Response and ColorZip codes. Avatar icon is made of telepathic DNA-DNA hybrids inscribed on chips, which can be identified by camera of smartphone with application software. Information encoded in base-sequences can be accessed by connecting an off-line icon to an on-line web-server network to provide message, index, or URL from database library. Avatar DNA is then converged with nano-bio-info-cogno science: each building block stands for inorganic nanosheets, nucleotides, digits, and pixels. This convergence could address item-level identification that strengthens supply-chain security for drug counterfeits. It can, therefore, provide molecular-level vision through mobile network to coordinate and integrate data management channels for visual detection and recording.
Avatar DNA Nanohybrid System in Chip-on-a-Phone

PubMed Central

Park, Dae-Hwan; Han, Chang Jo; Shul, Yong-Gun; Choy, Jin-Ho

2014-01-01

Long admired for informational role and recognition function in multidisciplinary science, DNA nanohybrids have been emerging as ideal materials for molecular nanotechnology and genetic information code. Here, we designed an optical machine-readable DNA icon on microarray, Avatar DNA, for automatic identification and data capture such as Quick Response and ColorZip codes. Avatar icon is made of telepathic DNA-DNA hybrids inscribed on chips, which can be identified by camera of smartphone with application software. Information encoded in base-sequences can be accessed by connecting an off-line icon to an on-line web-server network to provide message, index, or URL from database library. Avatar DNA is then converged with nano-bio-info-cogno science: each building block stands for inorganic nanosheets, nucleotides, digits, and pixels. This convergence could address item-level identification that strengthens supply-chain security for drug counterfeits. It can, therefore, provide molecular-level vision through mobile network to coordinate and integrate data management channels for visual detection and recording. PMID:24824876

Integrative taxonomy detects cryptic and overlooked fish species in a neotropical river basin.

PubMed

Gomes, Laís Carvalho; Pessali, Tiago Casarim; Sales, Naiara Guimarães; Pompeu, Paulo Santos; Carvalho, Daniel Cardoso

2015-10-01

The great freshwater fish diversity found in the neotropical region makes management and conservation actions challenging. Due to shortage of taxonomists and insufficient infrastructure to deal with such great biodiversity (i.e. taxonomic impediment), proposed remedies to accelerate species identification and descriptions include techniques that combine DNA-based identification and concise morphological description. The building of a DNA barcode reference database correlating meristic and genetic data was developed for 75 % of the Mucuri River basin's freshwater fish. We obtained a total of 141 DNA barcode sequences from 37 species belonging to 30 genera, 19 families, and 5 orders. Genetic distances within species, genera, and families were 0.74, 9.5, and 18.86 %, respectively. All species could be clearly identified by the DNA barcodes. Divergences between meristic morphological characteristics and DNA barcodes revealed two cryptic species among the Cyphocharax gilbert and Astyanax gr. bimaculatus specimens, and helped to identify two overlooked species within the Gymnotus and Astyanax taxa. Therefore, using a simplified model of neotropical biodiversity, we tested the efficiency of an integrative taxonomy approach for species discovery, identification of cryptic diversity, and accelerating biodiversity descriptions.

VIP Barcoding: composition vector-based software for rapid species identification based on DNA barcoding.

PubMed

Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou

2014-07-01

Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
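
The two-stage idea in this abstract (alignment-free screening, then an alignment-based K2P nearest-neighbour search on the shortlist) can be sketched as follows. This is a minimal illustration, not the VIP Barcoding implementation: plain k-mer frequency vectors with cosine similarity stand in for the composition vector method, the references are assumed to be pre-aligned to the query, and the names `kmer_vector`, `k2p_distance`, `identify`, `k`, and `shortlist` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def kmer_vector(seq, k=3):
    """Return the k-mer frequencies of seq in a fixed k-mer order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values()) or 1
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned sequences."""
    # Compare only sites where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return math.inf
    n = len(pairs)
    p = sum((x, y) in TRANSITIONS for x, y in pairs) / n  # transitions
    q = sum(x != y and (x, y) not in TRANSITIONS for x, y in pairs) / n  # transversions
    w1, w2 = 1 - 2 * p - q, 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        return math.inf  # sequences too divergent for the K2P correction
    return max(0.0, -0.5 * math.log(w1 * math.sqrt(w2)))


def identify(query, references, k=3, shortlist=2):
    """Stage 1: keep the `shortlist` references most similar in k-mer content.
    Stage 2: return the shortlisted reference with the smallest K2P distance."""
    qv = kmer_vector(query.replace("-", ""), k)
    ranked = sorted(
        references,
        key=lambda name: cosine(qv, kmer_vector(references[name].replace("-", ""), k)),
        reverse=True,
    )
    candidates = ranked[:shortlist]
    best = min(candidates, key=lambda name: k2p_distance(query, references[name]))
    return best, k2p_distance(query, references[best])
```

The screening stage only has to be cheap and roughly right, because the more expensive distance calculation is then limited to the shortlist; that is the trade-off the abstract describes between scalability and accuracy.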

OCaPPI-Db: an oligonucleotide probe database for pathogen identification through hybridization capture.

PubMed

Gasc, Cyrielle; Constantin, Antony; Jaziri, Faouzi; Peyret, Pierre

2017-01-01

The detection and identification of bacterial pathogens involved in acts of bio- and agroterrorism are essential to avoid pathogen dispersal in the environment and propagation within the population. Conventional molecular methods, such as PCR amplification, DNA microarrays or shotgun sequencing, are subject to various limitations when assessing environmental samples, which can lead to inaccurate findings. We developed a hybridization capture strategy that uses a set of oligonucleotide probes to target and enrich biomarkers of interest in environmental samples. Here, we present Oligonucleotide Capture Probes for Pathogen Identification Database (OCaPPI-Db), an online capture probe database containing a set of 1,685 oligonucleotide probes allowing for the detection and identification of 30 biothreat agents up to the species level. This probe set can be used in its entirety as a comprehensive diagnostic tool or can be restricted to a set of probes targeting a specific pathogen or virulence factor according to the user's needs. Database URL: http://ocappidb.uca.works. © The Author(s) 2017. Published by Oxford University Press.

MeDReaders: a database for transcription factors that bind to methylated DNA.

PubMed

Wang, Guohua; Luo, Ximei; Wang, Jianan; Wan, Jun; Xia, Shuli; Zhu, Heng; Qian, Jiang; Wang, Yadong

2018-01-04

Understanding the molecular principles governing interactions between transcription factors (TFs) and DNA targets is one of the main subjects in transcriptional regulation. Recently, emerging evidence demonstrated that some TFs could bind to DNA motifs containing highly methylated CpGs both in vitro and in vivo. Identification of such TFs and elucidation of their physiological roles have now become an important stepping-stone toward understanding the mechanisms underlying the methylation-mediated biological processes, which have crucial implications for human disease and disease development. Hence, we constructed a database, named MeDReaders, to collect information about methylated DNA binding activities. A total of 731 TFs, which could bind to methylated DNA sequences, were manually curated in human and mouse studies reported in the literature. In silico approaches were applied to predict methylated and unmethylated motifs of 292 TFs by integrating whole genome bisulfite sequencing (WGBS) and ChIP-Seq datasets in six human cell lines and one mouse cell line extracted from the ENCODE and GEO databases. The MeDReaders database will provide a comprehensive resource for further studies and aid related experiment designs. The database implements unified access for users to most TFs involved in such methylation-associated binding activities. The website is available at http://medreader.org/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

DNA sequence database as a tool to identify decapod crustaceans on the São Paulo coastline.

PubMed

Mantelatto, Fernando L; Terossi, Mariana; Negri, Mariana; Buranelli, Raquel C; Robles, Rafael; Magalhães, Tatiana; Tamburus, Ana Francisca; Rossi, Natália; Miyazaki, Mayara J

2017-09-05

DNA barcoding has emerged as an efficient tool for taxonomy and other biodiversity fields. The vast and speciose group of decapod crustaceans is no exception, and comparing short DNA fragments has enabled researchers to overcome some taxonomic impediments and to broaden knowledge of the diversity of this group of crustaceans. Brazil is considered an important area in terms of global marine biodiversity and some regions stand out in terms of decapod fauna, such as the São Paulo coastline. Thus, the aim of this study is to obtain sequences of the mitochondrial markers (COI and 16S) for decapod crustaceans distributed along the São Paulo coastline and to test the accuracy of these markers for species identification from this region by comparing our sequences to those already present in the GenBank database. We sampled along almost the 300 km of the São Paulo coastline from estuaries to offshore islands during the development of a multidisciplinary research project that took place for 5 years. All the species were processed to obtain the DNA sequences. The diversity of the decapod fauna on the São Paulo coastline comprises at least 404 species. We were able to collect 256 of those species and to sequence at least one of the target genes from 221. By testing the accuracy of these two DNA markers as a tool for identification, we were able to check our own identifications, including new records in GenBank, spot potential mistakes in GenBank, and detect potential new species.

Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

PubMed

Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

2003-04-02

Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.

Evaluation of protein spectra cluster analysis for Streptococcus spp. identification from various swine clinical samples.

PubMed

Matajira, Carlos E C; Moreno, Luisa Z; Gomes, Vasco T M; Silva, Ana Paula S; Mesquita, Renan E; Doto, Daniela S; Calderaro, Franco F; de Souza, Fernando N; Christ, Ana Paula G; Sato, Maria Inês Z; Moreno, Andrea M

2017-03-01

Traditional microbiological methods enable genus-level identification of Streptococcus spp. isolates. However, as the species of this genus show broad phenotypic variation, species-level identification or even differentiation within the genus is difficult. Herein we report the evaluation of protein spectra cluster analysis for the identification of Streptococcus species associated with disease in swine by means of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). A total of 250 S. suis-like isolates obtained from pigs with clinical signs of encephalitis, arthritis, pneumonia, metritis, and urinary or septicemic infection were studied. The isolates came from pigs in different Brazilian states from 2001 to 2014. The MALDI-TOF MS analysis identified 86% (215 of 250) as S. suis and 14% (35 of 250) as S. alactolyticus, S. dysgalactiae, S. gallinaceus, S. gallolyticus, S. gordonii, S. henryi, S. hyointestinalis, S. hyovaginalis, S. mitis, S. oralis, S. pluranimalium, and S. sanguinis. The MALDI-TOF MS identification was confirmed in 99.2% of the isolates by 16S rDNA sequencing, with MALDI-TOF MS misidentifying 2 S. pluranimalium as S. hyovaginalis. Isolates were also tested by a biochemical automated system that correctly identified all isolates of 8 of the 10 species in the database. Neither the isolates of the 3 species not in the database ( S. gallinaceus, S. henryi, and S. hyovaginalis) nor the isolates of 2 species that were in the database ( S. oralis and S. pluranimalium) could be identified. The topology of the protein spectra cluster analysis appears to sustain the species phylogenetic similarities, further supporting identification by MALDI-TOF MS examination as a rapid and accurate alternative to 16S rDNA sequencing.

The "GeneTrustee": a universal identification system that ensures privacy and confidentiality for human genetic databases.

PubMed

Burnett, Leslie; Barlow-Stewart, Kris; Proos, Anné L; Aizenberg, Harry

2003-05-01

This article describes a generic model for access to samples and information in human genetic databases. The model utilises a "GeneTrustee", a third-party intermediary independent of the subjects and of the investigators or database custodians. The GeneTrustee model has been implemented successfully in various community genetics screening programs and has facilitated research access to genetic databases while protecting the privacy and confidentiality of research subjects. The GeneTrustee model could also be applied to various types of non-conventional genetic databases, including neonatal screening Guthrie card collections, and to forensic DNA samples.

Monitoring an alien invasion: DNA barcoding and the identification of lionfish and their prey on coral reefs of the Mexican Caribbean.

PubMed

Valdez-Moreno, Martha; Quintal-Lizama, Carolina; Gómez-Lozano, Ricardo; García-Rivas, María Del Carmen

2012-01-01

In the Mexican Caribbean, the exotic lionfish Pterois volitans has become a species of great concern because of its predatory habits and rapid expansion onto the Mesoamerican coral reef, the second largest continuous reef system in the world. This is the first report of DNA identification of stomach contents of lionfish using the barcode of life reference database (BOLD). We confirm with barcoding that only Pterois volitans is apparently present in the Mexican Caribbean. We analyzed the stomach contents of 157 specimens of P. volitans from various locations in the region. Based on DNA matches in the Barcode of Life Database (BOLD) and GenBank, we identified fishes from five orders, 14 families, 22 genera and 34 species in the stomach contents. The families with the most species represented were Gobiidae and Apogonidae. Some prey taxa are commercially important species. Seven species were new records for the Mexican Caribbean: Apogon mosavi, Coryphopterus venezuelae, C. thrix, C. tortugae, Lythrypnus minimus, Starksia langi and S. ocellata. DNA matches, as well as the presence of intact lionfish in the stomach contents, indicate some degree of cannibalism, a behavior confirmed in this species for the first time. We obtained 45 distinct crustacean prey sequences, of which only 20 taxa could be identified from the BOLD and GenBank databases. The matches were primarily to Decapoda but only a single taxon could be identified to the species level, Euphausia americana. This technique proved to be an efficient and useful method, especially since prey species could be identified from partially digested remains. The primary limitation is the lack of comprehensive coverage of potential prey species in the region in the BOLD and GenBank databases, especially among invertebrates.

Monitoring an Alien Invasion: DNA Barcoding and the Identification of Lionfish and Their Prey on Coral Reefs of the Mexican Caribbean

PubMed Central

Valdez-Moreno, Martha; Quintal-Lizama, Carolina; Gómez-Lozano, Ricardo; García-Rivas, María del Carmen

2012-01-01

Background: In the Mexican Caribbean, the exotic lionfish Pterois volitans has become a species of great concern because of its predatory habits and rapid expansion onto the Mesoamerican coral reef, the second largest continuous reef system in the world. This is the first report of DNA identification of stomach contents of lionfish using the barcode of life reference database (BOLD). Methodology/Principal Findings: We confirm with barcoding that only Pterois volitans is apparently present in the Mexican Caribbean. We analyzed the stomach contents of 157 specimens of P. volitans from various locations in the region. Based on DNA matches in the Barcode of Life Database (BOLD) and GenBank, we identified fishes from five orders, 14 families, 22 genera and 34 species in the stomach contents. The families with the most species represented were Gobiidae and Apogonidae. Some prey taxa are commercially important species. Seven species were new records for the Mexican Caribbean: Apogon mosavi, Coryphopterus venezuelae, C. thrix, C. tortugae, Lythrypnus minimus, Starksia langi and S. ocellata. DNA matches, as well as the presence of intact lionfish in the stomach contents, indicate some degree of cannibalism, a behavior confirmed in this species for the first time. We obtained 45 distinct crustacean prey sequences, of which only 20 taxa could be identified from the BOLD and GenBank databases. The matches were primarily to Decapoda but only a single taxon could be identified to the species level, Euphausia americana. Conclusions/Significance: This technique proved to be an efficient and useful method, especially since prey species could be identified from partially digested remains. The primary limitation is the lack of comprehensive coverage of potential prey species in the region in the BOLD and GenBank databases, especially among invertebrates. PMID:22675470

Pitfalls of Establishing DNA Barcoding Systems in Protists: The Cryptophyceae as a Test Case

PubMed Central

Hoef-Emden, Kerstin

2012-01-01

A DNA barcode is a preferably short and highly variable region of DNA supposed to facilitate a rapid identification of species. In many protistan lineages, a lack of species-specific morphological characters hampers an identification of species by light or electron microscopy, and difficulties in performing mating experiments in laboratory cultures also do not allow for an identification of biological species. Thus, testing candidate barcode markers as well as establishment of accurately working species identification systems are more challenging than in multicellular organisms. In cryptic species complexes the performance of a potential barcode marker cannot be monitored using morphological characters as a feedback, but an inappropriate choice of DNA region may result in artifactual species trees for several reasons. Therefore a priori knowledge of the systematics of a group is required. In addition to identification of known species, methods for an automatic delimitation of species with DNA barcodes have been proposed. The Cryptophyceae provide a mixture of systematically well characterized as well as badly characterized groups and are used in this study to test the suitability of some of the methods for protists. As a species identification method, the performance of BLAST in searches against badly to well-sampled reference databases has been tested with COI-5P and 5′-partial LSU rDNA (domains A to D of the nuclear LSU rRNA gene). In addition the performance of two different methods for automatic species delimitation, fixed thresholds of genetic divergence and the general mixed Yule-coalescent model (GMYC), have been examined. The study demonstrates some pitfalls of barcoding methods that have to be taken care of. Also a best-practice approach towards establishing a DNA barcode system in protists is proposed. PMID:22970104

Pitfalls of establishing DNA barcoding systems in protists: the cryptophyceae as a test case.

PubMed

Hoef-Emden, Kerstin

2012-01-01

A DNA barcode is a preferably short and highly variable region of DNA supposed to facilitate a rapid identification of species. In many protistan lineages, a lack of species-specific morphological characters hampers an identification of species by light or electron microscopy, and difficulties in performing mating experiments in laboratory cultures also do not allow for an identification of biological species. Thus, testing candidate barcode markers as well as establishment of accurately working species identification systems are more challenging than in multicellular organisms. In cryptic species complexes the performance of a potential barcode marker cannot be monitored using morphological characters as a feedback, but an inappropriate choice of DNA region may result in artifactual species trees for several reasons. Therefore a priori knowledge of the systematics of a group is required. In addition to identification of known species, methods for an automatic delimitation of species with DNA barcodes have been proposed. The Cryptophyceae provide a mixture of systematically well characterized as well as badly characterized groups and are used in this study to test the suitability of some of the methods for protists. As a species identification method, the performance of BLAST in searches against badly to well-sampled reference databases has been tested with COI-5P and 5'-partial LSU rDNA (domains A to D of the nuclear LSU rRNA gene). In addition the performance of two different methods for automatic species delimitation, fixed thresholds of genetic divergence and the general mixed Yule-coalescent model (GMYC), have been examined. The study demonstrates some pitfalls of barcoding methods that have to be taken care of. Also a best-practice approach towards establishing a DNA barcode system in protists is proposed.
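
Of the two delimitation approaches named in this abstract, the fixed divergence threshold is the simpler to illustrate. The sketch below is an assumption-laden toy, not the procedure used in the study: it uses uncorrected p-distances on pre-aligned sequences, groups them by single linkage, and takes the cut-off as a free parameter (the abstract gives no value). The function names `p_distance` and `delimit` are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable sites: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def delimit(seqs, threshold):
    """Group sequence labels into putative species by single linkage:
    two sequences share a cluster if a chain of pairwise distances
    at or below `threshold` connects them."""
    labels = list(seqs)
    parent = {name: name for name in labels}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in labels:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(c) for c in clusters.values())
```

Because the result depends entirely on the chosen cut-off and on how well the reference set is sampled, a toy like this also shows why the abstract treats fixed thresholds as one of the pitfalls to be tested rather than as a settled method.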

Mitochondrial DNA identification of game and harvested freshwater fish species.

PubMed

Kyle, C J; Wilson, C C

2007-02-14

The use of DNA in forensics has grown rapidly for human applications along with the concomitant development of bioinformatics and demographic databases to help fully realize the potential of this molecular information. Similar techniques are also used routinely in many wildlife cases, such as species identification in food products, poaching and the illegal trade of endangered species. The use of molecular techniques in forensic cases related to wildlife and the development of associated databases has, however, mainly focused on large mammals with the exception of a few high-profile species. There is a need to develop similar databases for aquatic species for fisheries enforcement, given the large number of exploited and endangered fish species, the intensity of exploitation, and challenges in identifying species and their derived products. We sequenced a 500bp fragment of the mitochondrial cytochrome b gene from representative individuals from 26 harvested fish taxa from Ontario, Canada, focusing on species that support major commercial and recreational fisheries. Ontario provides a unique model system for the development of a fish species database, as the province contains an evolutionarily diverse array of freshwater fish families representing more than one third of all freshwater fish in Canada. Inter- and intraspecific sequence comparisons using phylogenetic analysis and a BLAST search algorithm provided rigorous statistical metrics for species identification. This methodology and these data will aid in fisheries enforcement, providing a tool to easily and accurately identify fish species in enforcement investigations that would have otherwise been difficult or impossible to pursue.

Genegis: Computational Tools for Spatial Analyses of DNA Profiles with Associated Photo-Identification and Telemetry Records of Marine Mammals

DTIC Science & Technology

2013-09-30

profiles of right whales Eubalaena glacialis from the North Atlantic Right Whale Consortium; 2) DNA profiles of sperm whales Physeter macrocephalus...of other cetacean databases in Wildbook format (e.g., North Atlantic right whales, sperm whales and Hector’s dolphins); 8) Supported continuing...of sperm whales, using samples collected during the 5-year Voyage of the Odyssey; and 3) DNA profiles of Hector’s dolphins from Cloudy Bay, New

Interspecific Introgression in Cetaceans: DNA Markers Reveal Post-F1 Status of a Pilot Whale

PubMed Central

Miralles, Laura; Lens, Santiago; Rodríguez-Folgar, Antonio; Carrillo, Manuel; Martín, Vidal; Mikkelsen, Bjarni; Garcia-Vazquez, Eva

2013-01-01

Visual species identification of cetacean strandings is difficult, especially when dead specimens are degraded and/or species are morphologically similar. The two recognised pilot whale species (Globicephala melas and Globicephala macrorhynchus) are sympatric in the North Atlantic Ocean. These species are very similar in external appearance and their morphometric characteristics partially overlap; thus visual identification is not always reliable. Genetic species identification ensures correct identification of specimens. Here we have employed one mitochondrial (D-Loop region) and eight nuclear loci (microsatellites) as genetic markers to identify six stranded pilot whales found in Galicia (Northwest Spain), one of them of ambiguous phenotype. DNA analyses yielded positive amplification of all loci and enabled species identification. Nuclear microsatellite DNA genotypes revealed mixed ancestry for one individual, identified as a post-F1 interspecific hybrid employing two different Bayesian methods. From the mitochondrial sequence the maternal species was Globicephala melas. This is the first hybrid documented between Globicephala melas and G. macrorhynchus, and the first post-F1 hybrid genetically identified between cetaceans, revealing interspecific genetic introgression in marine mammals. We propose to add nuclear loci to genetic databases for cetacean species identification in order to detect hybrid individuals. PMID:23990883

Population and forensic genetic analyses of mitochondrial DNA control region variation from six major provinces in the Korean population.

PubMed

Hong, Seung Beom; Kim, Ki Cheol; Kim, Wook

2015-07-01

We generated complete mitochondrial DNA (mtDNA) control region sequences from 704 unrelated individuals residing in six major provinces in Korea. In addition to our earlier survey of the distribution of mtDNA haplogroup variation, a total of 560 different haplotypes characterized by 271 polymorphic sites were identified, of which 473 haplotypes were unique. The gene diversity and random match probability were 0.9989 and 0.0025, respectively. According to the pairwise comparison of the 704 control region sequences, the mean number of pairwise differences between individuals was 13.47±6.06. Based on the result of mtDNA control region sequences, pairwise FST genetic distances revealed genetic homogeneity of the Korean provinces on a peninsular level, except in samples from Jeju Island. This result indicates there may be a need to formulate a local mtDNA database for Jeju Island, to avoid bias in forensic parameter estimates caused by genetic heterogeneity of the population. Thus, the present data may help not only in personal identification but also in determining maternal lineages to provide an expanded and reliable Korean mtDNA database. These data will be available on the EMPOP database via accession number EMP00661. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
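
The two forensic parameters reported in this abstract can be computed from a list of observed haplotypes. The sketch below assumes the standard definitions (random match probability as the sum of squared haplotype frequencies, and gene diversity as the sample-size-corrected complement of that sum); the abstract does not spell out its formulas, and the function name `forensic_parameters` and the toy data are hypothetical.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Return (random match probability, gene diversity) for a sample.

    haplotypes: one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(haplotypes)
    # Random match probability: chance that two individuals drawn
    # at random from the sample share a haplotype.
    rmp = sum((c / n) ** 2 for c in counts.values())
    # Gene (haplotype) diversity with the n / (n - 1) sample correction.
    diversity = n / (n - 1) * (1 - rmp)
    return rmp, diversity


# Toy sample of four individuals carrying three haplotypes.
rmp, diversity = forensic_parameters(["H1", "H1", "H2", "H3"])
```

Under these definitions the two reported values are consistent with each other: with n = 704 and a random match probability of 0.0025, the gene diversity is 704/703 × (1 − 0.0025) ≈ 0.9989.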

DNA barcoding of five common stored-product pest species of genus Cryptolestes (Coleoptera: Laemophloeidae).

PubMed

Wang, Y J; Li, Z H; Zhang, S F; Varadínová, Z; Jiang, F; Kučerová, Z; Stejskal, V; Opit, G; Cao, Y; Li, F J

2014-10-01

Several species of the genus Cryptolestes Ganglbauer, 1899 (Coleoptera: Laemophloeidae) are commonly found in stored products. In this study, five species of Cryptolestes, with almost worldwide distribution, were obtained from laboratories in China, Czech Republic and the USA: Cryptolestes ferrugineus (Stephens, 1831), Cryptolestes pusillus (Schönherr, 1817), Cryptolestes turcicus (Grouvelle, 1876), Cryptolestes pusilloides (Steel & Howe, 1952) and Cryptolestes capensis (Waltl, 1834). Molecular identification based on a 658 bp fragment from the mitochondrial DNA cytochrome c oxidase subunit I (COI) was adopted to overcome some problems of morphological identification of Cryptolestes species. The utility of COI sequences as DNA barcodes in discriminating the five Cryptolestes species was evaluated on adults and larvae by analysing Kimura 2-parameter distances, phylogenetic tree and haplotype networks. The results showed that molecular approaches based on DNA barcodes were able to accurately identify these species. This is the first study using DNA barcoding to identify Cryptolestes species and the gathered DNA sequences will complement the biological barcode database.

Multimodal biometric digital watermarking on immigrant visas for homeland security

NASA Astrophysics Data System (ADS)

Sasi, Sreela; Tamhane, Kirti C.; Rajappa, Mahesh B.

2004-08-01

Passengers with immigrant visas are a major concern to international airports due to the various fraud operations identified. To curb tampering with genuine visas, the visas should contain human identification information. A biometric characteristic is a common and reliable way to authenticate the identity of an individual [1]. A Multimodal Biometric Human Identification System (MBHIS) that integrates iris code, DNA fingerprint, and the passport number on the visa photograph using a digital watermarking scheme is presented. The digital watermarking technique is well suited for any system requiring high security [2]. Ophthalmologists [3], [4], [5] suggested that the iris scan is an accurate and nonintrusive optical fingerprint. A DNA sequence can be used as a genetic barcode [6], [7]. While issuing a visa at the US consulates, the DNA sequence isolated from saliva, the iris code and passport number shall be digitally watermarked in the visa photograph. This information is also recorded in the 'immigrant database'. A 'forward watermarking phase' combines a 2-D DWT transformed digital photograph with the personal identification information. A 'detection phase' extracts the watermarked information from this visa photograph at the port of entry, from which the iris code can be used for identification and the DNA biometric for authentication, if an anomaly arises.

Use of MALDI-TOF Mass Spectrometry and a Custom Database to Characterize Bacteria Indigenous to a Unique Cave Environment (Kartchner Caverns, AZ, USA)

PubMed Central

Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R.

2015-01-01

MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level. PMID:25590854

Use of MALDI-TOF mass spectrometry and a custom database to characterize bacteria indigenous to a unique cave environment (Kartchner Caverns, AZ, USA).

PubMed

Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R

2015-01-02

MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level.

Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data.

PubMed

Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

2009-04-01

We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).
FBIS: A regional DNA barcode archival & analysis system for Indian fishes.

PubMed

Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar

2012-01-01

DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of the COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP on a Linux platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS will be useful as an information system for fish molecular taxonomy, phylogeny and genomics. The database is available for free at http://mail.nbfgr.res.in/fbis/
Identification of small molecules capable of regulating conformational changes of telomeric G-quadruplex

NASA Astrophysics Data System (ADS)

Chen, Shuo-Bin; Liu, Guo-Cai; Gu, Lian-Quan; Huang, Zhi-Shu; Tan, Jia-Heng

2018-02-01

Design of small molecules targeted at human telomeric G-quadruplex DNA is an extremely active research area. Interestingly, the telomeric G-quadruplex is a highly polymorphic structure. Changes in its conformation upon small molecule binding may be a powerful method to achieve a desired biological effect. However, the rational development of small molecules capable of regulating conformational change of telomeric G-quadruplex structures is still challenging. In this study, we developed a reliable ligand-based pharmacophore model based on isaindigotone derivatives with conformational change activity toward telomeric G-quadruplex DNA. Furthermore, virtual screening of a database was conducted using this pharmacophore model, and benzopyranopyrimidine derivatives in the database were identified as strong inducers of telomeric G-quadruplex DNA conformational change, transforming it from a hybrid-type structure to a parallel structure.
Rapid and reliable species identification of wild mushrooms by matrix assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF MS).

PubMed

Sugawara, Ryota; Yamada, Sayumi; Tu, Zhihao; Sugawara, Akiko; Suzuki, Kousuke; Hoshiba, Toshihiro; Eisaka, Sadao; Yamaguchi, Akihiro

2016-08-31

Mushrooms are a favourite natural food in many countries. However, some wild species cause food poisoning, sometimes lethal, when they are misidentified because their fruiting bodies resemble those of edible species. Morphological inspection of mycelia, spores and fruiting bodies has traditionally been used for the identification of mushrooms. More recently, DNA sequencing analysis has been successfully applied to mushrooms and to many other species. This study focuses on a simpler and more rapid methodology for the identification of wild mushrooms via protein profiling based on matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). A preliminary study using 6 commercially available cultivated mushrooms suggested that a more reproducible spectrum was obtained from a portion of the cap than from the stem of a fruiting body by the extraction of proteins with a formic acid-acetonitrile mixture (1 + 1). We used 157 wild mushroom fruiting bodies collected in the centre of Hokkaido from June to November 2014. Sequencing analysis of a portion of the ribosomal RNA gene identified 134 mushrooms to genus or species; however, 23 samples remained unknown: 10 of unknown species with a lower concordance rate of the nucleotide sequences in a BLAST search (less than 97%) and 13 with unidentifiable, poor or mixed sequencing signals. MALDI-TOF MS analysis yielded a reproducible spectrum (frequency of matching score ≥2.0 was ≥6 spectra from 12 spectra measurements) for 114 of 157 samples. Profiling scores that matched each other within the database gave correct species identification (with scores of ≥2.0) for 110 samples (96%). An in-house prepared database was constructed from 106 independent species, except for overlapping identifications. We used 48 wild mushrooms that were collected in autumn 2015 to validate the in-house database. As a result, 21 mushrooms were identified at the species level with scores ≥2.0 and 5 mushrooms at the genus level with scores ≥1.7, although the signals of 2 mushrooms were insufficient for analysis. The remaining 20 samples were recognized as "unreliable identification" with scores <1.7. Subsequent DNA analysis confirmed that the correct species or genus identifications were achieved by MALDI-TOF MS for the 26 former samples, whereas the 18 mushrooms with poorly matched scores were species that were not included in the database. Thus, the proposed MALDI-TOF MS coupled with our database could be a powerful tool for the rapid and reliable identification of mushrooms; however, continuous updating of the database is necessary to enrich it with more abundant species. Copyright © 2016 Elsevier B.V. All rights reserved.
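
The score thresholds reported in this abstract (≥2.0 for species, ≥1.7 for genus, below 1.7 "unreliable") amount to a simple tiering rule. The sketch below is only an illustration of that rule: the function name `classify_score` and the example scores are hypothetical, and the abstract does not describe how the scores themselves are computed.

```python
def classify_score(score: float) -> str:
    """Map a MALDI-TOF matching score to the tier named in the abstract."""
    if score >= 2.0:
        return "species"      # species-level identification
    if score >= 1.7:
        return "genus"        # genus-level identification
    return "unreliable"       # reported as "unreliable identification"


# Hypothetical scores, not data from the study.
example_scores = [2.31, 1.85, 1.42, 2.00, 1.70]
tiers = [classify_score(s) for s in example_scores]

# Validation counts as stated in the abstract (48 mushrooms from autumn 2015).
validation_counts = {
    "species": 21,
    "genus": 5,
    "insufficient signal": 2,
    "unreliable": 20,
}
total_validated = sum(validation_counts.values())
```

The four validation categories quoted in the abstract sum to the 48 mushrooms used to test the in-house database.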
On-line resources for bacterial micro-evolution studies using MLVA or CRISPR typing.

PubMed

Grissa, Ibtissem; Bouchon, Patrick; Pourcel, Christine; Vergnaud, Gilles

2008-04-01

The control of bacterial pathogens requires the development of tools allowing the precise identification of strains at the subspecies level. It is now widely accepted that these tools will need to be DNA-based assays (in contrast to identification at the species level, where biochemical-based assays are still widely used, even though very powerful 16S DNA sequence databases exist). Typing assays need to be cheap and amenable to the design of international databases. The success of such subspecies typing tools will eventually be measured by the size of the associated reference databases accessible over the internet. Three methods have shown some potential in this direction: the so-called spoligotyping assay (Mycobacterium tuberculosis, 40,000-entry database), Multiple Loci Sequence Typing (MLST; up to a few thousand entries for more than 20 bacterial species), and more recently Multiple Loci VNTR Analysis (MLVA; up to a few hundred entries, assays available for more than 20 pathogens). In the present report we review the current status of the tools and resources we have developed over the past seven years to help in setting up or using MLVA assays and, more recently, in analysing Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs), which are the basis for spoligotyping assays.
High-quality mtDNA control region sequences from 680 individuals sampled across the Netherlands to establish a national forensic mtDNA reference database.

PubMed

Chaitanya, Lakshmi; van Oven, Mannis; Brauer, Silke; Zimmermann, Bettina; Huber, Gabriela; Xavier, Catarina; Parson, Walther; de Knijff, Peter; Kayser, Manfred

2016-03-01

The use of mitochondrial DNA (mtDNA) for maternal lineage identification often marks the last resort when investigating forensic and missing-person cases involving highly degraded biological materials. As with all comparative DNA testing, a match between evidence and reference sample requires a statistical interpretation, for which high-quality mtDNA population frequency data are crucial. Here, we determined, under high quality standards, the complete mtDNA control-region sequences of 680 individuals from across the Netherlands sampled at 54 sites, covering the entire country with 10 geographic sub-regions. The complete mtDNA control region (nucleotide positions 16,024-16,569 and 1-576) was amplified with two PCR primers and sequenced with ten different sequencing primers using the EMPOP protocol. Haplotype diversity of the entire sample set was very high at 99.63% and, accordingly, the random-match probability was 0.37%. No population substructure within the Netherlands was detected with our dataset. Phylogenetic analyses were performed to determine mtDNA haplogroups. Inclusion of these high-quality data in the EMPOP database (accession number: EMP00666) will improve its overall data content and geographic coverage in the interest of all EMPOP users worldwide. Moreover, this dataset will serve as (the start of) a national reference database for mtDNA applications in forensic and missing person casework in the Netherlands. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
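
The two figures in this abstract (haplotype diversity 99.63%, random-match probability 0.37%) sum to 100%, which is consistent with the common definitions in which the random-match probability is the sum of squared haplotype frequencies and the diversity is its complement. The abstract does not state the formulas, so the sketch below is an assumption; the function names and the toy sample are hypothetical. The optional `unbiased` flag applies the n/(n-1) small-sample correction that is often used with this estimator.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies in the sample."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def haplotype_diversity(haplotypes, unbiased=False):
    """1 - sum(p_i^2), optionally scaled by n / (n - 1)."""
    n = len(haplotypes)
    h = 1.0 - random_match_probability(haplotypes)
    return h * n / (n - 1) if unbiased else h


# Toy sample of five individuals, not data from the study.
sample = ["H1", "H1", "H2", "H3", "H4"]
rmp = random_match_probability(sample)            # 0.4^2 + 3 * 0.2^2
hd = haplotype_diversity(sample)                  # 1 - rmp
hd_unbiased = haplotype_diversity(sample, True)   # (1 - rmp) * 5 / 4
```

With many distinct haplotypes in a large sample, the squared frequencies become small, which is why a diverse reference database yields a low random-match probability.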
DNA barcoding of odonates from the Upper Plata basin: Database creation and genetic diversity estimation.

PubMed

Koroiva, Ricardo; Pepinelli, Mateus; Rodrigues, Marciel Elio; Roque, Fabio de Oliveira; Lorenz-Lemke, Aline Pedroso; Kvist, Sebastian

2017-01-01

We present a DNA barcoding study of Neotropical odonates from the Upper Plata basin, Brazil. A total of 38 species were collected in a transition region of "Cerrado" and Atlantic Forest, both regarded as biological hotspots, and 130 cytochrome c oxidase subunit I (COI) barcodes were generated for the collected specimens. The distinct gap between intraspecific (0-2%) and interspecific variation (15% and above) in COI, and resulting separation of Barcode Index Numbers (BIN), allowed for successful identification of specimens in 94% of cases. The 6% fail rate was due to a shared BIN between two separate nominal species. DNA barcoding, based on COI, thus seems to be a reliable and efficient tool for identifying Neotropical odonate specimens down to the species level. These results underscore the utility of DNA barcoding to aid specimen identification in diverse biological hotspots, areas that require urgent action regarding taxonomic surveys and biodiversity conservation.
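
The "distinct gap" described here compares the largest within-species distance with the smallest between-species distance. A minimal sketch of that comparison follows. It assumes aligned sequences of equal length and uses the uncorrected p-distance for simplicity, whereas barcoding studies often use a model-corrected distance; the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Return (max intraspecific, min interspecific) distance.

    records is a list of (species_name, aligned_sequence) pairs.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter, default=float("inf"))


# Toy 20-bp "barcodes", not real COI data.
records = [
    ("species_A", "ACGTACGTACGTACGTACGT"),
    ("species_A", "ACGTACGTACGTACGTACGA"),
    ("species_B", "TCGAACGTTCGTACGAACGT"),
]
max_intra, min_inter = barcode_gap(records)
has_gap = max_intra < min_inter
```

When `has_gap` is true, a simple distance threshold placed between the two values separates the species in the sample; a shared BIN between two nominal species, as in the 6% of failures reported, corresponds to the case where no such threshold exists.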
MethBank: a database integrating next-generation sequencing single-base-resolution DNA methylation programming data.

PubMed

Zou, Dong; Sun, Shixiang; Li, Rujiao; Liu, Jiang; Zhang, Jing; Zhang, Zhang

2015-01-01

DNA methylation plays crucial roles during embryonic development. Here we present MethBank (http://dnamethylome.org), a DNA methylome programming database that integrates the genome-wide single-base nucleotide methylomes of gametes and early embryos in different model organisms. Unlike extant relevant databases, MethBank incorporates the whole-genome single-base-resolution methylomes of gametes and early embryos at multiple developmental stages in zebrafish and mouse. MethBank allows users to retrieve methylation levels, differentially methylated regions, CpG islands, gene expression profiles and genetic polymorphisms for a specific gene or genomic region. Moreover, it offers a methylome browser that visualizes high-resolution DNA methylation profiles and other related data interactively, helping users investigate methylation patterns and changes in gametes and early embryos at different developmental stages. Ongoing efforts are focused on incorporation of methylomes and related data from other organisms. Together, MethBank features integration and visualization of high-resolution DNA methylation data as well as other related data, enabling identification of potential DNA methylation signatures in different developmental stages and accordingly providing an important resource for epigenetic and developmental studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genetics and Forensics: Making the National DNA Database

PubMed Central

Johnson, Paul; Williams, Robin; Martin, Paul

2005-01-01

This paper is based on a current study of the growing police use of the epistemic authority of molecular biology for the identification of criminal suspects in support of crime investigation. It discusses the development of DNA profiling and the establishment and development of the UK National DNA Database (NDNAD) as an instance of the ‘scientification of police work’ (Ericson and Shearing 1986) in which the police uses of science and technology have a recursive effect on their future development. The NDNAD, owned by the Association of Chief Police Officers of England and Wales, is the first of its kind in the world and currently contains the genetic profiles of more than 2 million people. The paper provides a framework for the examination of this socio-technical innovation, begins to tease out the dense and compact history of the database and accounts for the way in which changes and developments across disparate scientific, governmental and policing contexts, have all contributed to the range of uses to which it is put. PMID:16467921
Estimating Diversity of Florida Keys Zooplankton Using New Environmental DNA Methods

NASA Astrophysics Data System (ADS)

Djurhuus, A.; Goldsmith, D. B.; Sawaya, N. A.; Breitbart, M.

2016-02-01

Zooplankton are of great importance in marine food webs, where they serve to link the phytoplankton and bacteria with higher trophic levels. Zooplankton are a diverse group containing molluscs, crustaceans, fish larvae and many other taxa. The sheer number of species and often minor morphological distinctions between species make it challenging and exceptionally time consuming to identify the species composition of marine zooplankton samples. As a part of the Marine Biodiversity Observation Network (MBON) project, we have developed and ground-truthed an alternative, relatively time-efficient method for zooplankton identification using environmental DNA (eDNA). Samples were collected from Molasses Reef, Looe Key, and Western Sambo along the Florida Keys from five bi-monthly cruises on board the RV Walton Smith. Samples were collected for eDNA by filtering 1 L of water onto a 0.22 µm filter, and zooplankton samples were collected using nets with three mesh sizes (64 µm, 200 µm, and 500 µm) to catch different size fractions. Half of the zooplankton samples were fixed in 70% ethanol and half in 10% formalin, for DNA extraction and morphological identification, respectively. Individuals representing visually abundant taxa were picked into individual wells for PCR with universal 18S rRNA gene primers and subsequent sequencing to build a reference barcode database for zooplankton species commonly found in the study region. PCR and Illumina MiSeq next-generation sequencing were applied to the eDNA extracted from the 0.22 µm filters, and sequences were compared to our local custom database as well as publicly available databases to determine zooplankton community composition. Finally, composition and diversity analyses were performed to compare results obtained with the new eDNA approach to standard morphological classification of zooplankton communities. Results show that the eDNA approach can enable the determination of zooplankton diversity through collection of a single water sample, which, when combined with bacterial and archaeal diversity analyses, will help us understand the coupling between different trophic levels and the drivers of plankton dynamics in the sub-tropical Florida Keys.
A computational model to protect patient data from location-based re-identification.

PubMed

Malin, Bradley

2007-07-01

Health care organizations must preserve a patient's anonymity when disclosing personal data. Traditionally, patient identity has been protected by stripping identifiers from sensitive data such as DNA. However, simple automated methods can re-identify patient data using public information. In this paper, we present a solution to prevent a threat to patient anonymity that arises when multiple health care organizations disclose data. In this setting, a patient's location visit pattern, or "trail", can re-identify seemingly anonymous DNA to patient identity. This threat exists because health care organizations (1) cannot prevent the disclosure of certain types of patient information and (2) do not know how to systematically avoid trail re-identification. In this paper, we develop and evaluate computational methods that health care organizations can apply to disclose patient-specific DNA records that are impregnable to trail re-identification. To prevent trail re-identification, we introduce a formal model called k-unlinkability, which enables health care administrators to specify different degrees of patient anonymity. Specifically, k-unlinkability is satisfied when the trail of each DNA record is linkable to no less than k identified records. We present several algorithms that enable health care organizations to coordinate their data disclosure, so that they can determine which DNA records can be shared without violating k-unlinkability. We evaluate the algorithms with the trails of patient populations derived from publicly available hospital discharge databases. Algorithm efficacy is evaluated using metrics based on real world applications, including the number of suppressed records and the number of organizations that disclose records. Our experiments indicate that it is unnecessary to suppress all patient records that initially violate k-unlinkability. Rather, only portions of the trails need to be suppressed. 
For example, if each hospital discloses 100% of its data on patients diagnosed with cystic fibrosis, then 48% of the DNA records are 5-unlinkable. A naïve solution would suppress the 52% of the DNA records that violate 5-unlinkability. However, by applying our protection algorithms, the hospitals can disclose 95% of the DNA records, all of which are 5-unlinkable. Similar findings hold for all populations studied. This research demonstrates that patient anonymity can be formally protected in shared databases. Our findings illustrate that significant quantities of patient-specific data can be disclosed with provable protection from trail re-identification. The configurability of our methods allows health care administrators to quantify the effects of different levels of privacy protection and formulate policy accordingly.
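
The k-unlinkability condition stated in this abstract (each DNA record's trail must be linkable to no fewer than k identified records) can be checked mechanically once "linkable" is defined. The sketch below makes a simplifying assumption that is not taken from the abstract: a trail is the set of locations where a record appears, and a DNA trail is linkable to an identified trail when it is a subset of it. All names and the toy data are hypothetical, and this is a checker only, not the paper's protection algorithms.

```python
def linkable_count(dna_trail, identified_trails):
    """Number of identified trails that contain every location in dna_trail."""
    return sum(dna_trail <= trail for trail in identified_trails)


def k_unlinkability_violations(dna_trails, identified_trails, k):
    """Return the ids of DNA records linkable to fewer than k identities."""
    id_trails = list(identified_trails.values())
    return [
        record_id
        for record_id, trail in dna_trails.items()
        if linkable_count(trail, id_trails) < k
    ]


# Toy trails over three hospitals (assumed data, for illustration only).
identified = {
    "alice": {"H1", "H2"},
    "bob": {"H1", "H2", "H3"},
    "carol": {"H3"},
}
dna = {
    "d1": {"H1"},              # fits alice and bob
    "d2": {"H3"},              # fits bob and carol
    "d3": {"H1", "H2", "H3"},  # fits bob only
}
violating = k_unlinkability_violations(dna, identified, k=2)
satisfied = not violating
```

In this toy case only `d3` violates 2-unlinkability. The abstract's point is that such a record need not be withheld entirely: suppressing part of its trail (for example, disclosing it from fewer locations) can make it linkable to more identities.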
New taxonomy and old collections: integrating DNA barcoding into the collection curation process.

PubMed

Puillandre, N; Bouchet, P; Boisselier-Dubayle, M-C; Brisset, J; Buge, B; Castelin, M; Chagnoux, S; Christophe, T; Corbari, L; Lambourdière, J; Lozouet, P; Marani, G; Rivasseau, A; Silva, N; Terryn, Y; Tillier, S; Utge, J; Samadi, S

2012-05-01

Because they house large biodiversity collections and are also research centres with sequencing facilities, natural history museums are well placed to develop DNA barcoding best practices. The main difficulty is generally the vouchering system: it must ensure that all data produced remain attached to the corresponding specimen, from the field to publication in articles and online databases. The Museum National d'Histoire Naturelle in Paris is one of the leading laboratories in the Marine Barcode of Life (MarBOL) project, which was used as a pilot programme to include barcode collections for marine molluscs and crustaceans. The system is based on two relational databases. The first one classically records the data (locality and identification) attached to the specimens. In the second one, tissue-clippings, DNA extractions (both preserved in 2D barcode tubes) and PCR data (including primers) are linked to the corresponding specimen. All the steps of the process [sampling event, specimen identification, molecular processing, data submission to Barcode Of Life Database (BOLD) and GenBank] are thus linked together. Furthermore, we have developed several web-based tools to automatically upload data into the system, control the quality of the sequences produced and facilitate the submission to online databases. This work is the result of a joint effort from several teams in the Museum National d'Histoire Naturelle (MNHN), but also from a collaborative network of taxonomists and molecular systematists outside the museum, resulting in the vouchering so far of ∼41,000 sequences and the production of ∼11,000 COI sequences. © 2012 Blackwell Publishing Ltd.
Indigenous species barcode database improves the identification of zooplankton

PubMed Central

Yang, Jianghua; Zhang, Wanwan; Sun, Jingying; Xie, Yuwei; Zhang, Yimin; Burton, G. Allen; Yu, Hongxia

2017-01-01

Incompleteness and inaccuracy of DNA barcode databases are considered an important hindrance to the use of metabarcoding in biodiversity analysis of zooplankton at the species level. Species barcoding by Sanger sequencing is inefficient for organisms with small body sizes, such as zooplankton. Here mitochondrial cytochrome c oxidase I (COI) fragment barcodes from 910 freshwater zooplankton specimens (87 morphospecies) were recovered by a high-throughput sequencing platform, Ion Torrent PGM. Intraspecific divergence of most zooplankton was <5%, except Branchionus leydign (Rotifer, 14.3%), Trichocerca elongate (Rotifer, 11.5%), Lecane bulla (Rotifer, 15.9%), Synchaeta oblonga (Rotifer, 5.95%) and Schmackeria forbesi (Copepod, 6.5%). Metabarcoding data of 28 environmental samples from Lake Tai were annotated by both an indigenous database and the NCBI GenBank database. The indigenous database improved the taxonomic assignment of metabarcoding of zooplankton. Most zooplankton (81%) with barcode sequences in the indigenous database were identified by metabarcoding monitoring. Furthermore, the frequency and distribution of zooplankton were also consistent between metabarcoding and morphological identification. Overall, the indigenous database improved the taxonomic assignment of zooplankton. PMID:28977035
DNA barcoding of vouchered xylarium wood specimens of nine endangered Dalbergia species.

PubMed

Yu, Min; Jiao, Lichao; Guo, Juan; Wiedenhoeft, Alex C; He, Tuo; Jiang, Xiaomei; Yin, Yafang

2017-12-01

ITS2+trnH-psbA was the best combination of DNA barcodes to resolve the Dalbergia wood species studied. We demonstrate the feasibility of building a DNA barcode reference database using xylarium wood specimens. The increase in illegal logging and timber trade of CITES-listed tropical species necessitates the development of unambiguous identification methods at the species level. For these methods to be fully functional and deployable for law enforcement, they must work using wood or wood products. DNA barcoding of wood has been promoted as a promising tool for species identification; however, the main barrier to extensive application of DNA barcoding to wood is the lack of a comprehensive and reliable DNA reference library of barcodes from wood. In this study, xylarium wood specimens of nine Dalbergia species were selected from the Wood Collection of the Chinese Academy of Forestry and DNA was then extracted from them for further PCR amplification of eight potential DNA barcode sequences (ITS2, matK, trnL, trnH-psbA, trnV-trnM1, trnV-trnM2, trnC-petN, and trnS-trnG). The barcodes were tested singly and in combination for species-level discrimination ability by tree-based [neighbor-joining (NJ)] and distance-based (TaxonDNA) methods. We found that the discrimination ability of DNA barcodes in combination was higher than any single DNA marker among the Dalbergia species studied, with the best two-marker combination of ITS2+trnH-psbA analyzed with NJ trees performing the best (100% accuracy). These barcodes are relatively short regions (<350 bp) and amplification reactions were performed with high success (≥90%) using wood as the source material, a necessary factor to apply DNA barcoding to timber trade. The present results demonstrate the feasibility of using vouchered xylarium specimens to build DNA barcoding reference databases.
Integrating early detection with DNA barcoding: species identification of a non-native monitor lizard (Squamata: Varanidae) carcass in Mississippi, U.S.A.

USGS Publications Warehouse

Reed, Robert N.; Hopken, Matthew W.; Steen, David A.; Falk, Bryan G.; Piaggio, Antoinette J.

2016-01-01

Early detection of invasive species is critical to increasing the probability of successful management. At the primary stage of an invasion, invasive species are easier to control as the population is likely represented by just a few individuals. Detection of these first few individuals can be challenging, particularly if they are cryptic or otherwise characterized by low detectability. The engagement of members of the public may be critical to early detection as there are far more citizens on the landscape than trained biologists. However, it can be difficult to assess the credibility of public reporting, especially when a diagnostic digital image or a physical specimen in good condition is lacking. DNA barcoding can be used for verification when morphological identification of a specimen is not possible or uncertain (i.e., degraded or partial specimen). DNA barcoding relies on obtaining a DNA sequence from a relatively small fragment of mitochondrial DNA and comparing it to a database of sequences containing a variety of expertly identified species. Herein we report the successful identification of a degraded specimen of a non-native, potentially invasive reptile species (Varanus niloticus) via DNA barcoding, after discovery and reporting by a member of the public.
Spatial heterogeneity in the Mediterranean Biodiversity Hotspot affects barcoding accuracy of its freshwater fishes.

PubMed

Geiger, M F; Herder, F; Monaghan, M T; Almada, V; Barbieri, R; Bariche, M; Berrebi, P; Bohlen, J; Casal-Lopez, M; Delmastro, G B; Denys, G P J; Dettai, A; Doadrio, I; Kalogianni, E; Kärst, H; Kottelat, M; Kovačić, M; Laporte, M; Lorenzoni, M; Marčić, Z; Özuluğ, M; Perdices, A; Perea, S; Persat, H; Porcelotti, S; Puzzi, C; Robalo, J; Šanda, R; Schneider, M; Šlechtová, V; Stoumboudi, M; Walter, S; Freyhof, J

2014-11-01

Incomplete knowledge of biodiversity remains a stumbling block for conservation planning and even occurs within globally important Biodiversity Hotspots (BH). Although technical advances have boosted the power of molecular biodiversity assessments, the link between DNA sequences and species and the analytics to discriminate entities remain crucial. Here, we present an analysis of the first DNA barcode library for the freshwater fish fauna of the Mediterranean BH (526 spp.), with virtually complete species coverage (498 spp., 98% extant species). In order to build an identification system supporting conservation, we compared species determination by taxonomists to multiple clustering analyses of DNA barcodes for 3165 specimens. The congruence of barcode clusters with morphological determination was strongly dependent on the method of cluster delineation, but was highest with the general mixed Yule-coalescent (GMYC) model-based approach (83% of all species recovered as GMYC entity). Overall, genetic morphological discontinuities suggest the existence of up to 64 previously unrecognized candidate species. We found reduced identification accuracy when using the entire DNA-barcode database, compared with analyses on databases for individual river catchments. This scale effect has important implications for barcoding assessments and suggests that fairly simple identification pipelines provide sufficient resolution in local applications. We calculated Evolutionarily Distinct and Globally Endangered scores in order to identify candidate species for conservation priority and argue that the evolutionary content of barcode data can be used to detect priority species for future IUCN assessments. We show that large-scale barcoding inventories of complex biotas are feasible and contribute directly to the evaluation of conservation priorities. © 2014 John Wiley & Sons Ltd.
DNA barcode and identification of the varieties and provenances of Taiwan's domestic and imported made teas using ribosomal internal transcribed spacer 2 sequences.

PubMed

Lee, Shih-Chieh; Wang, Chia-Hsiang; Yen, Cheng-En; Chang, Chieh

2017-04-01

The major aim of made tea identification is to identify the variety and provenance of the tea plant. The present experiment used 113 tea plants [Camellia sinensis (L.) O. Kuntze] housed at the Tea Research and Extension Substation, from which 113 internal transcribed spacer 2 (ITS2) fragments, 104 trnL intron, and 98 trnL-trnF intergenic sequence region DNA sequences were successfully sequenced. The similarity of the ITS2 nucleotide sequences between tea plants housed at the Tea Research and Extension Substation was 0.379-0.994. In this polymerase chain reaction-amplified noncoding region, no varieties possessed identical sequences. Compared with the trnL intron and trnL-trnF intergenic sequence fragments of chloroplast cpDNA, the proportion of ITS2 nucleotide sequence variation was large and is more suitable for establishing a DNA barcode database to identify tea plant varieties. After establishing the database, 30 imported teas and 35 domestic made teas were used in this model system to explore the feasibility of using ITS2 sequences to identify the varieties and provenances of made teas. A phylogenetic tree was constructed using ITS2 sequences with the unweighted pair group method with arithmetic mean, which indicated that the same variety of tea plant is likely to be successfully categorized into one cluster, but contamination from other tea plants was also detected. This result provides molecular evidence that the similarity between important tea varieties in Taiwan remains high. We suggest a direct, wide collection of made tea and original samples of tea plants to establish an ITS2 sequence molecular barcode identification database to identify the varieties and provenances of tea plants. The DNA barcode comparison method can satisfy the need for a rapid, low-cost, frontline differentiation of the large amount of made teas from Taiwan and abroad, and can provide molecular evidence of their varieties and provenances. Copyright © 2016. Published by Elsevier B.V.
Identification of human remains from the Second World War mass graves uncovered in Bosnia and Herzegovina

PubMed Central

Marjanović, Damir; Hadžić Metjahić, Negra; Čakar, Jasmina; Džehverović, Mirela; Dogan, Serkan; Ferić, Elma; Džijan, Snježana; Škaro, Vedrana; Projić, Petar; Madžar, Tomislav; Rod, Eduard; Primorac, Dragan

2015-01-01

Aim: To present the results obtained in the identification of human remains from World War II found in two mass graves in Ljubuški, Bosnia and Herzegovina. Methods: Samples from 10 skeletal remains were collected. Teeth and femoral fragments were collected from 9 skeletons and only a femoral fragment from 1 skeleton. DNA was isolated from bone and teeth samples using an optimized phenol/chloroform DNA extraction procedure. All samples required a pre-extraction decalcification with EDTA and additional post-extraction DNA purification using filter columns. Additionally, DNA from 12 reference samples (buccal swabs from potential living relatives) was extracted using the Qiagen DNA extraction method. Quantifiler™ Human DNA Quantification Kit was used for DNA quantification. PowerPlex ESI kit was used to simultaneously amplify 15 autosomal short tandem repeat (STR) loci, and PowerPlex Y23 was used to amplify 23 Y chromosomal STR loci. Matching probabilities were estimated using a standard statistical approach. Results: A total of 10 samples were processed, 9 teeth and 1 femoral fragment. Nine of 10 samples were profiled using autosomal STR loci, which resulted in useful DNA profiles for 9 skeletal remains. A comparison of established victims' profiles against a reference sample database yielded 6 positive identifications. Conclusion: DNA analysis may efficiently contribute to the identification of remains even seven decades after the end of World War II. The significant percentage of positively identified remains (60%), even when the number of the examined possible living relatives was relatively small (only 12), proved the importance of cooperation with the members of the local community, who helped to identify the closest missing persons’ relatives and collect reference samples from them. PMID:26088850
Identification of human remains from the Second World War mass graves uncovered in Bosnia and Herzegovina.

PubMed

Marjanović, Damir; Hadžić Metjahić, Negra; Čakar, Jasmina; Džehverović, Mirela; Dogan, Serkan; Ferić, Elma; Džijan, Snježana; Škaro, Vedrana; Projić, Petar; Madžar, Tomislav; Rod, Eduard; Primorac, Dragan

2015-06-01

To present the results obtained in the identification of human remains from World War II found in two mass graves in Ljubuški, Bosnia and Herzegovina. Samples from 10 skeletal remains were collected. Teeth and femoral fragments were collected from 9 skeletons and only a femoral fragment from 1 skeleton. DNA was isolated from bone and teeth samples using an optimized phenol/chloroform DNA extraction procedure. All samples required a pre-extraction decalcification with EDTA and additional post-extraction DNA purification using filter columns. Additionally, DNA from 12 reference samples (buccal swabs from potential living relatives) was extracted using the Qiagen DNA extraction method. Quantifiler™ Human DNA Quantification Kit was used for DNA quantification. PowerPlex ESI kit was used to simultaneously amplify 15 autosomal short tandem repeat (STR) loci, and PowerPlex Y23 was used to amplify 23 Y chromosomal STR loci. Matching probabilities were estimated using a standard statistical approach. A total of 10 samples were processed, 9 teeth and 1 femoral fragment. Nine of 10 samples were profiled using autosomal STR loci, which resulted in useful DNA profiles for 9 skeletal remains. A comparison of established victims' profiles against a reference sample database yielded 6 positive identifications. DNA analysis may efficiently contribute to the identification of remains even seven decades after the end of World War II. The significant percentage of positively identified remains (60%), even when the number of the examined possible living relatives was relatively small (only 12), proved the importance of cooperation with the members of the local community, who helped to identify the closest missing persons' relatives and collect reference samples from them.
The first successful use of a low stringency familial match in a French criminal investigation.

PubMed

Pham-Hoai, Emmanuel; Crispino, Frank; Hampikian, Greg

2014-05-01

We describe how a very simple application of familial searching resolved a decade-old, high-profile rape/murder in France. This was the first use of familial searching in a criminal case using the French STR DNA database, which contains approximately 1,800,000 profiles. When an unknown forensic profile (18 loci) was searched against the French arrestee/offender database using CODIS configured for a low stringency search, a single low stringency match was identified. This profile was attributed to the father of the man suspected to be the source of the semen recovered from the murder victim Elodie Kulik. The identification was confirmed using Y-chromosome DNA from the putative father, an STR profile from the mother, and finally a tissue sample from the exhumed body of the man who left the semen. Because of this identification, the investigators are now pursuing possible co-conspirators. © 2014 American Academy of Forensic Sciences.
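The idea behind a low stringency search can be sketched as follows. This is an illustration only and does not reproduce CODIS or its configuration: it assumes that a low stringency match means the two profiles share at least one allele at every locus they have in common, as expected for a parent and child barring mutation. The function names and the three-locus toy profiles are hypothetical, not case data.

```python
def shares_allele_at_every_locus(profile_a, profile_b):
    """Return True if the profiles share at least one allele at each common locus.

    Each profile is a dict mapping a locus name to a tuple of allele values.
    """
    common_loci = profile_a.keys() & profile_b.keys()
    if not common_loci:
        return False  # nothing to compare, so no match can be claimed
    return all(set(profile_a[locus]) & set(profile_b[locus]) for locus in common_loci)


def familial_candidates(query, database):
    """List the identifiers of database profiles compatible with a close relative of the query."""
    return [
        person_id
        for person_id, profile in database.items()
        if shares_allele_at_every_locus(query, profile)
    ]


# Toy example with invented allele values.
crime_scene = {"D3S1358": (15, 16), "vWA": (17, 18), "FGA": (21, 24)}
offender_db = {
    "possible_father": {"D3S1358": (15, 17), "vWA": (18, 19), "FGA": (20, 21)},
    "unrelated": {"D3S1358": (14, 17), "vWA": (18, 19), "FGA": (22, 23)},
}
print(familial_candidates(crime_scene, offender_db))
```

A candidate list produced this way is only an investigative lead. As the abstract describes, the relationship was then confirmed with Y-chromosome DNA, the mother's STR profile, and a tissue sample.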

Raw Cow Milk Bacterial Population Shifts Attributable to Refrigeration

PubMed Central

Lafarge, Véronique; Ogier, Jean-Claude; Girard, Victoria; Maladen, Véronique; Leveau, Jean-Yves; Gruss, Alexandra; Delacroix-Buchet, Agnès

2004-01-01

We monitored the dynamic changes in the bacterial population in milk associated with refrigeration. Direct analyses of DNA by using temporal temperature gel electrophoresis (TTGE) and denaturing gradient gel electrophoresis (DGGE) allowed us to make accurate species assignments for bacteria with low-GC-content (low-GC%) (<55%) and medium- or high-GC% (>55%) genomes, respectively. We examined raw milk samples before and after 24-h conservation at 4°C. Bacterial identification was facilitated by comparison with an extensive bacterial reference database (∼150 species) that we established with DNA fragments of pure bacterial strains. Cloning and sequencing of fragments missing from the database were used to achieve complete species identification. Considerable evolution of bacterial populations occurred during conservation at 4°C. TTGE and DGGE are shown to be a powerful tool for identifying the main bacterial species of the raw milk samples and for monitoring changes in bacterial populations during conservation at 4°C. The emergence of psychrotrophic bacteria such as Listeria spp. or Aeromonas hydrophila is demonstrated. PMID:15345453
DNA barcoding insect–host plant associations

PubMed Central

Jurado-Rivera, José A.; Vogler, Alfried P.; Reid, Chris A.M.; Petitpierre, Eduard; Gómez-Zurita, Jesús

2008-01-01

Short-sequence fragments (‘DNA barcodes’) used widely for plant identification and inventorying remain to be applied to complex biological problems. Host–herbivore interactions are fundamental to coevolutionary relationships of a large proportion of species on the Earth, but their study is frequently hampered by limited or unreliable host records. Here we demonstrate that DNA barcodes can greatly improve this situation as they (i) provide a secure identification of host plant species and (ii) establish the authenticity of the trophic association. Host plants of leaf beetles (subfamily Chrysomelinae) from Australia were identified using the chloroplast trnL(UAA) intron as barcode amplified from beetle DNA extracts. Sequence similarity and phylogenetic analyses provided precise identifications of each host species at tribal, generic and specific levels, depending on the available database coverage in various plant lineages. The 76 species of Chrysomelinae included—more than 10 per cent of the known Australian fauna—feed on 13 plant families, with preference for Australian radiations of Myrtaceae (eucalypts) and Fabaceae (acacias). Phylogenetic analysis of beetles shows general conservation of host association but with rare host shifts between distant plant lineages, including a few cases where barcodes supported two phylogenetically distant host plants. The study demonstrates that plant barcoding is already feasible with the current publicly available data. By sequencing plant barcodes directly from DNA extractions made from herbivorous beetles, strong physical evidence for the host association is provided. Thus, molecular identification using short DNA fragments brings together the detection of species and the analysis of their interactions. PMID:19004756
Is the extraction by Whatman FTA filter matrix technology and sequencing of large ribosomal subunit D1-D2 region sufficient for identification of clinical fungi?

PubMed

Kiraz, Nuri; Oz, Yasemin; Aslan, Huseyin; Erturan, Zayre; Ener, Beyza; Akdagli, Sevtap Arikan; Muslumanoglu, Hamza; Cetinkaya, Zafer

2015-10-01

Although conventional identification of pathogenic fungi is based on the combination of tests evaluating their morphological and biochemical characteristics, they can fail to identify the less common species or the differentiation of closely related species. In addition these tests are time consuming, labour-intensive and require experienced personnel. We evaluated the feasibility and sufficiency of DNA extraction by Whatman FTA filter matrix technology and DNA sequencing of D1-D2 region of the large ribosomal subunit gene for identification of clinical isolates of 21 yeast and 160 moulds in our clinical mycology laboratory. While the yeast isolates were identified at species level with 100% homology, 102 (63.75%) clinically important mould isolates were identified at species level, 56 (35%) isolates at genus level against fungal sequences existing in DNA databases and two (1.25%) isolates could not be identified. Consequently, Whatman FTA filter matrix technology was a useful method for extraction of fungal DNA; extremely rapid, practical and successful. Sequence analysis strategy of D1-D2 region of the large ribosomal subunit gene was found considerably sufficient in identification to genus level for the most clinical fungi. However, the identification to species level and especially discrimination of closely related species may require additional analysis. © 2015 Blackwell Verlag GmbH.
Familial searching: a specialist forensic DNA profiling service utilising the National DNA Database to identify unknown offenders via their relatives--the UK experience.

PubMed

Maguire, C N; McCallum, L A; Storey, C; Whitaker, J P

2014-01-01

The National DNA Database (NDNAD) of England and Wales was established on April 10th 1995. The NDNAD is governed by a variety of legislative instruments that mean that DNA samples can be taken if an individual is arrested and detained in a police station. The biological samples and the DNA profiles derived from them can be used for purposes related to the prevention and detection of crime, the investigation of an offence and for the conduct of a prosecution. Following the South East Asian Tsunami of December 2004, the legislation was amended to allow the use of the NDNAD to assist in the identification of a deceased person or of a body part where death has occurred from natural causes or from a natural disaster. The UK NDNAD now contains the DNA profiles of approximately 6 million individuals representing 9.6% of the UK population. As the science of DNA profiling advanced, the National DNA Database provided a potential resource for increased intelligence beyond the direct matching for which it was originally created. The familial searching service offered to the police by several UK forensic science providers exploits the size and geographic coverage of the NDNAD and the fact that close relatives of an offender may share a significant proportion of that offender's DNA profile and will often reside in close geographic proximity to him or her. Between 2002 and 2011 Forensic Science Service Ltd. (FSS) provided familial search services to support 188 police investigations, 70 of which are still active cases. This technique, which may be used in serious crime cases or in 'cold case' reviews when there are few or no investigative leads, has led to the identification of 41 perpetrators or suspects. In this paper we discuss the processes, utility, and governance of the familial search service in which the NDNAD is searched for close genetic relatives of an offender who has left DNA evidence at a crime scene, but whose DNA profile is not represented within the NDNAD. 
We discuss the scientific basis of the familial search approach, other DNA-based methods for eliminating individuals from the candidate lists generated by these NDNAD searches, the value of filtering these lists by age, ethnic appearance and geography and the governance required by the NDNAD Strategy Board when a police force commissions a familial search. We present the FSS data in relation to the utility of the familial searching service and demonstrate the power of the technique by reference to casework examples. We comment on the uptake of familial searching of DNA databases in the USA, the Netherlands, Australia, and New Zealand. Finally, following the adverse ruling by the European Court of Human Rights against the UK in regard to the S & Marper cases and the consequent introduction of the Protection of Freedoms Act (2012), we discuss the impact that changes to regulations concerning the storage of DNA samples will have on the continuing provision of familial searching of the National DNA Database in England and Wales. Published by Elsevier Ireland Ltd.
Classification of Sharks in the Egyptian Mediterranean Waters Using Morphological and DNA Barcoding Approaches

PubMed Central

Moftah, Marie; Abdel Aziz, Sayeda H.; Elramah, Sara; Favereaux, Alexandre

2011-01-01

The identification of species constitutes the first basic step in phylogenetic studies, biodiversity monitoring and conservation. DNA barcoding, i.e. the sequencing of a short standardized region of DNA, has been proposed as a new tool for animal species identification. The present study provides an update on the shark composition of the Egyptian Mediterranean waters off Alexandria, since the latest study to date was performed 30 years ago. DNA barcoding was used in addition to classical taxonomical methodologies. Thus, 51 specimens were DNA barcoded for a 667 bp region of the mitochondrial COI gene. Although DNA barcoding aims at developing species identification systems, some phylogenetic signals were apparent in the data. In the neighbor-joining tree, 8 major clusters were apparent, each of them containing individuals belonging to the same species, and most with 100% bootstrap value. This study is the first to our knowledge to use DNA barcoding of the mitochondrial COI gene in order to confirm the presence of the species Squalus acanthias, Oxynotus centrina, Squatina squatina, Scyliorhinus canicula, Scyliorhinus stellaris, Mustelus mustelus, Mustelus punctulatus and Carcharhinus altimus in the Egyptian Mediterranean waters. Finally, our study is the starting point of a new barcoding database concerning shark composition in the Egyptian Mediterranean waters (Barcoding of Egyptian Mediterranean Sharks [BEMS], http://www.boldsystems.org/views/projectlist.php?&#Barcoding%20Fish%20%28FishBOL%29). PMID:22087242
Rapid Detection & Identification of Bacillus Species using MALDI-TOF/TOF and Biomarker Database

DTIC Science & Technology

2006-06-01

rRNA sequence analysis. Multilocus enzyme electrophoresis (MEE) and comparative DNA sequence analysis suggest that they may represent a single species...adaptation of the MEE method [63] but with greater discrimination [64]. All of these new PCR-based subtyping methods are certainly superior and more...Demirev, P.A., Lin, J.S., Pineda, F.J., and Fenselau, C. (2001). Bioinformatics and mass spectrometry for microorganism identification: proteome-wide
BEAUTY-X: enhanced BLAST searches for DNA queries.

PubMed

Worley, K C; Culpepper, P; Wiese, B A; Smith, R F

1998-01-01

BEAUTY (BLAST Enhanced Alignment Utility) is an enhanced version of the BLAST database search tool that facilitates identification of the functions of matched sequences. Three recent improvements to the BEAUTY program described here make the enhanced output (1) available for DNA queries, (2) available for searches of any protein database, and (3) more up-to-date, with periodic updates of the domain information. BEAUTY searches of the NCBI and EMBL non-redundant protein sequence databases are available from the BCM Search Launcher Web pages (http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html). BEAUTY Post-Processing of submitted search results is available using the BCM Search Launcher Batch Client (version 2.6) (ftp://gc.bcm.tmc.edu/pub/software/search-launcher/). Example figures are available at http://dot.bcm.tmc.edu:9331/papers/beautypp.html. (kworley,culpep)@bcm.tmc.edu
The National DNA Data Bank of Canada: a Quebecer perspective

PubMed Central

Milot, Emmanuel; Lecomte, Marie M. J.; Germain, Hugo; Crispino, Frank

2013-01-01

The Canadian National DNA Database was created in 1998 and first used in the mid-2000. Under management by the RCMP, the National DNA Data Bank of Canada offers each year satisfactory reported statistics for its use and efficiency. Built on two indexes (convicted offenders and crime scene indexes), the database not only provides increasing matches to offenders or linked traces to the various police forces of the nation, but offers a memory repository for cold cases. Despite these achievements, the data bank is now facing new challenges that will inevitably defy the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. Examples of new requirements from the database now include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper aims to develop on the original way set in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. Furthermore, it will not only request a new holistic/global approach of the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current use of the database, its potential for the future, and opportunity to expand as a result of recent technological developments in molecular biology, including, but not limited to DNA identification. PMID:24312124
The National DNA Data Bank of Canada: a Quebecer perspective.

PubMed

Milot, Emmanuel; Lecomte, Marie M J; Germain, Hugo; Crispino, Frank

2013-11-20

The Canadian National DNA Database was created in 1998 and first used in the mid-2000. Under management by the RCMP, the National DNA Data Bank of Canada offers each year satisfactory reported statistics for its use and efficiency. Built on two indexes (convicted offenders and crime scene indexes), the database not only provides increasing matches to offenders or linked traces to the various police forces of the nation, but offers a memory repository for cold cases. Despite these achievements, the data bank is now facing new challenges that will inevitably defy the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. Examples of new requirements from the database now include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper aims to develop on the original way set in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. Furthermore, it will not only request a new holistic/global approach of the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current use of the database, its potential for the future, and opportunity to expand as a result of recent technological developments in molecular biology, including, but not limited to DNA identification.
The Fulton County Medical Examiner's experience with the Federal Bureau of Investigation National Missing Person DNA Database Program, 2004-2007.

PubMed

Heninger, Michael; Hanzlick, Randy

2011-03-01

Medical examiners and coroners occasionally encounter unidentified human bodies, which remain unidentified for extended periods. In such cases, when traditional methods of identification have failed or cannot be used, DNA profiling may be used. The Federal Bureau of Investigation has a National Missing Person DNA database (NMPDD) laboratory to which samples may be submitted on such cases and from possible relatives or environments of unidentified decedents. This article describes the experience of the Fulton County Medical Examiner (FCME) in submitting samples to the NMPDD laboratory. A database was established at the FCME to track the submission of samples from unidentified decedents to the NMPDD laboratory for DNA testing along with the results and turnaround times. In December 2004, the FCME inventoried all cases for which samples were available and began to submit them to the NMPDD laboratory for testing. DNA testing and isolation rates, sample type, and turnaround times were tabulated in October 2006 for samples submitted between December 16, 2004 and December 16, 2005. An overall summary of data was also prepared concerning the status of all samples submitted as of April 17, 2007. During the 1-year study period, samples from 77 unidentified decedents were submitted to the laboratory. As of October 2006 (22 months after submission of the first samples and 10 months after submission of the last samples), testing had been completed on 53% of the samples submitted, and 68% of those tested resulted in a mitochondrial DNA profile. Turnaround times ranged from 66 to 557 days, improved with time, and had a mean of 107 days for specimens submitted during the latter part of the study period. As of April 17, 2007, we had submitted samples involving 84 unidentified decedents. Seventy-five percent of the samples have now been tested. 
Data from the NMPDD laboratory have resulted in 4 identifications by comparison with putative relatives, 4 exclusions, and no cold hits through comparison NMPDD DNA profiles from missing persons. More extensive data are presented in the body of this article. The NMPDD laboratory provides useful and free services to medical examiners, coroners, and law enforcement agencies that require DNA services regarding missing and unidentified persons. Turnaround times have improved. The success of the system in getting cold hits will be heavily dependent on law enforcement filing missing persons reports and submission of reference samples from putative relatives of the decedent. We recommend collecting specimens for DNA analysis early on in the postmortem investigation, submitting samples to the NMPDD laboratory or one of its participating laboratories when traditional methods for identification cannot be used or have failed, not burying bodies until a DNA profile has been obtained, and not cremating unidentified remains.
FBIS: A regional DNA barcode archival & analysis system for Indian fishes

PubMed Central

Nagpure, Naresh Sahebrao; Rashid, Iliyas; Pathak, Ajey Kumar; Singh, Mahender; Singh, Shri Prakash; Sarkar, Uttam Kumar

2012-01-01

DNA barcoding is a new tool for taxon recognition and classification of biological organisms based on the sequence of a fragment of the mitochondrial gene cytochrome c oxidase I (COI). In view of the growing importance of fish DNA barcoding for species identification, molecular taxonomy and fish diversity conservation, we developed a Fish Barcode Information System (FBIS) for Indian fishes, which will serve as a regional DNA barcode archival and analysis system. The database presently contains 2334 sequence records of COI gene for 472 aquatic species belonging to 39 orders and 136 families, collected from available published data sources. Additionally, it contains information on phenotype, distribution and IUCN Red List status of fishes. The web version of FBIS was designed using MySQL, Perl and PHP under Linux operating platform to (a) store and manage the acquisition, (b) analyze and explore DNA barcode records, and (c) identify species and estimate genetic divergence. FBIS has also been integrated with appropriate tools for retrieving and viewing information about the database statistics and taxonomy. It is expected that FBIS would be useful as a potent information system in fish molecular taxonomy, phylogeny and genomics. Availability: The database is available for free at http://mail.nbfgr.res.in/fbis/ PMID:22715304
DNA barcoding commercially important aquatic invertebrates of Turkey.

PubMed

Keskin, Emre; Atar, Hasan Hüseyin

2013-08-01

DNA barcoding was used to identify aquatic invertebrates sampled from fisheries bycatch and discards. A total of 440 unique cytochrome c oxidase subunit I (COI) barcodes were generated for 22 species from three important phyla (Arthropoda, Cnidaria, and Mollusca). All the species were sequenced using a 654-bp fragment of the mitochondrial COI gene and submitted to the GenBank and Barcode of Life Database (BOLD) databases. Two of them (Pontastacus leptodactylus and Rapana bezoar) were first records of the species for the BOLD database and six of them (Carcinus aestuarii, Loligo vulgaris, Melicertus kerathurus, Nephrops norvegicus, Scyllarides latus, and Scyllarus arctus) were first standard (>648 bp) COI barcode records for the GenBank database. COI barcodes were analyzed for nucleotide composition, nucleotide pair frequencies, and Kimura's two-parameter genetic distance. Mean genetic distance among species increased at higher taxonomic levels. The neighbor-joining trees generated were congruent with morphometric-based taxonomic classification. The findings of this study clearly demonstrate that DNA barcodes could be used as an efficient molecular tool for identifying not only target species from fisheries but also bycatch and discard species, and so could provide leverage for better monitoring and management of fisheries and biodiversity.
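Kimura's two-parameter (K2P) distance, mentioned in this abstract, corrects the observed proportion of differing sites for multiple substitutions while treating transitions and transversions separately. With P the proportion of transitional differences and Q the proportion of transversional differences, the distance is d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)). The sketch below is a minimal illustration on two pre-aligned sequences; the function name `k2p_distance` and the choice to skip gaps and ambiguous bases are assumptions made here, not details taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguous bases are ignored
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy 10-bp alignment with a single A->G transition (invented sequences).
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

For short, closely related barcodes the corrected distance stays close to the raw proportion of differences; the correction matters more as sequences diverge.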
Identification and reassessment of the specific status of some tropical freshwater midges (Diptera: Chironomidae) using DNA barcode data.

PubMed

Pramual, Pairot; Simwisat, Kusumart; Martin, Jon

2016-01-28

Chironomidae are a highly diverse group of insects. Members of this family are often included in programs monitoring the health of freshwater ecosystems. However, difficulty in morphological identification, particularly of larval stages, is the major obstacle to this application. In this study, we tested the efficiency of mitochondrial cytochrome c oxidase I (COI) sequences as the DNA barcoding region for species identification of Chironomidae in Thailand. The results revealed 14 species with a high success rate (>90%) for the correct species identification, which suggests the potential usefulness of the technique. However, some morphological species possess high (>3%) intraspecific genetic divergence that suggests these species could be species complexes and need further morphological or cytological examination. Sequence-based species delimitation analyses indicated that most specimens identified as Chironomus kiiensis, Tokunaga 1936, in Japan are conspecific with C. striatipennis, Kieffer 1912, although a small number form a separate cluster. A review of the descriptions of Kiefferulus tainanus (Kieffer 1912) and its junior synonym, K. biroi (Kieffer 1918), following our results, suggests that this synonymy is probably not correct and that K. tainanus occurs in Japan, China and Singapore, while K. biroi occurs in India and Thailand. Our results therefore revealed the usefulness of DNA barcoding for correct species identification of Chironomidae, particularly the immature stages. In addition, DNA barcodes could also uncover hidden diversity that can guide further taxonomic study, and offer a more efficient way to identify species than morphological analysis where large numbers of specimens are involved, provided the identifications of DNA barcodes in the databases are correct. Our studies indicate that this is not the case, and we identify cases of misidentifications for C. flaviplumus, Tokunaga 1940 and K. tainanus.
DNA barcoding of Clarias gariepinus, Coptodon zillii and Sarotherodon melanotheron from Southwestern Nigeria

PubMed Central

Falade, Mofolusho O.; Opene, Anthony J.; Benson, Otarigho

2016-01-01

DNA barcoding has been adopted as a gold standard rapid, precise and unifying identification system for animal species and provides a database of genetic sequences that can be used as a tool for universal species identification. In this study, we employed mitochondrial genes 16S rRNA (16S) and cytochrome oxidase subunit I (COI) for the identification of some Nigerian freshwater catfish and Tilapia species. Approximately 655 bp were amplified from the 5′ region of the mitochondrial cytochrome C oxidase subunit I (COI) gene whereas 570 bp were amplified for the 16S rRNA gene. Nucleotide divergences among sequences were estimated based on Kimura 2-parameter distances and the genetic relationships were assessed by constructing phylogenetic trees using the neighbour-joining (NJ) and maximum likelihood (ML) methods. Analyses of consensus barcode sequences for each species, and alignment of individual sequences from within a given species revealed highly consistent barcodes (99% similarity on average), which could be compared with deposited sequences in public databases. The nucleotide distance between species belonging to different genera based on COI ranged from 0.17% between Sarotherodon melanotheron and Coptodon zillii to 0.49% between Clarias gariepinus and C. zillii, indicating that S. melanotheron and C. zillii are closely related. Based on the data obtained, the utility of COI gene was confirmed in accurate identification of three fish species from Southwest Nigeria. PMID:27990256
Bioethical Biobanks: Three Concerns in Designing and Using Law Enforcement DNA Identification Databases

DOE Office of Scientific and Technical Information (OSTI.GOV)

D.H. Kaye

2006-10-19

Federal and state law enforcement authorities have amassed large collections of DNA samples and the identifying profiles derived from them. These databases help to identify the guilty and to exonerate the innocent, but as the databanks grow, so do fears about civil liberties. The research reported here discusses three legal and social policy issues that have been raised in regard to these biobanks—the choice of loci to type for identifying individuals, the indefinite retention of DNA samples, and the use of the DNA samples or the identifying profiles for research purposes. It also considers the possible value of the databases for research into the genetics of human behavior and the ethics of using them for this purpose. It rejects the broad claim that such research is inherently unethical but proposes procedures for ensuring that the value of the proposed research justifies any psychosocial or other risks to the subjects of the research.
Dangers resulting from DNA profiling of biological materials derived from patients after allogeneic hematopoietic stem cell transplantation (allo-HSCT) with regard to forensic genetic analysis.

PubMed

Jacewicz, R; Lewandowski, K; Rupa-Matysek, J; Jędrzejczyk, M; Berent, J

The study documents the risk that comes with DNA analysis of materials derived from patients after allogeneic hematopoietic stem cell transplantation (allo-HSCT) in forensic genetics. DNA chimerism was studied in 30 patients after allo-HSCT, based on techniques applied in contemporary forensic genetics, i.e. real-time PCR and multiplex PCR-STR with the use of autosomal DNA as well as Y-DNA markers. The results revealed that the DNA profile of the recipient's blood was identical with the donor's in the majority of cases. Therefore, blood analysis can lead to false conclusions in personal identification as well as kinship analysis. An investigation of buccal swabs revealed a mixture of DNA in the majority of recipients. Consequently, personal identification on the basis of stain analysis of the same origin may be impossible. The safest (but not ideal) material turned out to be the hair root. Its analysis based on autosomal DNA revealed 100% of the recipient's profile. However, an analysis based on Y-chromosome markers performed in female allo-HSCT recipients with male donors demonstrated the presence of donor DNA in hair cells - similarly to the blood and buccal swabs. In the light of potential risks arising from DNA profiling of biological materials derived from persons after allotransplantation in judicial aspects, certain procedures were proposed to eliminate such dangers. The basic procedures include abandoning the approach based exclusively on blood collection, both for kinship analysis and personal identification; asking persons who are to be tested about their history of allo-HSCT before sample collection and profile entry in the DNA database, and verification of DNA profiling based on hair follicles in uncertain cases.
DNA-Based Identification of Forensically Important Blow Flies (Diptera: Calliphoridae) From India.

PubMed

Bharti, Meenakshi; Singh, Baneshwar

2017-09-01

Correct species identification is the first and the most important criterion in entomological evidence-based postmortem interval (PMI) estimation. Although morphological keys are available for species identification of adult blow flies, keys for immature stages are either lacking or are incomplete. In this study, cytochrome oxidase subunit 1 (COI) reference data were developed from nine species (belonging to three subfamilies, namely, Calliphorinae, Luciliinae, and Chrysomyinae) of blow flies from India. Seven of the nine species included in this study were found suitable for DNA-based identification using the COI gene, because they showed nonoverlapping intraspecific (0.0-0.3%) and interspecific (1.96-18.14%) diversity, and formed well-supported monophyletic clades in phylogenetic analysis. The remaining two species (i.e., Chrysomya megacephala (Fabricius) and Chrysomya chani Kurahashi) cannot be distinguished reliably using our database because they had a very low interspecific diversity (0.11%), and Ch. megacephala was paraphyletic with respect to Ch. chani in the phylogenetic analysis. We conclude that the COI gene is a useful marker for DNA-based identification of blow flies from India. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Mini-DNA barcode in identification of the ornamental fish: A case study from Northeast India.

PubMed

Dhar, Bishal; Ghosh, Sankar Kumar

2017-09-05

Ornamental fishes are exported under trade names or generic names, which creates problems in species identification. In this regard, DNA barcoding could effectively elucidate the actual species status. However, problems arise if the specimen is the subject of taxonomic disputes, is mislabeled under trade or generic names, etc. On the other hand, barcoding archival museum specimens would be of greater benefit in addressing such issues, as it would create a firm, error-free reference database for rapid identification of any species. This can be achieved only by generating short sequences, as DNA from chemically preserved specimens is mostly degraded. Here we aimed to identify a short stretch of informative sites within the full-length barcode segment, capable of delineating a diverse group of ornamental fish species commonly traded from NE India. We analyzed 287 full-length barcode sequences from the major fish orders and compared the interspecific K2P distance with nucleotide substitution patterns and found a strong correlation of interspecies distance with transversions (0.95, p<0.001). We, therefore, proposed a short, transversion-rich 171 bp segment as a mini-barcode. The proposed segment was compared with the full-length barcodes and found to delineate the species effectively. Successful PCR amplification and sequencing of the 171 bp segment using designed primers for different orders validated it as a mini-barcode for ornamental fishes. Thus, our findings would be helpful in strengthening the global database with the sequences of archived fish species, as well as providing an effective identification tool for traded ornamental fish species as a less time-consuming, cost-effective field-based application. Copyright © 2017 Elsevier B.V. All rights reserved.
DNA barcoding of odonates from the Upper Plata basin: Database creation and genetic diversity estimation

PubMed Central

Pepinelli, Mateus; Rodrigues, Marciel Elio; Roque, Fabio de Oliveira; Lorenz-Lemke, Aline Pedroso; Kvist, Sebastian

2017-01-01

We present a DNA barcoding study of Neotropical odonates from the Upper Plata basin, Brazil. A total of 38 species were collected in a transition region of “Cerrado” and Atlantic Forest, both regarded as biological hotspots, and 130 cytochrome c oxidase subunit I (COI) barcodes were generated for the collected specimens. The distinct gap between intraspecific (0–2%) and interspecific variation (15% and above) in COI, and resulting separation of Barcode Index Numbers (BIN), allowed for successful identification of specimens in 94% of cases. The 6% fail rate was due to a shared BIN between two separate nominal species. DNA barcoding, based on COI, thus seems to be a reliable and efficient tool for identifying Neotropical odonate specimens down to the species level. These results underscore the utility of DNA barcoding to aid specimen identification in diverse biological hotspots, areas that require urgent action regarding taxonomic surveys and biodiversity conservation. PMID:28763495
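The "barcode gap" described in this record (intraspecific variation well below interspecific variation) can be checked with a minimal sketch. It uses an uncorrected p-distance on aligned sequences as a stand-in for whatever distance model the authors applied, and it does not reproduce the BIN assignment itself; the function names and the input layout are assumptions made for illustration.

```python
from itertools import combinations

def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)

def barcode_gap(specimens):
    """Return (max intraspecific, min interspecific, gap_exists).

    specimens: list of (species_name, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(specimens, 2):
        # Pairs from the same species feed the intraspecific pool.
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter) if inter else float("inf")
    return max_intra, min_inter, min_inter > max_intra
```

A gap exists when the smallest between-species distance exceeds the largest within-species distance; two nominal species that share a BIN, as in the 6% of failed cases, would show up here as a missing gap.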
Short tandem repeat DNA typing provides an international reference standard for authentication of human cell lines.

PubMed

Dirks, Wilhelm Gerhard; Faehnrich, Silke; Estella, Isabelle Annick Janine; Drexler, Hans Guenter

2005-01-01

Cell lines have wide applications as model systems in the medical and pharmaceutical industry. Much drug and chemical testing is now first carried out exhaustively on in vitro systems, reducing the need for complicated and invasive animal experiments. The basis for any research, development or production program involving cell lines is the choice of an authentic cell line. Microsatellites in the human genome that harbour short tandem repeat (STR) DNA markers allow individualisation of established cell lines at the DNA level. Fluorescence polymerase chain reaction amplification of eight highly polymorphic microsatellite STR loci plus gender determination was found to be the best tool to screen the uniqueness of DNA profiles in a fingerprint database. Our results demonstrate that cross-contamination and misidentification remain chronic problems in the use of human continuous cell lines. The combination of rapidly generated DNA types based on single-locus STR and their authentication or individualisation by screening the fingerprint database constitutes a highly reliable and robust method for the identification and verification of cell lines.

Use of rbcL and trnL-F as a Two-Locus DNA Barcode for Identification of NW-European Ferns: An Ecological Perspective

PubMed Central

de Groot, G. Arjen; During, Heinjo J.; Maas, Jan W.; Schneider, Harald; Vogel, Johannes C.; Erkens, Roy H. J.

2011-01-01

Although consensus has now been reached on a general two-locus DNA barcode for land plants, the selected combination of markers (rbcL + matK) is not applicable for ferns at the moment. Yet especially for ferns, DNA barcoding is potentially of great value since fern gametophytes, while playing an essential role in fern colonization and reproduction, generally lack the morphological complexity for morphology-based identification and have therefore been underappreciated in ecological studies. We evaluated the potential of a combination of rbcL with a noncoding plastid marker, trnL-F, to obtain DNA-identifications for fern species. A regional approach was adopted, by creating a reference database of trusted rbcL and trnL-F sequences for the wild-occurring homosporous ferns of NW-Europe. A combination of parsimony analyses and distance-based analyses was performed to evaluate the discriminatory power of the two-region barcode. DNA was successfully extracted from 86 tiny fern gametophytes and was used as a test case for the performance of DNA-based identification. Primer universality proved high for both markers. Based on the combined rbcL + trnL-F dataset, all genera as well as all species with non-equal chloroplast genomes formed their own well supported monophyletic clade, indicating a high discriminatory power. Interspecific distances were larger than intraspecific distances for all tested taxa. Identification tests on gametophytes showed a comparable result. All test samples could be identified to genus level, and species identification was possible unless they belonged to a pair of Dryopteris species with completely identical chloroplast genomes. Our results suggest a high potential of the combined use of rbcL and trnL-F as a two-locus cpDNA barcode for identification of fern species. A regional approach may be preferred for ecological tests. We here offer such a ready-to-use barcoding approach for ferns, which opens the way for answering a whole range of questions previously unaddressed in fern gametophyte ecology. PMID:21298108
Establishment of a matrix-assisted laser desorption ionization time-of-flight mass spectrometry database for rapid identification of infectious achlorophyllous green micro-algae of the genus Prototheca.

PubMed

Murugaiyan, J; Ahrholdt, J; Kowbel, V; Roesler, U

2012-05-01

The possibility of using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) for rapid identification of pathogenic and non-pathogenic species of the genus Prototheca has been recently demonstrated. A unique reference database of MALDI-TOF MS profiles for type and reference strains of the six generally accepted Prototheca species was established. The database quality was reinforced after the acquisition of 27 spectra for selected Prototheca strains, with three biological and technical replicates for each of 18 type and reference strains of Prototheca and four strains of Chlorella. This provides reproducible and unique spectra covering a wide m/z range (2000-20 000 Da) for each of the strains used in the present study. The reproducibility of the spectra was further confirmed by employing composite correlation index calculation and main spectra library (MSP) dendrogram creation, available with MALDI Biotyper software. The MSP dendrograms obtained were comparable with the 18S rDNA sequence-based dendrograms. These reference spectra were successfully added to the Bruker database, and the efficiency of identification was evaluated by cross-reference-based and unknown Prototheca identification. It is proposed that the addition of further strains would reinforce the reference spectra library for rapid identification of Prototheca strains to the genus and species/genotype level. © 2011 The Authors. Clinical Microbiology and Infection © 2011 European Society of Clinical Microbiology and Infectious Diseases.
Cloud-based adaptive exon prediction for DNA analysis.

PubMed

Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen

2018-02-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. Accurate identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques were found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using the variable normalised least mean square algorithm and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
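The three-base periodicity property that this record relies on can be illustrated with a plain Fourier-based sketch. This is not the authors' adaptive (normalised least mean square) predictor; it only shows the underlying signal, under the assumption that the sequence is mapped to four binary indicator sequences (one per base) and the spectral power is evaluated at frequency 1/3. The function name `period3_power` is hypothetical.

```python
import cmath

def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the A, C, G, T
    indicator sequences and normalised by sequence length."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the 0/1 indicator for this base.
        component = sum(
            cmath.exp(-2j * cmath.pi * i / 3)
            for i, ch in enumerate(seq)
            if ch == base
        )
        total += abs(component) ** 2
    return total / n
```

Sliding this measure over a window along a genomic sequence gives a crude exon indicator: windows with strong codon-position bias score high, while sequence without period-3 structure scores near zero.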
The problems and promise of DNA barcodes for species diagnosis of primate biomaterials

PubMed Central

Lorenz, Joseph G; Jackson, Whitney E; Beck, Jeanne C; Hanner, Robert

2005-01-01

The Integrated Primate Biomaterials and Information Resource (www.IPBIR.org) provides essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing DNA and RNA derived from primate cell cultures. The IPBIR uses mitochondrial cytochrome c oxidase subunit I sequences to verify the identity of samples for quality control purposes in the accession, cell culture, DNA extraction processes and prior to shipping to end users. As a result, IPBIR is accumulating a database of ‘DNA barcodes’ for many species of primates. However, this quality control process is complicated by taxon specific patterns of ‘universal primer’ failure, as well as the amplification or co-amplification of nuclear pseudogenes of mitochondrial origins. To overcome these difficulties, taxon specific primers have been developed, and reverse transcriptase PCR is utilized to exclude these extraneous sequences from amplification. DNA barcoding of primates has applications to conservation and law enforcement. Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics. PMID:16214744
REDIdb: the RNA editing database.

PubMed

Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

2007-01-01

The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.
Identification of Medically Important Yeasts Using PCR-Based Detection of DNA Sequence Polymorphisms in the Internal Transcribed Spacer 2 Region of the rRNA Genes

PubMed Central

Chen, Y. C.; Eisner, J. D.; Kattar, M. M.; Rassoulian-Barrett, S. L.; LaFe, K.; Yarfitz, S. L.; Limaye, A. P.; Cookson, B. T.

2000-01-01

Identification of medically relevant yeasts can be time-consuming and inaccurate with current methods. We evaluated PCR-based detection of sequence polymorphisms in the internal transcribed spacer 2 (ITS2) region of the rRNA genes as a means of fungal identification. Clinical isolates (401), reference strains (6), and type strains (27), representing 34 species of yeasts were examined. The length of PCR-amplified ITS2 region DNA was determined with single-base precision in less than 30 min by using automated capillary electrophoresis. Unique, species-specific PCR products ranging from 237 to 429 bp were obtained from 92% of the clinical isolates. The remaining 8%, divided into groups with ITS2 regions which differed by ≤2 bp in mean length, all contained species-specific DNA sequences easily distinguishable by restriction enzyme analysis. These data, and the specificity of length polymorphisms for identifying yeasts, were confirmed by DNA sequence analysis of the ITS2 region from 93 isolates. Phenotypic and ITS2-based identification was concordant for 427 of 434 yeast isolates examined using sequence identity of ≥99%. Seven clinical isolates contained ITS2 sequences that did not agree with their phenotypic identification, and ITS2-based phylogenetic analyses indicate the possibility of new or clinically unusual species in the Rhodotorula and Candida genera. This work establishes an initial database, validated with over 400 clinical isolates, of ITS2 length and sequence polymorphisms for 34 species of yeasts. We conclude that size and restriction analysis of PCR-amplified ITS2 region DNA is a rapid and reliable method to identify clinically significant yeasts, including potentially new or emerging pathogenic species. PMID:10834993
Definition of the Beijing/W lineage of Mycobacterium tuberculosis on the basis of genetic markers.

PubMed

Kremer, Kristin; Glynn, Judith R; Lillebaek, Troels; Niemann, Stefan; Kurepina, Natalia E; Kreiswirth, Barry N; Bifani, Pablo J; van Soolingen, Dick

2004-09-01

Mycobacterium tuberculosis Beijing genotype strains are highly prevalent in Asian countries and in the territory of the former Soviet Union. They are increasingly reported in other areas of the world and are frequently associated with tuberculosis outbreaks and drug resistance. Beijing genotype strains, including W strains, have been characterized by their highly similar multicopy IS6110 restriction fragment length polymorphism (RFLP) patterns, deletion of spacers 1 to 34 in the direct repeat region (Beijing spoligotype), and insertion of IS6110 in the genomic dnaA-dnaN locus. In this study the suitability and comparability of these three genetic markers to identify members of the Beijing lineage were evaluated. In a well-characterized collection of 1,020 M. tuberculosis isolates representative of the IS6110 RFLP genotypes found in The Netherlands, strains of two clades had spoligotypes characteristic of the Beijing lineage. A set of 19 Beijing reference RFLP patterns was selected to retrieve all Beijing strains from the Dutch database. These reference patterns gave a sensitivity of 98.1% and a specificity of 99.7% for identifying Beijing strains (defined by spoligotyping) in an international database of 1,084 strains. The usefulness of the reference patterns was also assessed with large DNA fingerprint databases in two other European countries and for identification of strains from the W lineage found in the United States. A standardized definition for the identification of M. tuberculosis strains belonging to the Beijing/W lineage, as described in this work, will facilitate further studies on the spread and characterization of this widespread genotype family of M. tuberculosis strains.
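The sensitivity and specificity figures quoted in this record follow from a standard comparison of a test classification (here, matching a reference RFLP pattern) against a reference classification (here, spoligotyping). The sketch below shows that calculation under the assumption that both classifications are available as parallel lists of booleans; the abstract does not give the underlying counts, so none are reproduced here.

```python
def sensitivity_specificity(reference, predicted):
    """Compare predicted labels against reference labels.

    reference, predicted: equal-length sequences of booleans, where True
    means "belongs to the lineage".
    Returns (sensitivity, specificity).
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    sensitivity = tp / (tp + fn)  # share of true members that were retrieved
    specificity = tn / (tn + fp)  # share of non-members correctly excluded
    return sensitivity, specificity
```

The function raises `ZeroDivisionError` if the reference contains no members or no non-members, which is the appropriate signal that the measure is undefined for that dataset.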
Forensic DNA methylation profiling from evidence material for investigative leads

PubMed Central

Lee, Hwan Young; Lee, Soong Deok; Shin, Kyoung-Jin

2016-01-01

DNA methylation is emerging as an attractive marker providing investigative leads to solve crimes in forensic genetics. The identification of body fluids that utilizes tissue-specific DNA methylation can contribute to solving crimes by predicting activity related to the evidence material. The age estimation based on DNA methylation is expected to reduce the number of potential suspects, when the DNA profile from the evidence does not match any known person, including those stored in the forensic database. Moreover, the variation in DNA implicates environmental exposure, such as cigarette smoking and alcohol consumption, thereby suggesting that it could be used as a marker for predicting the lifestyle of a potential suspect. In this review, we describe recent advances in our understanding of DNA methylation variations and the utility of DNA methylation as a forensic marker for advanced investigative leads from evidence materials. [BMB Reports 2016; 49(7): 359-369] PMID:27099236
The future of forensic DNA analysis

PubMed Central

Butler, John M.

2015-01-01

The author's thoughts and opinions on where the field of forensic DNA testing is headed for the next decade are provided in the context of where the field has come over the past 30 years. Similar to the Olympic motto of ‘faster, higher, stronger’, forensic DNA protocols can be expected to become more rapid and sensitive and provide stronger investigative potential. New short tandem repeat (STR) loci have expanded the core set of genetic markers used for human identification in Europe and the USA. Rapid DNA testing is on the verge of enabling new applications. Next-generation sequencing has the potential to provide greater depth of coverage for information on STR alleles. Familial DNA searching has expanded capabilities of DNA databases in parts of the world where it is allowed. Challenges and opportunities that will impact the future of forensic DNA are explored including the need for education and training to improve interpretation of complex DNA profiles. PMID:26101278
Mitochondrial heteroplasmy and DNA barcoding in Hawaiian Hylaeus (Nesoprosopis) bees (Hymenoptera: Colletidae).

PubMed

Magnacca, Karl N; Brown, Mark J F

2010-06-11

The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely-related species with a well-established morphological taxonomy. In this study we examine both of these issues, by densely sampling the Hawaiian Hylaeus bee radiation. Individuals from 21 of the 49 a priori morphologically-defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms- and resulting increased time and labor required - makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna.
Mitochondrial heteroplasmy and DNA barcoding in Hawaiian Hylaeus (Nesoprosopis) bees (Hymenoptera: Colletidae)

PubMed Central

2010-01-01

Background: The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely-related species with a well-established morphological taxonomy. In this study we examine both of these issues, by densely sampling the Hawaiian Hylaeus bee radiation. Results: Individuals from 21 of the 49 a priori morphologically-defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Conclusions: Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms- and resulting increased time and labor required - makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna. PMID:20540728
Uncommonly isolated clinical Pseudomonas: identification and phylogenetic assignation.

PubMed

Mulet, M; Gomila, M; Ramírez, A; Cardew, S; Moore, E R B; Lalucat, J; García-Valdés, E

2017-02-01

Fifty-two Pseudomonas strains that were difficult to identify at the species level in the phenotypic routine characterizations employed by clinical microbiology laboratories were selected for genotypic-based analysis. Species level identifications were done initially by partial sequencing of the DNA dependent RNA polymerase sub-unit D gene (rpoD). Two other gene sequences, for the small sub-unit ribosomal RNA (16S rRNA) and for DNA gyrase sub-unit B (gyrB), were added in a multilocus sequence analysis (MLSA) study to confirm the species identifications. These sequences were analyzed with a collection of reference sequences from the type strains of 161 Pseudomonas species within an in-house multi-locus sequence analysis database. Whole-cell matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analyses of these strains complemented the DNA sequence-based phylogenetic analyses and were observed to be in accordance with the results of the sequence data. Twenty-three out of 52 strains were assigned to 12 recognized species not commonly detected in clinical specimens and 29 (56 %) were considered representatives of at least ten putative new species. Most strains were distributed within the P. fluorescens and P. aeruginosa lineages. The value of rpoD sequences in species-level identifications for Pseudomonas is emphasized. Correct species identification of clinical strains is essential for establishing the intrinsic antibiotic resistance patterns and improving treatment plans.
Molecular genetic identification of skeletal remains from the Second World War Konfin I mass grave in Slovenia

PubMed Central

Zupanič Pajnič, Irena; Gornjak Pogorelc, Barbara; Balažic, Jože

2010-01-01

This paper describes molecular genetic identification of one third of the skeletal remains of 88 victims of postwar (June 1945) killings found in the Konfin I mass grave in Slovenia. Living relatives were traced for 36 victims. We analyzed 84 right femurs and compared their genetic profiles to the genetic material of living relatives. We cleaned the bones, removed surface contamination, and ground the bones into powder. Prior to DNA isolation using Biorobot EZ1 (Qiagen), the powder was decalcified. The nuclear DNA of the samples was quantified using the real-time polymerase chain reaction method. We extracted 0.8 to 100 ng DNA/g of bone powder from 82 bones. Autosomal genetic profiles and Y-chromosome haplotypes were obtained from 98% of the bones, and mitochondrial DNA (mtDNA) haplotypes from 95% of the bones for the HVI region and from 98% of the bones for the HVII region. Genetic profiles of the nuclear and mtDNA were determined for reference persons. For traceability in the event of contamination, we created an elimination database including genetic profiles of the nuclear and mtDNA of all persons that had been in contact with the skeletal remains. When comparing genetic profiles, we matched 28 of the 84 bones analyzed with living relatives (brothers, sisters, sons, daughters, nephews, or cousins). The statistical analyses showed a high confidence of correct identification for all 28 victims in the Konfin I mass grave (posterior probability ranged from 99.9% to more than 99.999999%). PMID:20217112
Molecular genetic identification of skeletal remains from the Second World War Konfin I mass grave in Slovenia.

PubMed

Zupanic Pajnic, Irena; Gornjak Pogorelc, Barbara; Balazic, Joze

2010-07-01

This paper describes molecular genetic identification of one third of the skeletal remains of 88 victims of postwar (June 1945) killings found in the Konfin I mass grave in Slovenia. Living relatives were traced for 36 victims. We analyzed 84 right femurs and compared their genetic profiles to the genetic material of living relatives. We cleaned the bones, removed surface contamination, and ground the bones into powder. Prior to DNA isolation using Biorobot EZ1 (Qiagen), the powder was decalcified. The nuclear DNA of the samples was quantified using the real-time polymerase chain reaction method. We extracted 0.8 to 100 ng DNA/g of bone powder from 82 bones. Autosomal genetic profiles and Y-chromosome haplotypes were obtained from 98% of the bones, and mitochondrial DNA (mtDNA) haplotypes from 95% of the bones for the HVI region and from 98% of the bones for the HVII region. Genetic profiles of the nuclear and mtDNA were determined for reference persons. For traceability in the event of contamination, we created an elimination database including genetic profiles of the nuclear and mtDNA of all persons that had been in contact with the skeletal remains. When comparing genetic profiles, we matched 28 of the 84 bones analyzed with living relatives (brothers, sisters, sons, daughters, nephews, or cousins). The statistical analyses showed a high confidence of correct identification for all 28 victims in the Konfin I mass grave (posterior probability ranged from 99.9% to more than 99.999999%).
DNA barcode data accurately assign higher spider taxa

PubMed Central

Coddington, Jonathan A.; Agnarsson, Ingi; Cheng, Ren-Chung; Čandek, Klemen; Driskell, Amy; Frick, Holger; Gregorič, Matjaž; Kostanjšek, Rok; Kropf, Christian; Kweskin, Matthew; Lokovšek, Tjaša; Pipan, Miha; Vidergar, Nina

2016-01-01

The use of unique DNA sequences as a method for taxonomic identification is no longer fundamentally controversial, even though debate continues on the best markers, methods, and technology to use. Although both existing databanks such as GenBank and BOLD, as well as reference taxonomies, are imperfect, in best case scenarios “barcodes” (whether single or multiple, organelle or nuclear, loci) clearly are an increasingly fast and inexpensive method of identification, especially as compared to manual identification of unknowns by increasingly rare expert taxonomists. Because most species on Earth are undescribed, a complete reference database at the species level is impractical in the near term. The question therefore arises whether unidentified species can, using DNA barcodes, be accurately assigned to more inclusive groups such as genera and families—taxonomic ranks of putatively monophyletic groups for which the global inventory is more complete and stable. We used a carefully chosen test library of CO1 sequences from 49 families, 313 genera, and 816 species of spiders to assess the accuracy of genus and family-level assignment. We used BLAST queries of each sequence against the entire library and got the top ten hits. The percent sequence identity was reported from these hits (PIdent, range 75–100%). Accurate assignment of higher taxa (PIdent above which errors totaled less than 5%) occurred for genera at PIdent values >95 and families at PIdent values ≥ 91, suggesting these as heuristic thresholds for accurate generic and familial identifications in spiders. Accuracy of identification increases with numbers of species/genus and genera/family in the library; above five genera per family and fifteen species per genus all higher taxon assignments were correct. We propose that using percent sequence identity between conventional barcode sequences may be a feasible and reasonably accurate method to identify animals to family/genus. 
However, the quality of the underlying database impacts accuracy of results; many outliers in our dataset could be attributed to taxonomic and/or sequencing errors in BOLD and GenBank. It seems that an accurate and complete reference library of families and genera of life could provide accurate higher level taxonomic identifications cheaply and accessibly, within years rather than decades. PMID:27547527
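The heuristic thresholds reported in this abstract can be illustrated with a minimal sketch. It assumes the percent identity (PIdent) of the top BLAST hit has already been obtained; the function name `assignable_rank` and the constant names are hypothetical, and the cut-offs are only the spider CO1 values quoted above, not general-purpose limits.

```python
# Heuristic cut-offs quoted in the abstract (spiders, CO1 barcodes):
# genus-level assignment at PIdent > 95, family-level at PIdent >= 91.
GENUS_MIN_EXCLUSIVE = 95.0
FAMILY_MIN_INCLUSIVE = 91.0


def assignable_rank(pident: float) -> str:
    """Return the most specific rank the top hit supports (hypothetical helper)."""
    if not 0.0 <= pident <= 100.0:
        raise ValueError("percent identity must lie in [0, 100]")
    if pident > GENUS_MIN_EXCLUSIVE:
        return "genus"
    if pident >= FAMILY_MIN_INCLUSIVE:
        return "family"
    return "unassigned"
```

Under these assumptions a hit at exactly 95% identity supports only a family-level assignment, because the genus threshold is strict.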
Validation of a New Web Application for Identification of Fungi by Use of Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry

PubMed Central

Becker, P.; Gabriel, F.; Cassagne, C.; Accoceberry, I.; Gari-Toussaint, M.; Hasseine, L.; De Geyter, D.; Pierard, D.; Surmont, I.; Djenad, F.; Donnadieu, J. L.; Piarroux, M.; Hendrickx, M.; Piarroux, R.

2017-01-01

ABSTRACT Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) mass spectrometry has emerged as a reliable technique to identify molds involved in human diseases, including dermatophytes, provided that exhaustive reference databases are available. This study assessed an online identification application based on original algorithms and an extensive in-house reference database comprising 11,851 spectra (938 fungal species and 246 fungal genera). Validation criteria were established using an initial panel of 422 molds, including dermatophytes, previously identified via DNA sequencing (126 species). The application was further assessed using a separate panel of 501 cultured clinical isolates (88 mold taxa including dermatophytes) derived from five hospital laboratories. A total of 438 (87.35%) isolates were correctly identified at the species level, while 26 (5.22%) were assigned to the correct genus but the wrong species and 37 (7.43%) were not identified, since the defined threshold of 20 was not reached. The use of the Bruker Daltonics database included in the MALDI Biotyper software resulted in a much higher rate of unidentified isolates (39.76 and 74.30% using the score thresholds 1.7 and 2.0, respectively). Moreover, the identification delay of the online application remained compatible with real-time online queries (0.15 s per spectrum), and the application was faster than identifications using the MALDI Biotyper software. This is the first study to assess an online identification system based on MALDI-TOF spectrum analysis. We have successfully applied this approach to identify molds, including dermatophytes, for which diversity is insufficiently represented in commercial databases. This free-access application is available to medical mycologists to improve fungal identification. PMID:28637907
Validation of a New Web Application for Identification of Fungi by Use of Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry.

PubMed

Normand, A C; Becker, P; Gabriel, F; Cassagne, C; Accoceberry, I; Gari-Toussaint, M; Hasseine, L; De Geyter, D; Pierard, D; Surmont, I; Djenad, F; Donnadieu, J L; Piarroux, M; Ranque, S; Hendrickx, M; Piarroux, R

2017-09-01

Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry has emerged as a reliable technique to identify molds involved in human diseases, including dermatophytes, provided that exhaustive reference databases are available. This study assessed an online identification application based on original algorithms and an extensive in-house reference database comprising 11,851 spectra (938 fungal species and 246 fungal genera). Validation criteria were established using an initial panel of 422 molds, including dermatophytes, previously identified via DNA sequencing (126 species). The application was further assessed using a separate panel of 501 cultured clinical isolates (88 mold taxa including dermatophytes) derived from five hospital laboratories. A total of 438 (87.35%) isolates were correctly identified at the species level, while 26 (5.22%) were assigned to the correct genus but the wrong species and 37 (7.43%) were not identified, since the defined threshold of 20 was not reached. The use of the Bruker Daltonics database included in the MALDI Biotyper software resulted in a much higher rate of unidentified isolates (39.76 and 74.30% using the score thresholds 1.7 and 2.0, respectively). Moreover, the identification delay of the online application remained compatible with real-time online queries (0.15 s per spectrum), and the application was faster than identifications using the MALDI Biotyper software. This is the first study to assess an online identification system based on MALDI-TOF spectrum analysis. We have successfully applied this approach to identify molds, including dermatophytes, for which diversity is insufficiently represented in commercial databases. This free-access application is available to medical mycologists to improve fungal identification. Copyright © 2017 American Society for Microbiology.
Advances in yeast systematics and phylogeny and their use as predictors of biotechnologically important metabolic pathways

USDA-ARS's Scientific Manuscript database

Detection, identification, and classification of yeasts have undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of t...
Genotyping and interpretation of STR-DNA: Low-template, mixtures and database matches-Twenty years of research and development.

PubMed

Gill, Peter; Haned, Hinda; Bleka, Oyvind; Hansson, Oskar; Dørum, Guro; Egeland, Thore

2015-09-01

The introduction of Short Tandem Repeat (STR) DNA was a revolution within a revolution that transformed forensic DNA profiling into a tool that could be used, for the first time, to create National DNA databases. This transformation would not have been possible without the concurrent development of fluorescent automated sequencers, combined with the ability to multiplex several loci together. Use of the polymerase chain reaction (PCR) increased the sensitivity of the method to enable the analysis of a handful of cells. The first multiplexes were simple: 'the quad', introduced by the defunct UK Forensic Science Service (FSS) in 1994, rapidly followed by a more discriminating 'six-plex' (Second Generation Multiplex) in 1995 that was used to create the world's first national DNA database. The success of the database rapidly outgrew the functionality of the original system - by the year 2000 a new multiplex of ten-loci was introduced to reduce the chance of adventitious matches. The technology was adopted world-wide, albeit with different loci. The political requirement to introduce pan-European databases encouraged standardisation - the development of European Standard Set (ESS) of markers comprising twelve-loci is the latest iteration. Although development has been impressive, the methods used to interpret evidence have lagged behind. For example, the theory to interpret complex DNA profiles (low-level mixtures), had been developed fifteen years ago, but only in the past year or so, are the concepts starting to be widely adopted. A plethora of different models (some commercial and others non-commercial) have appeared. This has led to a confusing 'debate' about the 'best' to use. The different models available are described along with their advantages and disadvantages. A section discusses the development of national DNA databases, along with details of an associated controversy to estimate the strength of evidence of matches. 
Current methodology is limited to searches of complete profiles - another example where the interpretation of matches has not kept pace with development of theory. STRs have also transformed the area of Disaster Victim Identification (DVI) which frequently requires kinship analysis. However, genotyping efficiency is complicated by complex, degraded DNA profiles. Finally, there is now a detailed understanding of the causes of stochastic effects that cause DNA profiles to exhibit the phenomena of drop-out and drop-in, along with artefacts such as stutters. The phenomena discussed include: heterozygote balance; stutter; degradation; the effect of decreasing quantities of DNA; the dilution effect. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Genomics dataset on unclassified published organism (patent US 7547531).

PubMed

Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier

2016-12-01

Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism. DNA sequence analysis is therefore crucial for learning about the hierarchical classification of that organism. This dataset (patent US 7547531) was chosen to simplify the complex raw data buried in undisclosed DNA sequences, which helps to open doors for new collaborations. In this dataset, a total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from the NCBI BioSample database. Quick response (QR) codes of those DNA sequences were constructed with the DNA BarID tool. The QR code is useful for the identification and comparison of isolates with other organisms. AT/GC content of the DNA sequences was determined using the ENDMEMO GC Content Calculator, which indicates their stability at different temperatures. The highest GC content was observed in GP445188 (62.5%), followed by GP445198 (61.8%) and GP445189 (59.44%), while the lowest was in GP445178 (24.39%). In addition, the New England BioLabs (NEB) database was used to identify the cleavage code, indicating the 5′, 3′ and blunt ends, and the enzyme code, indicating the methylation site, of the DNA sequences. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
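The GC content figures in this abstract come from an online calculator. As a rough illustration of the underlying arithmetic, the sketch below computes the percentage of G and C among unambiguous bases. The function name `gc_content` is hypothetical, and skipping ambiguous symbols such as N is an assumption, not necessarily what the cited tool does.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    # ASSUMPTION: symbols other than A, C, G, T (e.g. N) are ignored.
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```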

Rapid identification and classification of bacteria by 16S rDNA restriction fragment melting curve analyses (RFMCA).

PubMed

Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T

2007-08-01

The aim of this work was to evaluate restriction fragment melting curve analyses (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products, and raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. The strains satisfying the RFMCA quality filtering criteria (73%, n=57), with both 16S rDNA sequence information and RFMCA data (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid and accurate classification of bacteria that is database compatible. Potential application of RFMCA in the food or pharmaceutical industry will include development of classification models for the bacteria expected in a given product, and then to build an RFMCA database as a part of the product quality control.
Oral and Craniofacial Clinical Signs Associated to Genetic Conditions in Human Identification Part I: A Review

PubMed Central

Ayoub, Fouad; Aoun, Nicole; el Husseini, Hassan; Jassar, Houssam; Sayah, Fida; Salameh, Ziad

2015-01-01

Background: Forensic dentistry is one of the most reliable methods used in human identification when other techniques, such as fingerprint, DNA, and visual identification, cannot be used. Genetic disorders have several manifestations that can target the intra-oral cavity, the cranio-facial area or any location in the human body. Materials and Methods: A literature search of scientific databases (Medline and Science Direct) for the years 1990 to 2014 was carried out to find all available papers addressing oral and cranio-facial signs, genetics, and human identification. Results: A table of 10 genetic conditions was compiled, with oral and cranio-facial signs that can help forensic specialists in human identification. Conclusion: This review showed a correlation between genetics and facial and intra-oral signs that would help forensic odontologists in identification procedures. PMID:26028912
Cloud-based adaptive exon prediction for DNA analysis

PubMed Central

Putluri, Srinivasareddy; Fathima, Shaik Yasmeen

2018-01-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs can be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using the variable normalised least mean square algorithm and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
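The three-base periodicity mentioned in this abstract can be illustrated with the classic Fourier-based measure, which is a swapped-in textbook technique and not the authors' adaptive (normalised least mean square) predictors. The sketch maps the sequence to four binary indicator sequences and sums their spectral power at a frequency of one cycle per three bases. The function name `period3_power` is hypothetical.

```python
import cmath


def period3_power(sequence: str) -> float:
    """Summed spectral power of the A/C/G/T indicator sequences at frequency 1/3."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of the indicator sequence for this base,
        # evaluated at one cycle per three positions.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * i / 3)
            for i, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total
```

A strongly codon-periodic toy sequence such as "ATG" repeated ten times gives a large value, while a homopolymer gives a value near zero. In practice the measure is usually computed over a sliding window along the sequence.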
Sea cucumber species identification of family Caudinidae from Surabaya based on morphological and mitochondrial DNA evidence

NASA Astrophysics Data System (ADS)

Amin, Muhammad Hilman Fu'adil; Pidada, Ida Bagus Rai; Sugiharto, Widyatmoko, Johan Nuari; Irawan, Bambang

2016-03-01

Species identification and taxonomy of sea cucumbers remain a challenging problem in some taxa. Sea cucumbers of the family Caudinidae are commercialized in Surabaya, where they are used to make sea cucumber chips. Caudinid sea cucumbers have similar morphology, so it is hard to identify them from morphological appearance alone. DNA barcoding is a useful method to overcome this problem. The aim of this study was to identify a Caudinid sea cucumber specimen from East Java by morphological and molecular approaches. The sample was collected from the east coast of Surabaya, then preserved in absolute ethanol. After DNA isolation, the cytochrome oxidase I (COI) gene was amplified using an echinoderm universal primer, and the PCR product was sequenced. The sequencing result was analyzed and identified against the NCBI database using BLAST. Results showed that the Caudinid specimen is closely related to the Acaudina molpadioides sequence in GenBank, with 86% identity. Morphological data, especially those based on ossicles, also showed that the specimen is Acaudina molpadioides.
GHEP-ISFG collaborative simulated exercise for DVI/MPI: Lessons learned about large-scale profile database comparisons.

PubMed

Vullo, Carlos M; Romero, Magdalena; Catelli, Laura; Šakić, Mustafa; Saragoni, Victor G; Jimenez Pleguezuelos, María Jose; Romanini, Carola; Anjos Porto, Maria João; Puente Prieto, Jorge; Bofarull Castro, Alicia; Hernandez, Alexis; Farfán, María José; Prieto, Victoria; Alvarez, David; Penacino, Gustavo; Zabalza, Santiago; Hernández Bolaños, Alejandro; Miguel Manterola, Irati; Prieto, Lourdes; Parsons, Thomas

2016-03-01

The GHEP-ISFG Working Group has recognized the importance of assisting DNA laboratories to gain expertise in handling DVI or missing persons identification (MPI) projects which involve the need for large-scale genetic profile comparisons. Eleven laboratories participated in a DNA matching exercise to identify victims from a hypothetical conflict with 193 missing persons. The post mortem database was comprised of 87 skeletal remain profiles from a secondary mass grave displaying a minimal number of 58 individuals with evidence of commingling. The reference database was represented by 286 family reference profiles with diverse pedigrees. The goal of the exercise was to correctly discover re-associations and family matches. The results of direct matching for commingled remains re-associations were correct and fully concordant among all laboratories. However, the kinship analysis for missing persons identifications showed variable results among the participants. There was a group of laboratories with correct, concordant results but nearly half of the others showed discrepant results exhibiting likelihood ratio differences of several orders of magnitude in some cases. Three main errors were detected: (a) some laboratories did not use the complete reference family genetic data to report the match with the remains, (b) the identity and/or non-identity hypotheses were sometimes wrongly expressed in the likelihood ratio calculations, and (c) many laboratories did not properly evaluate the prior odds for the event. The results suggest that large-scale profile comparisons for DVI or MPI are a challenge for forensic genetics laboratories and the statistical treatment of DNA matching and the Bayesian framework should be better standardized among laboratories. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
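The role of prior odds in this Bayesian framework can be shown with a minimal sketch: posterior odds are the likelihood ratio multiplied by the prior odds. Both function names are hypothetical. Treating each of the N missing persons as equally likely a priori is an illustrative assumption, not the scheme prescribed in the exercise.

```python
def prior_odds_from_count(missing_persons: int) -> float:
    """Prior odds of identity, assuming each of N missing persons is equally likely."""
    # ASSUMPTION: uniform prior probability 1/N, i.e. odds of 1 to (N - 1).
    if missing_persons < 2:
        raise ValueError("need at least two candidate missing persons")
    return 1.0 / (missing_persons - 1)


def posterior_probability(likelihood_ratio: float, prior_odds: float) -> float:
    """Posterior probability of identity from a likelihood ratio and prior odds."""
    if likelihood_ratio < 0 or prior_odds < 0:
        raise ValueError("likelihood ratio and prior odds must be non-negative")
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)
```

With 193 missing persons and the uniform assumption, a likelihood ratio of 192 yields a posterior probability of only about 0.5. This is why neglecting or misjudging the prior odds can change the reported strength of a match.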
Attitudes regarding the national forensic DNA database: Survey data from the general public, prison inmates and prosecutors' offices in the Republic of Serbia.

PubMed

Teodorović, Smilja; Mijović, Dragan; Radovanović Nenadić, Una; Savić, Marina

2017-05-01

Worldwide, the establishment of national forensic DNA databases has transformed personal identification in the criminal justice system over the past two decades. It has also stimulated much debate centering on ethical issues, human rights, individual privacy, lack of safeguards and other standards. Therefore, a balance between effectiveness and intrusiveness of a national DNA repository is an imperative and needs to be achieved through a suitable legal framework. On its path to the European Union (EU), the Republic of Serbia is required to harmonize its national policies and legislation with the EU. Specifically, Chapter 24 of the EU acquis communautaire (Justice, Freedom and Security) stipulates the compulsory creation of a forensic DNA registry and adoption of corresponding legislation. This process is expected to occur in 2016. Thus, in light of launching the national DNA database, the goal of this work is to instigate a consultation with the Serbian public regarding their views on various aspects of the forensic DNA databank. Importantly, this study specifically assessed the opinions of distinct categories of citizens, including the general public, the prosecutors' offices staff, prisoners, prison guards, and students majoring in criminalistics. Our findings set a baseline for Serbian attitudes towards DNA databank custody, DNA sample and profile inclusion and retention criteria, ethical issues and concerns. Furthermore, results clearly demonstrate a permissive outlook of the respondents who are professional "beneficiaries" of genetic profiling and a restrictive position taken by the respondents whose genetic material has been acquired by the government. We believe that this opinion poll will be essential in discussions regarding a national DNA database, as well as in motivating further research on the reasons behind the observed views and subsequent development of educational strategies. 
All of these are, in turn, expected to aid the creation of suitable legislation and to increase societal confidence that the repository will be used in the legal system without interference with individual rights and freedoms. Copyright © 2017 Elsevier B.V. All rights reserved.
Application of DNA Barcodes in Asian Tropical Trees--A Case Study from Xishuangbanna Nature Reserve, Southwest China.

PubMed

Huang, Xiao-cui; Ci, Xiu-qin; Conran, John G; Li, Jie

2015-01-01

Within a regional floristic context, DNA barcoding is more useful to manage plant diversity inventories on a large scale and develop valuable conservation strategies. However, there are no DNA barcode studies from tropical areas of China, which represents one of the biodiversity hotspots around the world. A DNA barcoding database of highly diverse Asian tropical trees was established at Xishuangbanna Nature Reserve, Yunnan, southwest China, using rbcL and matK as standard barcodes, as well as trnH-psbA and ITS as supplementary barcodes. The performance of tree species identification success was assessed using 2,052 accessions from four plots belonging to two vegetation types in the region by three methods: Neighbor-Joining, Maximum-Likelihood and BLAST. We corrected morphological field identification errors (9.6%) for the three plots using rbcL and matK based on a Neighbor-Joining tree. The best barcode region for PCR and sequencing was rbcL (97.6%, 90.8%), followed by trnH-psbA (93.6%, 85.6%), while matK and ITS obtained relatively low PCR and sequencing success rates. However, ITS performed best for both species (44.6-58.1%) and genus (72.8-76.2%) identification, with trnH-psbA slightly less effective for species identification. The two standard barcodes rbcL and matK gave poor results for species identification (24.7-28.5% and 31.6-35.3%). Compared with other studies from comparable tropical forests (e.g. Cameroon, the Amazon and India), the overall performance of the four barcodes for species identification was lower for the Xishuangbanna Nature Reserve, possibly because of differences in species/genus ratios and species composition between these tropical areas. Although the core barcodes rbcL and matK were not suitable for species identification of tropical trees from Xishuangbanna Nature Reserve, they could still help with identification at the family and genus level. Considering the relative sequence recovery and the species identification performance, we recommend the use of trnH-psbA and ITS in combination as the preferred barcodes for tropical tree species identification in China.
DNA reference libraries of French Guianese mosquitoes for barcoding and metabarcoding

PubMed Central

Leroy, Céline; Guidez, Amandine; Dusfour, Isabelle; Girod, Romain; Dejean, Alain; Murienne, Jérôme

2017-01-01

The mosquito family (Diptera: Culicidae) constitutes the most medically important group of arthropods because certain species are vectors of human pathogens. In some parts of the world, the diversity is so high that the accurate delimitation and/or identification of species is challenging. A DNA-based identification system for all animals has been proposed, the so-called DNA barcoding approach. In this study, our objectives were (i) to establish DNA barcode libraries for the mosquitoes of French Guiana based on the COI and the 16S markers, (ii) to compare distance-based and tree-based methods of species delimitation to traditional taxonomy, and (iii) to evaluate the accuracy of each marker in identifying specimens. A total of 266 specimens belonging to 75 morphologically identified species or morphospecies were analyzed allowing us to delimit 86 DNA clusters with only 21 of them already present in the BOLD database. We thus provide a substantial contribution to the global mosquito barcoding initiative. Our results confirm that DNA barcodes can be successfully used to delimit and identify mosquito species with only a few cases where the marker could not distinguish closely related species. Our results also validate the presence of new species identified based on morphology, plus potential cases of cryptic species. We found that both COI and 16S markers performed very well, with successful identifications at the species level of up to 98% for COI and 97% for 16S when compared to traditional taxonomy. This shows great potential for the use of metabarcoding for vector monitoring and eco-epidemiological studies. PMID:28575090
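Distance-based delimitation, one of the two approaches compared in this abstract, can be sketched as pairwise p-distances followed by single-linkage grouping under a distance threshold. This is a simplified stand-in rather than the authors' method. The function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and any threshold passed in is an illustrative assumption, since the abstract does not state one.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def cluster_by_threshold(seqs: dict, threshold: float) -> list:
    """Single-linkage clusters: link specimens whose p-distance <= threshold."""
    names = list(seqs)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if p_distance(seqs[first], seqs[second]) <= threshold:
                parent[find(first)] = find(second)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

The resulting clusters can then be compared with morphological species assignments, which is the kind of comparison the abstract describes.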
Disaster victim investigation recommendations from two simulated mass disaster scenarios utilized for user acceptance testing CODIS 6.0.

PubMed

Bradford, Laurie; Heal, Jennifer; Anderson, Jeff; Faragher, Nichole; Duval, Kristin; Lalonde, Sylvain

2011-08-01

Members of the National DNA Data Bank (NDDB) of Canada designed and searched two simulated mass disaster (MD) scenarios for User Acceptance Testing (UAT) of the Combined DNA Index System (CODIS) 6.0, developed by the Federal Bureau of Investigation (FBI) and the US Department of Justice. A simulated airplane MD and inland Tsunami MD were designed representing a closed and open environment respectively. An in-house software program was written to randomly generate DNA profiles from a mock Caucasian population database. As part of the UAT, these two MDs were searched separately using CODIS 6.0. The new options available for identity and pedigree searching in addition to the inclusion of mitochondrial DNA (mtDNA) and Y-STR (short tandem repeat) information in CODIS 6.0, led to rapid identification of all victims. A Joint Pedigree Likelihood Ratio (JPLR) was calculated from the pedigree searches and ranks were stored in Rank Manager providing confidence to the user in assigning an Unidentified Human Remain (UHR) to a pedigree tree. Analyses of the results indicated that primary relatives were more useful in Disaster Victim Identification (DVI) compared to secondary or tertiary relatives and that inclusion of mtDNA and/or Y-STR technologies helped to link family units together as shown by the software searches. It is recommended that UHRs have as many informative loci possible to assist with their identification. CODIS 6.0 is a valuable technological tool for rapidly and confidently identifying victims of mass disasters. Crown Copyright © 2010. Published by Elsevier Ireland Ltd. All rights reserved.
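The abstract mentions an in-house program that randomly generated DNA profiles from a mock population database. That program is not described, so the following is only a sketch of the general idea: two alleles are drawn per locus in proportion to allele frequencies. The frequency values in `MOCK_FREQUENCIES` are invented for illustration, and the function name `random_profile` is hypothetical.

```python
import random

# ASSUMPTION: toy allele frequencies invented for illustration only;
# they are not taken from any real population database.
MOCK_FREQUENCIES = {
    "D3S1358": {"14": 0.10, "15": 0.30, "16": 0.25, "17": 0.20, "18": 0.15},
    "TH01": {"6": 0.25, "7": 0.20, "9": 0.20, "9.3": 0.35},
}


def random_profile(frequencies: dict, rng: random.Random) -> dict:
    """Draw a genotype (two alleles) per locus, weighted by allele frequency."""
    profile = {}
    for locus, alleles in frequencies.items():
        names = list(alleles)
        weights = [alleles[name] for name in names]
        # Two independent draws with replacement, so homozygotes can occur.
        pair = rng.choices(names, weights=weights, k=2)
        profile[locus] = tuple(sorted(pair, key=float))
    return profile
```

Passing a seeded `random.Random` instance makes a simulated scenario reproducible, which is convenient when the same mock victims and relatives must be searched repeatedly.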
Application of MALDI-TOF MS Systems in the Rapid Identification of Campylobacter spp. of Public Health Importance.

PubMed

Hsieh, Ying-Hsin; Wang, Yun F; Moura, Hercules; Miranda, Nancy; Simpson, Steven; Gowrishankar, Ramnath; Barr, John; Kerdahi, Khalil; Sulaiman, Irshad M

2018-05-01

Campylobacteriosis is an infectious gastrointestinal disease caused by Campylobacter spp. In most cases, it is either underdiagnosed or underreported due to poor diagnostics and limited databases. Several DNA-based molecular diagnostic techniques, including 16S ribosomal RNA (rRNA) sequence typing, have been widely used in the species identification of Campylobacter. Nevertheless, these assays are time-consuming and require a high quality of bacterial DNA. Matrix-assisted laser desorption ionization (MALDI) time-of-flight (TOF) MS is an emerging diagnostic technology that can provide the rapid identification of microorganisms by using their intact cells without extraction or purification. In this study, we analyzed 24 American Type Culture Collection reference isolates of 16 Campylobacter spp. and five unknown clinical bacterial isolates for rapid identification utilizing two commercially available MALDI-TOF MS platforms, namely the bioMérieux VITEK® MS and Bruker Biotyper systems. In addition, 16S rRNA sequencing was performed to confirm the species-level identification of the unknown clinical isolates. Both MALDI-TOF MS systems identified the isolates of C. jejuni, C. coli, C. lari, and C. fetus. The results of this study suggest that the MALDI-TOF MS technique can be used in the identification of Campylobacter spp. of public health importance.
Comparison of sequencing the D2 region of the large subunit ribosomal RNA gene (MicroSEQ®) versus the internal transcribed spacer (ITS) regions using two public databases for identification of common and uncommon clinically relevant fungal species.

PubMed

Arbefeville, S; Harris, A; Ferrieri, P

2017-09-01

Fungal infections cause considerable morbidity and mortality in immunocompromised patients. Rapid and accurate identification of fungi is essential to guide accurately targeted antifungal therapy. With the advent of molecular methods, clinical laboratories can use new technologies to supplement traditional phenotypic identification of fungi. The aims of the study were to evaluate the sole commercially available MicroSEQ® D2 LSU rDNA Fungal Identification Kit compared to the in-house developed internal transcribed spacer (ITS) regions assay in identifying moulds, using two well-known online public databases to analyze sequenced data. Eighty-five common and uncommon clinically relevant fungi isolated from clinical specimens were sequenced for the D2 region of the large subunit (LSU) of the ribosomal RNA (rRNA) gene with the MicroSEQ® Kit and for the ITS regions with the in-house developed assay. The generated sequence data were analyzed with the online GenBank and MycoBank public databases. The D2 region of the LSU rRNA gene identified 89.4% or 92.9% of the 85 isolates to the genus level and the full ITS region (f-ITS) 96.5% or 100%, using GenBank or MycoBank, respectively, when compared to the consensus ID. When comparing species-level designations to the consensus ID, the D2 region of the LSU rRNA gene aligned with 44.7% (38/85) or 52.9% (45/85) of these isolates in GenBank or MycoBank, respectively. By comparison, f-ITS possessed greater specificity, followed by ITS1, then ITS2 regions using GenBank or MycoBank. Using GenBank or MycoBank, the D2 region of the LSU rRNA gene outperformed phenotypic based ID at the genus level. Comparing rates of ID between the D2 region of the LSU rRNA gene and the ITS regions in GenBank or MycoBank at the species level against the consensus ID, f-ITS and ITS2 exceeded the performance of the D2 region of the LSU rRNA gene, but ITS1 had similar performance to the D2 region of the LSU rRNA gene using MycoBank. Our results indicated that the MicroSEQ® D2 LSU rDNA Fungal Identification Kit was equivalent to the in-house developed ITS regions assay for identifying fungi at the genus level. The MycoBank database was better curated and thus allowed better genus and species identification for both the D2 region of the LSU rRNA gene and the ITS regions. Copyright © 2017 Elsevier B.V. All rights reserved.
European securitization and biometric identification: the uses of genetic profiling.

PubMed

Johnson, Paul; Williams, Robin

2007-01-01

The recent loss of confidence in textual and verbal methods for validating the identity claims of individual subjects has resulted in growing interest in the use of biometric technologies to establish corporeal uniqueness. Once established, this foundational certainty allows changing biographies and shifting category memberships to be anchored to unchanging bodily surfaces, forms or features. One significant source for this growth has been the "securitization" agendas of nation states that attempt the greater control and monitoring of population movement across geographical borders. Among the wide variety of available biometric schemes, DNA profiling is regarded as a key method for discerning and recording embodied individuality. This paper discusses the current limitations on the use of DNA profiling in civil identification practices and speculates on future uses of the technology with regard to its interoperability with other biometric databasing systems.
[A new herbs traceability method based on DNA barcoding-origin-morphology analysis--an example from an adulterant of 'Heiguogouqi'].

PubMed

Gu, Xuan; Zhang, Xiao-qin; Song, Xiao-na; Zang, Yi-mei; Li Yan-peng; Ma, Chang-hua; Zhao, Bai-xiao; Liu, Chun-sheng

2014-12-01

The fruit of Lycium ruthenicum is a common folk medicine in China, now popular for its antioxidative effect and other medicinal functions. Adulterants of the herb confuse consumers. To identify a new adulterant of L. ruthenicum, a study was performed that combined ITS sequence comparison against the NCBI Nucleotide database with analysis of the adulterant's origin and morphology in order to trace it to a specific variety. Total genomic DNA was isolated from the materials, and nuclear DNA ITS sequences were amplified and sequenced; DNA fragments were assembled and matched using ContigExpress. Similarity was assessed by BLAST analysis. In addition, the geographic distribution and morphology of candidate plants were considered for further identification and verification. Family and genus were identified by the molecular method: the adulterant was identified as a plant belonging to Berberis. Origin analysis narrowed the range of candidates to seven Berberis species as potential sources of the sample, and morphological analysis then traced the adulterant to a single species. The combined molecular identification-origin-morphology approach proved to be a time-saving and economical way to trace medicinal herbs, and the results showed that the new adulterant of L. ruthenicum was B. kaschgarica. The main differences between B. kaschgarica and L. ruthenicum are as follows. In terms of macroscopic traits, the surface of B. kaschgarica is smooth and crisp, whereas that of L. ruthenicum is shrunken, solid, and hard. In microscopic characteristics, the epicarp cells of B. kaschgarica are thickened like a string of beads and its stone cells are rectangular, whereas the stone cell walls of L. ruthenicum are wavy with an obvious grain layer. In molecular sequences, the ITS sequence of B. kaschgarica is 606 bp long and that of L. ruthenicum is 654 bp, and the similarity of the two sequences is 53.32%.
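The 53.32% similarity figure above is the kind of number a pairwise alignment yields. The abstract does not say which tool or denominator produced it, so the following is only a minimal sketch of one common convention (identical columns divided by alignment length, gaps counted as mismatches). The function name and the toy sequences are hypothetical and are not real ITS data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Assumption: the two strings are already aligned (equal length, '-' for gaps)
    and gap columns count as mismatches. Other tools use other denominators.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment with made-up bases: 8 of 10 columns are identical.
toy_a = "ACGT-ACGTA"
toy_b = "ACGTTACCTA"
print(percent_identity(toy_a, toy_b))
```

Because the two ITS sequences differ in length (606 bp versus 654 bp), the reported value necessarily depends on how gaps were scored, which is why the gap convention is stated explicitly in the sketch.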
A single mini-barcode test to screen for Australian mammalian predators from environmental samples

PubMed Central

MacDonald, Anna J; Sarre, Stephen D

2017-01-01

Identification of species from trace samples is now possible through the comparison of diagnostic DNA fragments against reference DNA sequence databases. DNA detection of animals from non-invasive samples, such as predator faeces (scats) that contain traces of DNA from their species of origin, has proved to be a valuable tool for the management of elusive wildlife. However, application of this approach can be limited by the availability of appropriate genetic markers. Scat DNA is often degraded, meaning that longer DNA sequences, including standard DNA barcoding markers, are difficult to recover. Instead, targeted short diagnostic markers are required to serve as diagnostic mini-barcodes. The mitochondrial genome is a useful source of such trace DNA markers because it provides good resolution at the species level and occurs in high copy numbers per cell. We developed a mini-barcode based on a short (178 bp) fragment of the conserved 12S ribosomal ribonucleic acid mitochondrial gene sequence, with the goal of discriminating amongst the scats of large mammalian predators of Australia. We tested the sensitivity and specificity of our primers and can accurately detect and discriminate amongst quolls, cats, dogs, foxes, and devils from trace DNA samples. Our approach provides a cost-effective, time-efficient, and non-invasive tool that enables identification of all 8 medium-large mammal predators in Australia, including native and introduced species, using a single test. With modification, this approach is likely to be of broad applicability elsewhere. PMID:28810700
The future of forensic DNA analysis.

PubMed

Butler, John M

2015-08-05

The author's thoughts and opinions on where the field of forensic DNA testing is headed for the next decade are provided in the context of where the field has come over the past 30 years. Similar to the Olympic motto of 'faster, higher, stronger', forensic DNA protocols can be expected to become more rapid and sensitive and provide stronger investigative potential. New short tandem repeat (STR) loci have expanded the core set of genetic markers used for human identification in Europe and the USA. Rapid DNA testing is on the verge of enabling new applications. Next-generation sequencing has the potential to provide greater depth of coverage for information on STR alleles. Familial DNA searching has expanded capabilities of DNA databases in parts of the world where it is allowed. Challenges and opportunities that will impact the future of forensic DNA are explored including the need for education and training to improve interpretation of complex DNA profiles. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
MALDI-TOF mass spectrometry provides high accuracy in identification of Salmonella at species level but is limited to type or subtype Salmonella serovars.

PubMed

Kang, Lin; Li, Nan; Li, Ping; Zhou, Yang; Gao, Shan; Gao, Hongwei; Xin, Wenwen; Wang, Jinglin

2017-04-01

Salmonella can cause global foodborne illnesses in humans and many animals. The current diagnostic gold standard used for detecting Salmonella infection is microbiological culture followed by serological confirmation tests. However, these methods are complicated and time-consuming. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis offers some advantages in rapid identification, for example, simple and fast sample preparation, fast and automated measurement, and robust and reliable identification up to genus and species levels, possibly even to the strain level. In this study, we established a reference database for species identification using whole-cell MALDI-TOF MS; the database consisted of 12 main spectra obtained from Salmonella culture collection strains belonging to seven serotypes. Eighty-two clinical isolates of Salmonella were identified using the established database, with partial 16S rDNA gene sequencing and a serological method used for comparison. We found that MALDI-TOF mass spectrometry provided high accuracy in identification of Salmonella at species level but was limited in its ability to type or subtype Salmonella serovars. We also tried to find serovar-specific biomarkers but failed. Our study demonstrated that (a) MALDI-TOF MS was suitable for identification of Salmonella at species level with high accuracy, (b) the MALDI-TOF MS method presented in this study was not currently useful for serovar assignment of Salmonella because of its low agreement with the serological method, and (c) the MALDI-TOF MS method presented in this study was not suitable for subtyping S. typhimurium because of its low discriminatory ability.
Fast and Sensitive Alignment of Microbial Whole Genome Sequencing Reads to Large Sequence Datasets on a Desktop PC: Application to Metagenomic Datasets and Pathogen Identification

PubMed Central

2014-01-01

Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance between speed and sensitivity and as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source, taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner. PMID:25077800
Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.

PubMed

Pongor, Lőrinc S; Vera, Roberto; Ligeti, Balázs

2014-01-01

Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance between speed and sensitivity and as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source, taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.
Internet-accessible DNA sequence database for identifying fusaria from human and animal infections.

PubMed

O'Donnell, Kerry; Sutton, Deanna A; Rinaldi, Michael G; Sarver, Brice A J; Balajee, S Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C; Robert, Vincent A R G; Crous, Pedro W; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M

2010-10-01

Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. 
The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the sequence results can be verified and isolates are made available for future study.
DNA barcoding reveals high levels of genetic diversity in the fishes of the Itapecuru Basin in Maranhão, Brazil.

PubMed

Nascimento, M H S; Almeida, M S; Veira, M N S; Limeira Filho, D; Lima, R C; Barros, M C; Fraga, E C

2016-08-29

DNA barcoding is a useful complementary tool for use in traditional taxonomic studies due to its ability to detect cryptic species, and may be particularly efficient in the identification of fish species. The fish fauna of the Itapecuru River represents an important fishery resource in the Brazilian State of Maranhão, although it is currently suffering increasing degradation as a result of anthropogenic impacts. Therefore, DNA barcoding was used in the present study to identify fish species and establish a database of the rich freshwater fish fauna of Maranhão. A total of 440 specimens were analyzed, corresponding to 64 species belonging to 59 genera, 31 families, and 10 orders. Overall, 92.19% of these species could be identified by DNA barcoding, and were characterized by low levels (average 0.80%) of intra-specific divergence. However, five species (Anableps anableps, Gymnotus carapo, Sciades couma, Pseudauchenipterus nodosus, and Leporinus piau) presented values of mean genetic divergence above 3%, indicating the existence of cryptic diversity in these fishes. The DNA barcoding approach permitted the analysis of a large number of specimens and facilitated the discrimination and identification of closely related fish species in the Itapecuru Basin.
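The 3% divergence criterion mentioned above can be illustrated with a small calculation. The abstract does not state which distance model was used, so this sketch assumes the simple uncorrected p-distance (proportion of differing sites) over pre-aligned sequences. The species labels and toy sequences are hypothetical placeholders, far shorter than real barcodes.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping columns where either sequence has a gap."""
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_intraspecific_divergence(seqs: list[str]) -> float:
    """Mean pairwise p-distance among the sequences of one species."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0  # a single specimen has no within-species comparison
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def flag_possible_cryptic_diversity(
    barcodes_by_species: dict[str, list[str]], threshold: float = 0.03
) -> list[str]:
    """Names of species whose mean within-species divergence exceeds the threshold."""
    return sorted(
        name
        for name, seqs in barcodes_by_species.items()
        if mean_intraspecific_divergence(seqs) > threshold
    )


# Hypothetical toy data: species_x has one divergent specimen, species_y has none.
toy = {
    "species_x": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"],
    "species_y": ["ACGTACGTAC", "ACGTACGTAC"],
}
print(flag_possible_cryptic_diversity(toy))
```

A species flagged this way is only a candidate for closer taxonomic study; a high mean divergence can also arise from misidentified specimens or sequencing errors.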

Identification of wood-boring beetles (Cerambycidae and Buprestidae) intercepted in trade-associated solid wood packaging material using DNA barcoding and morphology.

PubMed

Wu, Yunke; Trepanowski, Nevada F; Molongoski, John J; Reagel, Peter F; Lingafelter, Steven W; Nadel, Hannah; Myers, Scott W; Ray, Ann M

2017-01-16

Global trade facilitates the inadvertent movement of insect pests and subsequent establishment of populations outside their native ranges. Despite phytosanitary measures, nonnative insects arrive at United States (U.S.) ports of entry as larvae in solid wood packaging material (SWPM). Identification of wood-boring larval insects is important for pest risk analysis and management, but is difficult beyond family level due to highly conserved morphology. Therefore, we integrated DNA barcoding and rearing of larvae to identify wood-boring insects in SWPM. From 2012 to 2015, we obtained larvae of 338 longhorned beetles (Cerambycidae) and 38 metallic wood boring beetles (Buprestidae) intercepted in SWPM associated with imported products at six U.S. ports. We identified 265 specimens to species or genus using DNA barcodes. Ninety-three larvae were reared to adults and identified morphologically. No conflict was found between the two approaches, which together identified 275 cerambycids (23 genera) and 16 buprestids (4 genera). Our integrated approach confirmed novel DNA barcodes for seven species (10 specimens) of woodborers not in public databases. This study demonstrates the utility of DNA barcoding as a tool for regulatory agencies. We provide important documentation of potential beetle pests that may cross country borders through the SWPM pathway.
Identification of wood-boring beetles (Cerambycidae and Buprestidae) intercepted in trade-associated solid wood packaging material using DNA barcoding and morphology

PubMed Central

Wu, Yunke; Trepanowski, Nevada F.; Molongoski, John J.; Reagel, Peter F.; Lingafelter, Steven W.; Nadel, Hannah; Myers, Scott W.; Ray, Ann M.

2017-01-01

Global trade facilitates the inadvertent movement of insect pests and subsequent establishment of populations outside their native ranges. Despite phytosanitary measures, nonnative insects arrive at United States (U.S.) ports of entry as larvae in solid wood packaging material (SWPM). Identification of wood-boring larval insects is important for pest risk analysis and management, but is difficult beyond family level due to highly conserved morphology. Therefore, we integrated DNA barcoding and rearing of larvae to identify wood-boring insects in SWPM. From 2012 to 2015, we obtained larvae of 338 longhorned beetles (Cerambycidae) and 38 metallic wood boring beetles (Buprestidae) intercepted in SWPM associated with imported products at six U.S. ports. We identified 265 specimens to species or genus using DNA barcodes. Ninety-three larvae were reared to adults and identified morphologically. No conflict was found between the two approaches, which together identified 275 cerambycids (23 genera) and 16 buprestids (4 genera). Our integrated approach confirmed novel DNA barcodes for seven species (10 specimens) of woodborers not in public databases. This study demonstrates the utility of DNA barcoding as a tool for regulatory agencies. We provide important documentation of potential beetle pests that may cross country borders through the SWPM pathway. PMID:28091577
Nuclear Magnetic Resonance Spectroscopy-Based Identification of Yeast.

PubMed

Himmelreich, Uwe; Sorrell, Tania C; Daniel, Heide-Marie

2017-01-01

Rapid and robust high-throughput identification of environmental, industrial, or clinical yeast isolates is important whenever relatively large numbers of samples need to be processed in a cost-efficient way. Nuclear magnetic resonance (NMR) spectroscopy generates complex data based on metabolite profiles, chemical composition and possibly on medium consumption, which can be used not only for the assessment of metabolic pathways but also for accurate identification of yeast down to the subspecies level. Initial results on NMR-based yeast identification were comparable with conventional and DNA-based identification. Potential advantages of NMR spectroscopy in mycological laboratories include not only accurate identification but also the potential of automated sample delivery, automated analysis using computer-based methods, rapid turnaround time, high throughput, and low running costs. We describe here the sample preparation, data acquisition and analysis for NMR-based yeast identification. In addition, a roadmap for the development of classification strategies is given that will result in the acquisition of a database and analysis algorithms for yeast identification in different environments.
Amelogenin test: From forensics to quality control in clinical and biochemical genomics.

PubMed

Francès, F; Portolés, O; González, J I; Coltell, O; Verdú, F; Castelló, A; Corella, D

2007-01-01

The increasing number of samples in biomedical genetic studies, and of centers participating in them, increases the risk of mistakes at the different sample handling stages. We have evaluated the usefulness of the amelogenin test for quality control in sample identification. The amelogenin test (frequently used in forensics) was performed on 1224 individuals participating in a biomedical study. Concordance between the sex recorded in the database and the amelogenin test result was estimated. Additional genetic systems for detecting sex errors were developed. The overall concordance rate was 99.84% (1222/1224). Two samples showed a female amelogenin test outcome but were coded as male in the database. The first, after checking sex-specific biochemical and clinical profile data, was found to be due to a coding error in the database. In the second, checking the database revealed no apparent error because a correct male profile was found. False negatives in amelogenin male sex determination were ruled out by additional tests, and female sex was confirmed. A sample labeling error was revealed after a new DNA extraction. The amelogenin test is a useful quality control tool for detecting sex-identification errors in large genomic studies and can help increase their validity.
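The concordance check described above amounts to comparing two columns of sex labels and listing the disagreements. The sketch below reproduces the reported 1222/1224 figure under stated assumptions: the function name is hypothetical, and the even male/female split of the reconstructed cohort is invented for illustration, since the abstract gives only the totals.

```python
def concordance_rate(recorded: list[str], observed: list[str]) -> tuple[float, list[int]]:
    """Return the percent agreement and the indices of discordant samples."""
    if len(recorded) != len(observed):
        raise ValueError("both lists must describe the same samples")
    if not recorded:
        raise ValueError("no samples to compare")
    mismatches = [i for i, (r, o) in enumerate(zip(recorded, observed)) if r != o]
    rate = 100.0 * (len(recorded) - len(mismatches)) / len(recorded)
    return rate, mismatches


# Reconstructed cohort: 1224 samples, two recorded as male but typed as female.
# The 612/612 split is an assumption; only the totals come from the abstract.
recorded_sex = ["M"] * 612 + ["F"] * 612
amelogenin_sex = list(recorded_sex)
amelogenin_sex[0] = "F"
amelogenin_sex[1] = "F"

rate, discordant = concordance_rate(recorded_sex, amelogenin_sex)
print(round(rate, 2), discordant)
```

Returning the discordant indices alongside the rate matters in practice, because each mismatch has to be followed up individually, as the two cases in the study were.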
Evaluation of the Biotyper MALDI-TOF MS system for identification of Staphylococcus species.

PubMed

Zhu, Wenming; Sieradzki, Krzysztof; Albrecht, Valerie; McAllister, Sigrid; Lin, Wen; Stuchlik, Olga; Limbago, Brandi; Pohl, Jan; Kamile Rasheed, J

2015-10-01

The Bruker Biotyper MALDI-TOF MS (Biotyper) system, with a modified 30 minute formic acid extraction method, was evaluated for its ability to identify 216 clinical Staphylococcus isolates from the CDC reference collection comprising 23 species previously identified by conventional biochemical tests. 16S rDNA sequence analysis was used to resolve discrepancies. Of these, 209 (96.8%) isolates were correctly identified: 177 (84.7%) isolates had scores ≥2.0, while 32 (15.3%) had scores between 1.70 and 1.99. The Biotyper identification was inconsistent with the biochemical identification for seven (3.2%) isolates, but the Biotyper identifications were confirmed by 16S rDNA analysis. The distribution of low scores was strongly species-dependent, e.g. only 5% of Staphylococcus epidermidis and 4.8% of Staphylococcus aureus isolates scored below 2.0, while 100% of Staphylococcus cohnii, 75% of Staphylococcus sciuri, and 60% of Staphylococcus caprae produced low but accurate Biotyper scores. Our results demonstrate that the Biotyper can reliably identify Staphylococcus species with greater accuracy than conventional biochemicals. Broadening of the reference database by inclusion of additional examples of under-represented species could further optimize Biotyper results. Published by Elsevier B.V.
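The percentages in this abstract use two different denominators, which a short calculation makes explicit: 96.8% and 3.2% are fractions of all 216 isolates, while 84.7% and 15.3% are fractions of the 209 consistently identified isolates. The sketch below also bins scores using the two ranges quoted in the abstract. The band labels and function names are hypothetical, and treating scores between 1.99 and 2.0 as "low" is an assumption.

```python
def score_band(score: float) -> str:
    """Bin a Biotyper-style score using the two ranges quoted in the abstract."""
    if score >= 2.0:
        return "high"
    if score >= 1.70:
        return "low"
    return "below_range"  # label is an assumption; the abstract reports no such scores


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100.0 * part / whole, 1)


# Counts taken directly from the abstract.
total_isolates = 216
consistent = 209
high_score = 177
low_score = 32

print(pct(consistent, total_isolates))                   # share of all isolates
print(pct(high_score, consistent))                       # share of the 209
print(pct(low_score, consistent))                        # share of the 209
print(pct(total_isolates - consistent, total_isolates))  # discrepant share of all isolates
```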
Identification of DNA Methyltransferase Genes in Human Pathogenic Bacteria by Comparative Genomics.

PubMed

Brambila-Tapia, Aniel Jessica Leticia; Poot-Hernández, Augusto Cesar; Perez-Rueda, Ernesto; Rodríguez-Vázquez, Katya

2016-06-01

DNA methylation plays an important role in gene expression and virulence in some pathogenic bacteria. In this report, we describe DNA methyltransferases (MTases) present in human pathogenic bacteria and compare them with related species that are not pathogenic or are less pathogenic, based on comparative genomics. We searched the KEGG database for the orthology groups associated with adenine and cytosine DNA MTase activities (EC: 2.1.1.37, EC: 2.1.1.113 and EC: 2.1.1.72) in 37 human pathogenic species and 18 non/less pathogenic relatives, and compared the number of these MTase sequences according to genome size, DNA MTase type, and against their non/less pathogenic relatives. We observed that Helicobacter pylori and Neisseria spp. presented the highest number of MTases, while ten different species did not present a predicted DNA MTase. We also detected significantly more adenine MTases than cytosine MTases (2.19 vs. 1.06, respectively, p < 0.001). Adenine MTases were the only MTases associated with restriction modification systems, and DNA MTases associated with type I restriction modification systems were more numerous than those associated with type III restriction modification systems (0.84 vs. 0.17, p < 0.001); additionally, there was no correlation between genome size and the total number of DNA MTases, indicating that the number of DNA MTases is related to the particular evolution and lifestyle of specific species, regulating the expression of virulence genes in some pathogenic bacteria.
Novel Antigen Identification Method for Discovery of Protective Malaria Antigens by Rapid Testing of DNA Vaccines Encoding Exons from the Parasite Genome

PubMed Central

Haddad, Diana; Bilcikova, Erika; Witney, Adam A.; Carlton, Jane M.; White, Charles E.; Blair, Peter L.; Chattopadhyay, Rana; Russell, Joshua; Abot, Esteban; Charoenvit, Yupin; Aguiar, Joao C.; Carucci, Daniel J.; Weiss, Walter R.

2004-01-01

We describe a novel approach for identifying target antigens for preerythrocytic malaria vaccines. Our strategy is to rapidly test hundreds of DNA vaccines encoding exons from the Plasmodium yoelii yoelii genomic sequence. In this antigen identification method, we measure reduction in parasite burden in the liver after sporozoite challenge in mice. Orthologs of protective P. y. yoelii genes can then be identified in the genomic databases of Plasmodium falciparum and Plasmodium vivax and investigated as candidate antigens for a human vaccine. A pilot study to develop the antigen identification method approach used 192 P. y. yoelii exons from genes expressed during the sporozoite stage of the life cycle. A total of 182 (94%) exons were successfully cloned into a DNA immunization vector with the Gateway cloning technology. To assess immunization strategies, mice were vaccinated with 19 of the new DNA plasmids in addition to the well-characterized protective plasmid encoding P. y. yoelii circumsporozoite protein. Single plasmid immunization by gene gun identified a novel vaccine target antigen which decreased liver parasite burden by 95% and which has orthologs in P. vivax and P. knowlesi but not P. falciparum. Intramuscular injection of DNA plasmids produced a different pattern of protective responses from those seen with gene gun immunization. Intramuscular immunization with plasmid pools could reduce liver parasite burden in mice despite the fact that none of the plasmids was protective when given individually. We conclude that high-throughput cloning of exons into DNA vaccines and their screening is feasible and can rapidly identify new malaria vaccine candidate antigens. PMID:14977966
DNA Microarray for Rapid Detection and Identification of Food and Water Borne Bacteria: From Dry to Wet Lab.

PubMed

Ranjbar, Reza; Behzadi, Payam; Najafi, Ali; Roudi, Raheleh

2017-01-01

A rapid, accurate, flexible and reliable diagnostic method may significantly decrease the costs of diagnosis and treatment. Designing an appropriate microarray chip reduces noise and probable biases in the final result. The aim of this study was to design and construct a DNA microarray chip for the rapid detection and identification of 10 important bacterial agents. In the present survey, 10 unique genomic regions relating to 10 pathogenic bacterial agents, including Escherichia coli (E.coli), Shigella boydii, Sh.dysenteriae, Sh.flexneri, Sh.sonnei, Salmonella typhi, S.typhimurium, Brucella sp., Legionella pneumophila, and Vibrio cholerae, were selected for designing specific long oligo microarray probes. For this purpose, the in silico work was carried out using the NCBI RefSeq database, the PanSeq and Gview servers, AlleleID 7.7 and Oligo Analyzer 3.1. The in vitro part of the study comprised robotic microarray chip probe spotting, bacterial DNA extraction and labeling, hybridization and microarray chip scanning. In the wet lab section, different tools and apparatus such as Nexterion® Slide E, Qarray mini spotter, NimbleGen kit, TrayMix TM S4, and Innoscan 710 were used. A DNA microarray chip including 10 long oligo microarray probes was designed and constructed for detection and identification of 10 pathogenic bacteria. The DNA microarray chip was able to identify all 10 bacterial agents tested simultaneously. A professional bioinformatician is needed as probe designer to design appropriate multifunctional microarray probes and increase the accuracy of the outcomes.
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity

PubMed Central

Regis, David P.; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L.; Stefaniak, Maureen E.; Campo, Joseph J.; Carucci, Daniel J.; Roth, David A.; He, Huaping; Felgner, Philip L.; Doolan, Denise L.

2009-01-01

We have evaluated a technology called Transcriptionally Active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data. PMID:18164079
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity.

PubMed

Regis, David P; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L; Stefaniak, Maureen E; Campo, Joseph J; Carucci, Daniel J; Roth, David A; He, Huaping; Felgner, Philip L; Doolan, Denise L

2008-03-01

We have evaluated a technology called transcriptionally active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data.
Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease

PubMed Central

Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao

2018-01-01

Abstract Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. PMID:29069510
In silico analysis of 16S ribosomal RNA gene sequencing‐based methods for identification of medically important anaerobic bacteria

PubMed Central

Woo, Patrick C Y; Chung, Liliane M W; Teng, Jade L L; Tse, Herman; Pang, Sherby S Y; Lau, Veronica Y T; Wong, Vanessa W K; Kam, Kwok‐ling; Lau, Susanna K P; Yuen, Kwok‐Yung

2007-01-01

This is the first study to provide guidelines for clinical microbiologists and technicians on the usefulness of full 16S rRNA sequencing, 5′‐end 527‐bp 16S rRNA sequencing and the existing MicroSeq full and 500 16S rDNA bacterial identification system (MicroSeq, Perkin‐Elmer Applied Biosystems Division, Foster City, California, USA) databases for the identification of all existing medically important anaerobic bacteria. Full and 527‐bp 16S rRNA sequencing are able to identify 52–63% of 130 Gram‐positive anaerobic rods, 72–73% of 86 Gram‐negative anaerobic rods and 78% of 23 anaerobic cocci. The existing MicroSeq databases are able to identify only 19–25% of 130 Gram‐positive anaerobic rods, 38% of 86 Gram‐negative anaerobic rods and 39% of 23 anaerobic cocci. These represent only 45–46% of those that should be confidently identified by full and 527‐bp 16S rRNA sequencing. To improve the usefulness of MicroSeq, bacterial species that should be confidently identified by full and/or 527‐bp 16S rRNA sequencing but are not included in the existing MicroSeq databases should be added. PMID:17046845
Identification of Disulphide Stress-responsive Extracytoplasmic Function Sigma Factors in Rothia mucilaginosa

DTIC Science & Technology

2013-01-01

GC) content of deoxyribonucleic acid (DNA).1 This organism belongs to the actinobacteria, which includes the genera Actinomyces, Corynebacterium...however, it still remains unclear how Rothia species respond to environmental stress. The responsiveness and adaptation of a few actinobacteria to various... actinobacteria, the number of sigma factors in R. mucilaginosa is relatively small (microbial signal transduction (MiST2) database, http://mistdb.com/); however
Application of DNA Barcodes in Asian Tropical Trees – A Case Study from Xishuangbanna Nature Reserve, Southwest China

PubMed Central

Conran, John G.; Li, Jie

2015-01-01

Background Within a regional floristic context, DNA barcoding is more useful to manage plant diversity inventories on a large scale and develop valuable conservation strategies. However, there are no DNA barcode studies from tropical areas of China, which represents one of the biodiversity hotspots around the world. Methodology and Principal Findings A DNA barcoding database of highly diverse Asian tropical trees was established at Xishuangbanna Nature Reserve, Yunnan, southwest China using rbcL and matK as standard barcodes, as well as trnH–psbA and ITS as supplementary barcodes. The performance of tree species identification success was assessed using 2,052 accessions from four plots belonging to two vegetation types in the region by three methods: Neighbor-Joining, Maximum-Likelihood and BLAST. We corrected morphological field identification errors (9.6%) for the three plots using rbcL and matK based on the Neighbor-Joining tree. The best barcode region for PCR and sequencing was rbcL (97.6%, 90.8%), followed by trnH–psbA (93.6%, 85.6%), while matK and ITS obtained relatively low PCR and sequencing success rates. However, ITS performed best for both species (44.6–58.1%) and genus (72.8–76.2%) identification, with trnH–psbA slightly less effective for species identification. The two standard barcodes, rbcL and matK, gave poor results for species identification (24.7–28.5% and 31.6–35.3%). Compared with other studies from comparable tropical forests (e.g. Cameroon, the Amazon and India), the overall performance of the four barcodes for species identification was lower for the Xishuangbanna Nature Reserve, possibly because of differences in species/genus ratios and species composition between these tropical areas. Conclusions/Significance Although the core barcodes rbcL and matK were not suitable for species identification of tropical trees from Xishuangbanna Nature Reserve, they could still help with identification at the family and genus level.
Considering the relative sequence recovery and the species identification performance, we recommend the use of trnH–psbA and ITS in combination as the preferred barcodes for tropical tree species identification in China. PMID:26121045
Approaching the taxonomic affiliation of unidentified sequences in public databases--an example from the mycorrhizal fungi.

PubMed

Nilsson, R Henrik; Kristiansson, Erik; Ryberg, Martin; Larsson, Karl-Henrik

2005-07-18

During the last few years, DNA sequence analysis has become one of the primary means of taxonomic identification of species, particularly so for species that are minute or otherwise lack distinct, readily obtainable morphological characters. Although the number of sequences available for comparison in public databases such as GenBank increases exponentially, only a minuscule fraction of all organisms have been sequenced, leaving taxon sampling a momentous problem for sequence-based taxonomic identification. When querying GenBank with a set of unidentified sequences, a considerable proportion typically lack fully identified matches, forming an ever-mounting pile of sequences that the researcher will have to monitor manually in the hope that new, clarifying sequences have been submitted by other researchers. To alleviate these concerns, a project to automatically monitor select unidentified sequences in GenBank for taxonomic progress through repeated local BLAST searches was initiated. Mycorrhizal fungi--a field where species identification often is prohibitively complex--and the much used ITS locus were chosen as test bed. A Perl script package called emerencia is presented. On a regular basis, it downloads select sequences from GenBank, separates the identified sequences from those insufficiently identified, and performs BLAST searches between these two datasets, storing all results in an SQL database. On the accompanying web-service http://emerencia.math.chalmers.se, users can monitor the taxonomic progress of insufficiently identified sequences over time, either through active searches or by signing up for e-mail notification upon disclosure of better matches. Other search categories, such as listing all insufficiently identified sequences (and their present best fully identified matches) publication-wise, are also available. 
The ever-increasing use of DNA sequences for identification purposes largely falls back on the assumption that public sequence databases contain a thorough sampling of taxonomically well-annotated sequences. Taxonomy, held by some to be an old-fashioned trade, has accordingly never been more important. emerencia does not automate the taxonomic process, but it does allow researchers to focus their efforts elsewhere than countless manual BLAST runs and arduous sieving of BLAST hit lists. The emerencia system is available on an open source basis for local installation with any organism and gene group as targets.
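The first step the abstract describes, separating fully identified sequences from insufficiently identified ones, can be sketched as follows. This is only an illustration in Python, not the emerencia Perl code: the keyword list, the function name `split_by_annotation`, and the accession labels are all assumptions made for the example.

```python
import re

# Assumed markers of an insufficiently identified entry (hypothetical list,
# not taken from emerencia itself).
_UNIDENTIFIED = re.compile(
    r"\b(uncultured|unidentified|environmental sample|sp\.|cf\.|aff\.)",
    re.IGNORECASE,
)


def split_by_annotation(records: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split accession -> organism-name records into two accession lists.

    Returns (identified, insufficiently_identified), each in input order.
    """
    identified, unidentified = [], []
    for accession, organism in records.items():
        if _UNIDENTIFIED.search(organism):
            unidentified.append(accession)
        else:
            identified.append(accession)
    return identified, unidentified
```

In a pipeline of the kind described, the second list would serve as the query set and the first as the reference set for the repeated BLAST searches.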
Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories.

PubMed

Woo, P C Y; Lau, S K P; Teng, J L L; Tse, H; Yuen, K-Y

2008-10-01

In the last decade, as a result of the widespread use of PCR and DNA sequencing, 16S rDNA sequencing has played a pivotal role in the accurate identification of bacterial isolates and the discovery of novel bacteria in clinical microbiology laboratories. For bacterial identification, 16S rDNA sequencing is particularly important in the case of bacteria with unusual phenotypic profiles, rare bacteria, slow-growing bacteria, uncultivable bacteria and culture-negative infections. Not only has it provided insights into aetiologies of infectious disease, but it also helps clinicians in choosing antibiotics and in determining the duration of treatment and infection control procedures. With the use of 16S rDNA sequencing, 215 novel bacterial species, 29 of which belong to novel genera, have been discovered from human specimens in the past 7 years of the 21st century (2001-2007). One hundred of the 215 novel species, 15 belonging to novel genera, have been found in four or more subjects. The largest number of novel species discovered were of the genera Mycobacterium (n = 12) and Nocardia (n = 6). The oral cavity/dental-related specimens (n = 19) and the gastrointestinal tract (n = 26) were the most important sites for discovery and/or reservoirs of novel species. Among the 100 novel species, Streptococcus sinensis, Laribacter hongkongensis, Clostridium hathewayi and Borrelia spielmanii have been most thoroughly characterized, with the reservoirs and routes of transmission documented, and S. sinensis, L. hongkongensis and C. hathewayi have been found globally. One of the greatest hurdles in putting 16S rDNA sequencing into routine use in clinical microbiology laboratories is automation of the technology. The only step that can be automated at the moment is input of the 16S rDNA sequence of the bacterial isolate for identification into one of the software packages that will generate the result of the identity of the isolate on the basis of its sequence database. 
However, studies on the accuracy of the software packages have given highly varied results, and interpretation of results remains difficult for most technicians, and even for clinical microbiologists. To fully utilize 16S rDNA sequencing in clinical microbiology, better guidelines are needed for interpretation of the identification results, and additional/supplementary methods are necessary for bacterial species that cannot be identified confidently by 16S rDNA sequencing alone.
Use of PCR-restriction fragment length polymorphism analysis for identification of yeast species isolated from bovine intramammary infection.

PubMed

Fadda, M E; Pisano, M B; Scaccabarozzi, L; Mossa, V; Deplano, M; Moroni, P; Liciardi, M; Cosentino, S

2013-01-01

This study reports a rapid PCR-based technique using a one-enzyme RFLP for discrimination of yeasts isolated from bovine clinical and subclinical mastitis milk samples. We analyzed a total of 1,486 milk samples collected over 1 yr in south Sardinia and northern Italy, and 142 yeast strains were preliminarily grouped based on their cultural morphology and physiological characteristics. Assimilation tests were conducted using the identification kit API ID 32C and APILAB Plus software (bioMérieux, Marcy l'Etoile, France). For PCR-RFLP analysis, the 18S-ITS1-5.8S ribosomal(r)DNA region was amplified and then digested with HaeIII, and dendrogram analysis of RFLP fragments was carried out. Furthermore, within each of the groups identified by the API or PCR-RFLP methods, the identification of isolates was confirmed by sequencing of the D1/D2 region using an ABI Prism 310 automatic sequencer (Applied Biosystems, Foster City, CA). The combined phenotypic and molecular approach enabled the identification of 17 yeast species belonging to the genera Candida (47.9%), Cryptococcus (21.1%), Trichosporon (19.7%), Geotrichum (7.1%), and Rhodotorula (4.2%). All Candida species were correctly identified by the API test and their identification confirmed by sequencing. All strains identified with the API system as Geotrichum candidum, Cryptococcus uniguttulatus, and Rhodotorula glutinis also produced characteristic restriction patterns and were confirmed as Galactomyces geotrichum (a teleomorph of G. candidum), Filobasidium uniguttulatum (teleomorph of Crypt. uniguttulatus), and R. glutinis, respectively, by D1/D2 rDNA sequencing. With regard to the genus Trichosporon, preliminary identification by API was problematic, whereas the RFLP technique used in this study gave characteristic restriction profiles for each species. 
Moreover, sequencing of the D1/D2 region allowed not only successful identification of Trichosporon gracile where API could not, but also correct identification of misidentified isolates. In conclusion, the 18S-ITS1-5.8S region appears to be useful in detecting genetic variability among yeast species, which is valuable for taxonomic purposes and for species identification. We have established an RFLP database for yeast species identified in milk samples using the software GelCompar II and the RFLP database constitutes an initial method for veterinary yeast identification. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
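The restriction step behind the RFLP profiles can be illustrated with an in silico digest. HaeIII recognizes GGCC and cuts between the second G and the first C, leaving blunt ends. The sketch below, in Python, assumes a linear amplicon and complete digestion; the function name is hypothetical and this is not the authors' GelCompar II workflow.

```python
def haeiii_fragment_lengths(seq: str) -> list[int]:
    """Return fragment lengths (5' to 3') after cutting at every GG^CC site."""
    seq = seq.upper()
    cuts = []
    start = seq.find("GGCC")
    while start != -1:
        cuts.append(start + 2)  # blunt cut between GG and CC
        start = seq.find("GGCC", start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]
```

Because GGCC is palindromic, scanning one strand finds every site. Sorting the returned lengths gives the band pattern that would be compared between species.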
Identification of RNA molecules by specific enzyme digestion and mass spectrometry: software for and implementation of RNA mass mapping

PubMed Central

Matthiesen, Rune; Kirpekar, Finn

2009-01-01

The idea of identifying or characterizing an RNA molecule based on a mass spectrum of specifically generated RNA fragments has been used in various forms for well over a decade. We have developed software—named RRM for ‘RNA mass mapping’—which can search whole prokaryotic genomes or RNA FASTA sequence databases to identify the origin of a given RNA based on a mass spectrum of RNA fragments. As input, the program uses the masses of specific RNase cleavage of the RNA under investigation. RNase T1 digestion is used here as a demonstration of the usability of the method for RNA identification. The concept for identification is that the masses of the digestion products constitute a specific fingerprint, which characterize the given RNA. The search algorithm is based on the same principles as those used in peptide mass fingerprinting, but has here been extended to work for both RNA sequence databases and for genome searches. A simple and powerful probability model for ranking RNA matches is proposed. We demonstrate viability of the entire setup by identifying the DNA template of a series of RNAs of biological and of in vitro transcriptional origin in complete microbial genomes and by identifying authentic 16S ribosomal RNAs in a ‘small ribosomal subunit RNA’ database. Thus, we present a new tool for a rapid identification of unknown RNAs using only a few picomoles of starting material. PMID:19264806
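The fingerprint idea can be sketched in a few lines of Python. RNase T1 cleaves on the 3' side of guanosine, so every fragment ends in G except possibly the last. The sketch below is not the RRM software: it uses base composition as a stand-in for fragment mass (fragments with the same composition have the same mass), and the function names and scoring are assumptions for illustration.

```python
from collections import Counter


def rnase_t1_fragments(rna: str) -> list[str]:
    """Cut an RNA sequence after every G, as RNase T1 does."""
    fragments, current = [], ""
    for base in rna.upper():
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:  # 3'-terminal fragment that does not end in G
        fragments.append(current)
    return fragments


def composition_fingerprint(rna: str) -> Counter:
    """Count fragments by base composition (a proxy for fragment mass)."""
    return Counter("".join(sorted(f)) for f in rnase_t1_fragments(rna))


def shared_fragment_count(query: str, candidate: str) -> int:
    """Number of fragment compositions the two digests have in common."""
    overlap = composition_fingerprint(query) & composition_fingerprint(candidate)
    return sum(overlap.values())
```

A database search of the kind described would score each candidate sequence against the observed fingerprint and rank the candidates; the probability model used for ranking in the paper is not reproduced here.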
Identification of Human N-Myristoylated Proteins from Human Complementary DNA Resources by Cell-Free and Cellular Metabolic Labeling Analyses

PubMed Central

Takamitsu, Emi; Otsuka, Motoaki; Haebara, Tatsuki; Yano, Manami; Matsuzaki, Kanako; Kobuchi, Hirotsugu; Moriya, Koko; Utsumi, Toshihiko

2015-01-01

To identify physiologically important human N-myristoylated proteins, 90 cDNA clones predicted to encode human N-myristoylated proteins were selected from a human cDNA resource (4,369 Kazusa ORFeome project human cDNA clones) by two bioinformatic N-myristoylation prediction systems, NMT-The MYR Predictor and Myristoylator. After database searches to exclude known human N-myristoylated proteins, 37 cDNA clones were selected as potential human N-myristoylated proteins. The susceptibility of these cDNA clones to protein N-myristoylation was first evaluated using fusion proteins in which the N-terminal ten amino acid residues were fused to an epitope-tagged model protein. Then, protein N-myristoylation of the gene products of full-length cDNAs was evaluated by metabolic labeling experiments both in an insect cell-free protein synthesis system and in transfected human cells. As a result, the products of 13 cDNA clones (FBXL7, PPM1B, SAMM50, PLEKHN, AIFM3, C22orf42, STK32A, FAM131C, DRICH1, MCC1, HID1, P2RX5, STK32B) were found to be human N-myristoylated proteins. Analysis of the role of protein N-myristoylation on the intracellular localization of SAMM50, a mitochondrial outer membrane protein, revealed that protein N-myristoylation was required for proper targeting of SAMM50 to mitochondria. Thus, the strategy used in this study is useful for the identification of physiologically important human N-myristoylated proteins from human cDNA resources. PMID:26308446

Identification of tissue-embedded ascarid larvae by ribosomal DNA sequencing.

PubMed

Ishiwata, Kenji; Shinohara, Akio; Yagi, Kinpei; Horii, Yoichiro; Tsuchiya, Kimiyuki; Nawa, Yukifumi

2004-01-01

Polymerase chain reaction (PCR) was applied to identify tissue-embedded ascarid nematode larvae. Two sequences of the internal transcribed spacer (ITS) regions of ribosomal DNA (rDNA), ITS1 and ITS2, of the ascarid parasites were amplified and compared with those of ascarid-nematodes registered in a DNA database (GenBank). The ITS sequences of the PCR products obtained from the ascarid parasite specimen in our laboratory were compatible with those of registered adult Ascaris and Toxocara parasites. PCR amplification of the ITS regions was sensitive enough to detect a single larva of Ascaris suum mixed with porcine liver tissue. Using this method, ascarid larvae embedded in the liver of a naturally infected turkey were identified as Toxocara canis. These results suggest that even a single larva embedded in tissues from patients with larva migrans could be identified by sequencing the ITS regions.
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server

PubMed Central

Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J

2006-01-01

Background DNA sequencing is used ubiquitously: from deciphering genomes[1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences[6]. Finding no easily accessible research tool to enter and analyze smRNA sequences we developed Ebbie to assist us with our study. Results Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches for any substring within a DNA sequencing text file, which is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project[6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest.
The reliable storage of inserts, and their annotation in a MySQL database, BlastN[9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11]. PMID:16584563
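The excision step described above, finding any substring flanked by two constant strings, can be sketched with a regular expression. This Python fragment is an illustration only and is not Ebbie's code; the function name and the flank sequences in the usage note are hypothetical.

```python
import re


def extract_inserts(read: str, five_prime: str, three_prime: str) -> list[str]:
    """Return every shortest substring lying between the two constant flanks."""
    pattern = re.compile(
        re.escape(five_prime) + r"(.+?)" + re.escape(three_prime)
    )
    return pattern.findall(read)
```

For example, with the made-up flanks `CTGTAG` and `GAATTC`, calling `extract_inserts(read, "CTGTAG", "GAATTC")` returns the list of inserts found in `read`, which could then be stored and compared by BLAST as the abstract describes. The non-greedy `.+?` keeps each match as short as possible, so concatenated inserts in one read are returned separately.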
Applications of three DNA barcodes in assorting intertidal red macroalgal flora in Qingdao, China

NASA Astrophysics Data System (ADS)

Zhao, Xiaobo; Pang, Shaojun; Shan, Tifeng; Liu, Feng

2013-03-01

This study is part of the endeavor to construct a comprehensive DNA barcoding database for common seaweeds in China. Red seaweeds have simple morphology and anatomy, so identifying them from morphological characteristics alone is sometimes difficult. In recent years, DNA barcoding has become an increasingly effective tool for resolving some of these taxonomic difficulties. Some DNA markers such as COI (cytochrome oxidase subunit I) have been proposed as standardized DNA barcodes for all seaweed species. In this study, COI, UPA (universal plastid amplicon, domain V of 23S rRNA), and ITS (nuclear internal transcribed spacer) were employed to analyze common species of intertidal red seaweeds in Qingdao (119.3°-121°E, 35.35°-37.09°N). The applicability of using one or a few combined barcodes to identify red seaweed species was tested. The results indicated that COI is a sensitive marker at the species level. However, not all the tested species gave PCR amplification products, owing to the lack of universal primers. The second barcode, UPA, had effective universal primers but still needs to be tested for its effectiveness in resolving closely related species. More than one ITS sequence type was found in some species in this investigation, which might lead to confusion in further analysis. Therefore, ITS is not recommended as a universal barcode for seaweed identification.
Candida guilliermondii and Other Species of Candida Misidentified as Candida famata: Assessment by Vitek 2, DNA Sequencing Analysis, and Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry in Two Global Antifungal Surveillance Programs

PubMed Central

Woosley, Leah N.; Diekema, Daniel J.; Jones, Ronald N.; Pfaller, Michael A.

2013-01-01

Candida famata (teleomorph Debaryomyces hansenii) has been described as a medically relevant yeast, and this species has been included in many commercial identification systems that are currently used in clinical laboratories. Among 53 strains collected during the SENTRY and ARTEMIS surveillance programs and previously identified as C. famata (includes all submitted strains with this identification) by a variety of commercial methods (Vitek, MicroScan, API, and AuxaColor), DNA sequencing methods demonstrated that 19 strains were C. guilliermondii, 14 were C. parapsilosis, 5 were C. lusitaniae, 4 were C. albicans, and 3 were C. tropicalis, and five isolates belonged to other Candida species (two C. fermentati and one each C. intermedia, C. pelliculosa, and Pichia fabianii). Additionally, three misidentified C. famata strains were correctly identified as Kodamaea ohmeri, Debaryomyces nepalensis, and Debaryomyces fabryi using internal transcribed spacer (ITS) and/or intergenic spacer (IGS) sequencing. The Vitek 2 system identified three isolates with high confidence to be C. famata and another 15 with low confidence between C. famata and C. guilliermondii or C. parapsilosis, displaying only 56.6% agreement with DNA sequencing results. Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) results displayed 81.1% agreement with DNA sequencing. One strain each of C. metapsilosis, C. fermentati, and C. intermedia demonstrated a low score for identification (<2.0) in the MALDI Biotyper. K. ohmeri, D. nepalensis, and D. fabryi identified by DNA sequencing in this study were not in the current database for the MALDI Biotyper. These results suggest that the occurrence of C. famata in fungal infections is much lower than previously appreciated and that commercial systems do not produce accurate identifications except for the newly introduced MALDI-TOF instruments. PMID:23100350
Megadalton Complexes in the Chloroplast Stroma of Arabidopsis thaliana Characterized by Size Exclusion Chromatography, Mass Spectrometry, and Hierarchical Clustering

PubMed Central

Olinares, Paul Dominic B.; Ponnala, Lalit; van Wijk, Klaas J.

2010-01-01

To characterize MDa-sized macromolecular chloroplast stroma protein assemblies and to extend coverage of the chloroplast stroma proteome, we fractionated soluble chloroplast stroma in the non-denatured state by size exclusion chromatography with a size separation range up to ∼5 MDa. To maximize protein complex stability and resolution of megadalton complexes, ionic strength and composition were optimized. Subsequent high accuracy tandem mass spectrometry analysis (LTQ-Orbitrap) identified 1081 proteins across the complete native mass range. Protein complexes and assembly states above 0.8 MDa were resolved using hierarchical clustering, and protein heat maps were generated from normalized protein spectral counts for each of the size exclusion chromatography fractions; this complemented previous analysis of stromal complexes up to 0.8 MDa (Peltier, J. B., Cai, Y., Sun, Q., Zabrouskov, V., Giacomelli, L., Rudella, A., Ytterberg, A. J., Rutschow, H., and van Wijk, K. J. (2006) The oligomeric stromal proteome of Arabidopsis thaliana chloroplasts. Mol. Cell. Proteomics 5, 114–133). This combined experimental and bioinformatics analyses resolved chloroplast ribosomes in different assembly and functional states (e.g. 30, 50, and 70 S), which enabled the identification of plastid homologues of prokaryotic ribosome assembly factors as well as proteins involved in co-translational modifications, targeting, and folding. The roles of these ribosome-associating proteins will be discussed. Known RNA splice factors (e.g. CAF1/WTF1/RNC1) as well as uncharacterized proteins with RNA-binding domains (pentatricopeptide repeat, RNA recognition motif, and chloroplast ribosome maturation), RNases, and DEAD box helicases were found in various sized complexes. Chloroplast DNA (>3 MDa) was found in association with the complete heteromeric plastid-encoded DNA polymerase complex, and a dozen other DNA-binding proteins, e.g. DNA gyrase, topoisomerase, and various DNA repair enzymes. 
The heteromeric ≥5-MDa pyruvate dehydrogenase complex and the 0.8–1-MDa acetyl-CoA carboxylase complex associated with uncharacterized biotin carboxyl carrier domain proteins constitute the entry point to fatty acid metabolism in leaves; we suggest that their large size relates to the need for metabolic channeling. Protein annotations and identification data are available through the Plant Proteomics Database, and mass spectrometry data are available through Proteomics Identifications database. PMID:20423899
Functional Genomics Analysis of Singapore Grouper Iridovirus: Complete Sequence Determination and Proteomic Analysis

PubMed Central

Song, Wen Jun; Qin, Qi Wei; Qiu, Jin; Huang, Can Hua; Wang, Fan; Hew, Choy Leong

2004-01-01

Here we report the complete genome sequence of Singapore grouper iridovirus (SGIV). Sequencing of the random shotgun and restriction endonuclease genomic libraries showed that the entire SGIV genome consists of 140,131 nucleotide bp. One hundred sixty-two open reading frames (ORFs) from the sense and antisense DNA strands, coding for lengths varying from 41 to 1,268 amino acids, were identified. Computer-assisted analyses of the deduced amino acid sequences revealed that 77 of the ORFs exhibited homologies to known virus genes, 23 of which matched functional iridovirus proteins. Forty-two putative conserved domains or signatures were detected in the National Center for Biotechnology Information CD-Search database and PROSITE database. An assortment of enzyme activities involved in DNA replication, transcription, nucleotide metabolism, cell signaling, etc., were identified. Viruses were cultured on a cell line derived from the embryonated egg of the grouper Epinephelus tauvina, isolated, and purified by sucrose gradient ultracentrifugation. The protein extract from the purified virions was analyzed by polyacrylamide gel electrophoresis followed by in-gel digestion of protein bands. Matrix-assisted laser desorption ionization-time of flight mass spectrometry and database searching led to identification of 26 proteins. Twenty of these represented novel or previously unidentified genes, which were further confirmed by reverse transcription-PCR (RT-PCR) and DNA sequencing of their respective RT-PCR products. PMID:15507645
Genetic diversity of armored scales (Hemiptera: Diaspididae) and soft scales (Hemiptera: Coccidae) in Chile.

PubMed

Amouroux, P; Crochard, D; Germain, J-F; Correa, M; Ampuero, J; Groussier, G; Kreiter, P; Malausa, T; Zaviezo, T

2017-05-17

Scale insects (Sternorrhyncha: Coccoidea) are one of the most invasive and agriculturally damaging insect groups. Their management and the development of new control methods are currently jeopardized by the scarcity of identification data, in particular in regions where no large survey coupling morphological and DNA analyses have been performed. In this study, we sampled 116 populations of armored scales (Hemiptera: Diaspididae) and 112 populations of soft scales (Hemiptera: Coccidae) in Chile, over a latitudinal gradient ranging from 18°S to 41°S, on fruit crops, ornamental plants and trees. We sequenced the COI and 28S genes in each population. In total, 19 Diaspididae species and 11 Coccidae species were identified morphologically. From the 63 COI haplotypes and the 54 28S haplotypes uncovered, and using several DNA data analysis methods (Automatic Barcode Gap Discovery, K2P distance, NJ trees), up to 36 genetic clusters were detected. Morphological and DNA data were congruent, except for three species (Aspidiotus nerii, Hemiberlesia rapax and Coccus hesperidum) in which DNA data revealed highly differentiated lineages. More than 50% of the haplotypes obtained had no high-scoring matches with any of the sequences in the GenBank database. This study provides 63 COI and 54 28S barcode sequences for the identification of Coccoidea from Chile.
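The K2P distance named among the analysis methods above is the standard Kimura two-parameter distance, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transition and transversion differences between two aligned sequences. A minimal Python sketch of that formula follows. It is not the authors' pipeline; the function name `k2p_distance` and the choice to skip gaps and ambiguity codes are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Hypothetical helper for illustration: sites where either sequence has
    a gap or an ambiguity code are skipped (an assumption, not a rule
    taken from the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError when the sequences are too
    # divergent for the correction to be defined (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, `k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` compares ten sites with one transition, so P = 0.1, Q = 0, and the distance is -1/2 ln(0.8), roughly 0.112.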
Evaluating multiplexed next-generation sequencing as a method in palynology for mixed pollen samples.

PubMed

Keller, A; Danner, N; Grimmer, G; Ankenbrand, M; von der Ohe, K; von der Ohe, W; Rost, S; Härtel, S; Steffan-Dewenter, I

2015-03-01

The identification of pollen plays an important role in ecology, palaeo-climatology, honey quality control and other areas. Currently, expert knowledge and reference collections are essential to identify pollen origin through light microscopy. Pollen identification through molecular sequencing and DNA barcoding has been proposed as an alternative approach, but the assessment of mixed pollen samples originating from multiple plant species is still a tedious and error-prone task. Next-generation sequencing has been proposed to avoid this hindrance. In this study we assessed mixed pollen probes through next-generation sequencing of amplicons from the highly variable, species-specific internal transcribed spacer 2 region of nuclear ribosomal DNA. Further, we developed a bioinformatic workflow to analyse these high-throughput data with a newly created reference database. To evaluate the feasibility, we compared results from classical identification based on light microscopy from the same samples with our sequencing results. We assessed in total 16 mixed pollen samples, 14 originated from honeybee colonies and two from solitary bee nests. The sequencing technique resulted in higher taxon richness (deeper assignments and more identified taxa) compared to light microscopy. Abundance estimations from sequencing data were significantly correlated with counted abundances through light microscopy. Simulation analyses of taxon specificity and sensitivity indicate that 96% of taxa present in the database are correctly identifiable at the genus level and 70% at the species level. Next-generation sequencing thus presents a useful and efficient workflow to identify pollen at the genus and species level without requiring specialised palynological expert knowledge. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Blood meal identification and parasite detection in laboratory-fed and field-captured Lutzomyia longipalpis by PCR using FTA databasing paper

PubMed Central

Sant’Anna, Mauricio R.V.; Jones, Nathaniel G.; Hindley, Jonathan A.; Mendes-Sousa, Antonio F.; Dillon, Rod J.; Cavalcante, Reginaldo R.; Alexander, Bruce; Bates, Paul A.

2008-01-01

The phlebotomine sand fly Lutzomyia longipalpis takes blood from a variety of wild and domestic animals and transmits Leishmania (Leishmania) infantum chagasi, etiological agent of American visceral leishmaniasis. Blood meal identification in sand flies has depended largely on serological methods but a new protocol described here uses filter-based technology to stabilise and store blood meal DNA, allowing subsequent PCR identification of blood meal sources, as well as parasite detection, in blood-fed sand flies. This technique revealed that 53.6% of field-collected sand flies captured in the back yards of houses in Teresina (Brazil) had fed on chickens. The potential applications of this technique in epidemiological studies and strategic planning for leishmaniasis control programmes are discussed. PMID:18606150
Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease.

PubMed

Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao; Ning, Shangwei; Jin, Lianhong; Li, Xia

2018-01-04

Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification of Uvaria sp by barcoding coupled with high-resolution melting analysis (Bar-HRM).

PubMed

Osathanunkul, M; Madesis, P; Ounjai, S; Pumiputavon, K; Somboonchai, R; Lithanatudom, P; Chaowasku, T; Wipasa, J; Suwannapoom, C

2016-01-13

DNA barcoding, which was developed about a decade ago, relies on short, standardized regions of the genome to identify plant and animal species. This method can be used not only to identify known species but also to discover novel ones. Numerous sequences are stored in online databases worldwide. One way to save cost and time in species identification (by omitting the sequencing step) is to use available barcode data to design optimized primers for further analysis, such as high-resolution melting analysis (HRM). This study aimed to determine the effectiveness of the hybrid method Bar-HRM (DNA barcoding combined with HRM) for identifying species that share similar external morphological features, as an alternative to traditional taxonomic identification, which requires major parts (leaf, flower, fruit) of the specimens. The specimens used for testing were those that could not be identified at the species level and could be either Uvaria longipes or Uvaria wrayi, as indicated by morphological identification. Primer pairs derived from chloroplast regions (matK, psbA-trnH, rbcL, and trnL) were used in the Bar-HRM. Only the psbA-trnH primers gave results good enough to help identify the specimens. Bar-HRM analysis proved to be a fast and cost-effective method for plant species identification.
Forensic botany: species identification of botanical trace evidence using a multigene barcoding approach.

PubMed

Ferri, Gianmarco; Alù, Milena; Corradini, Beatrice; Beduschi, Giovanni

2009-09-01

Forensic botany can provide significant supporting evidence during criminal investigations. However, it is still an underutilized field of investigation, with its most common application limited to identifying specific as well as suspected illegal plants. The ubiquitous presence of plant species can be useful in forensics, but the absence of an accurate identification system remains the major obstacle to routinely and correctly identifying trace botanical evidence. Many plant materials cannot be identified and differentiated to the species level by traditional morphological characteristics when botanical specimens are degraded and lack physical features. By taking advantage of a universal barcode system, DNA sequencing, and other biomolecular techniques used routinely in forensic investigations, two chloroplast DNA regions were evaluated for their use as "barcoding" markers for plant identification in the field of forensics. We therefore investigated the forensic use of two non-coding plastid regions, psbA-trnH and trnL-trnF, to create a multimarker system for species identification that could be useful throughout the plant kingdom. The sequences from 63 plants belonging to our local flora were submitted and registered on the GenBank database. Sequence comparison through BLAST algorithms, used to establish the level of identification (species, genus, or family), allowed us to assess the suitability of this method. The results confirmed the effectiveness of our botanic universal multimarker assay in forensic investigations.
Forensic analysis of mtDNA haplotypes from two rural communities in Haiti reflects their population history.

PubMed

Wilson, Jamie L; Saint-Louis, Vertus; Auguste, Jensen O; Jackson, Bruce A

2012-11-01

Very little genetic data exist on Haitians, an estimated 1.2 million of whom, not including illegal immigrants, reside in the United States. The absence of genetic data on a population of this size reduces the discriminatory power of criminal and missing-person DNA databases in the United States and Caribbean. We present a forensic population study that provides the first genetic data set for Haiti. This study uses hypervariable segment one (HVS-1) mitochondrial DNA (mtDNA) nucleotide sequences from 291 subjects primarily from rural areas of northern and southern Haiti, where admixture would be minimal. Our results showed that the African maternal genetic component of Haitians had slightly higher West-Central African admixture than African-Americans and Dominicans, but considerably less than Afro-Brazilians. These results lay the foundation for further forensic genetics studies in the Haitian population and serve as a model for forensic mtDNA identification of individuals in other isolated or rural communities. © 2012 American Academy of Forensic Sciences.
Identification of promising DNA GyrB inhibitors for Tuberculosis using pharmacophore-based virtual screening, molecular docking and molecular dynamics studies.

PubMed

Islam, Md Ataul; Pillay, Tahir S

2017-08-01

In this study, we searched for potential DNA GyrB inhibitors using pharmacophore-based virtual screening followed by molecular docking and molecular dynamics simulation approaches. For this purpose, a set of 248 DNA GyrB inhibitors was collected from the literature and a well-validated pharmacophore model was generated. The best pharmacophore model explained that two each of hydrogen bond acceptors and hydrophobicity regions were critical for inhibition of DNA GyrB. Good statistical results of the pharmacophore model indicated that the model was robust in nature. Virtual screening of molecular databases revealed three molecules as potential antimycobacterial agents. The final screened promising compounds were evaluated in molecular docking and molecular dynamics simulation studies. In the molecular dynamics studies, RMSD and RMSF values undoubtedly explained that the screened compounds formed stable complexes with DNA GyrB. Therefore, it can be concluded that the compounds identified may have potential for the treatment of TB. © 2017 John Wiley & Sons A/S.
Lindnera (Pichia) fabianii blood infection after mesenteric ischemia.

PubMed

Gabriel, Frederic; Noel, Thierry; Accoceberry, Isabelle

2012-04-01

Lindnera (Pichia) fabianii (teleomorph of Candida fabianii) is a yeast species rarely involved in human infections. This report describes the first known human case of a Lindnera fabianii blood infection after mesenteric ischemia. The 53-year-old patient was hospitalized in the intensive care unit after a suicide attempt and was suffering from a mesenteric ischemia and acute renal failure. Lindnera fabianii was recovered from an oropharyngeal swab, then isolated from stool and urine samples before the diagnosis of the blood infection. Caspofungin intravenous treatment was associated with a successful outcome. Final unequivocal identification of the strain was done by sequencing the internal transcribed spacer (ITS) region, and regions of 18S rDNA gene and of the translation elongation factor-1α gene. Until our work, the genomic databases did not contain the complete ITS region of L. fabianii as a single nucleotide sequence (encompassing ITS1, the 5.8S rDNA and ITS2), and misidentification with other yeast species, e.g., Lindnera (Pichia) mississippiensis, could have occurred. Our work demonstrates that the usual DNA barcoding method based on sequencing of the ITS region may fail to provide the correct identification of some taxa, and that partial sequencing of the EF1α gene may be much more effective for the accurate delineation and molecular identification of new emerging opportunistic yeast pathogens.
Screening, Isolation and Identification of Probiotic Producing Lactobacillus acidophilus Strains EMBS081 & EMBS082 by 16S rRNA Gene Sequencing.

PubMed

Chandok, Harshpreet; Shah, Pratik; Akare, Uday Raj; Hindala, Maliram; Bhadoriya, Sneha Singh; Ravi, G V; Sharma, Varsha; Bandaru, Srinivas; Rathore, Pragya; Nayarisseri, Anuraj

2015-09-01

16S rDNA sequencing, which has gained wide popularity among microbiologists for the molecular characterization and identification of newly discovered isolates, provides accurate identification of isolates down to the sub-species (strain) level. Its most important advantage over traditional biochemical characterization methods is that it can also accurately identify strains with atypical phenotypic characters. The following work applies the 16S rRNA gene sequencing approach to identify novel strains of probiotic Lactobacillus acidophilus. Samples were collected from pond water in rural and urban areas of Krishna district, Vijayawada, Andhra Pradesh, India. Subsequently, the sample was serially diluted and the aliquots were incubated for a suitable time period, after which the suspected colony was subjected to 16S rDNA sequencing. The sequences, aligned against those of other species, were concluded to represent novel probiotic L. acidophilus bacteria, which were named L. acidophilus strains EMBS081 and EMBS082. After sequence characterization, the sequences were deposited in the GenBank database, maintained by the National Centre for Biotechnology Information (NCBI). The sequences can also be retrieved from the EMBL and DDBJ repositories with accession numbers JX255677 and KC150145.
RiceFOX: a database of Arabidopsis mutant lines overexpressing rice full-length cDNA that contains a wide range of trait information to facilitate analysis of gene function.

PubMed

Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami

2011-02-01

Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named 'RiceFOX'. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/.
RiceFOX: A Database of Arabidopsis Mutant Lines Overexpressing Rice Full-Length cDNA that Contains a Wide Range of Trait Information to Facilitate Analysis of Gene Function

PubMed Central

Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami

2011-01-01

Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named ‘RiceFOX’. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/. PMID:21186176
FragIdent--automatic identification and characterisation of cDNA-fragments.

PubMed

Seelow, Dominik; Goehler, Heike; Hoffmann, Katrin

2009-03-02

Many genetic studies and functional assays are based on cDNA fragments. After the generation of cDNA fragments from an mRNA sample, their content is at first unknown and must be assigned by sequencing reactions or hybridisation experiments. Even in characterised libraries, a considerable number of clones are wrongly annotated. Furthermore, mix-ups can happen in the laboratory. It is therefore essential to the relevance of experimental results to confirm or determine the identity of the employed cDNA fragments. However, the manual approach for the characterisation of these fragments using BLAST web interfaces is not suited for larger number of sequences and so far, no user-friendly software is publicly available. Here we present the development of FragIdent, an application for the automatic identification of open reading frames (ORFs) within cDNA-fragments. The software performs BLAST analyses to identify the genes represented by the sequences and suggests primers to complete the sequencing of the whole insert. Gene-specific information as well as the protein domains encoded by the cDNA fragment are retrieved from Internet-based databases and included in the output. The application features an intuitive graphical interface and is designed for researchers without any bioinformatics skills. It is suited for projects comprising up to several hundred different clones. We used FragIdent to identify 84 cDNA clones from a yeast two-hybrid experiment. Furthermore, we identified 131 protein domains within our analysed clones. The source code is freely available from our homepage at http://compbio.charite.de/genetik/FragIdent/.
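The ORF-identification step described above can be illustrated with a short Python sketch. This is not FragIdent's source code: it covers only a naive six-frame scan for ATG-to-stop ORFs and leaves out the BLAST searches, primer suggestion, and domain lookup that the abstract mentions. The names `find_orfs` and `min_codons` and the default length threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Return (strand, start, end, orf) tuples for ATG..stop ORFs.

    All six reading frames are scanned. Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned ("+" is the
    input, "-" is its reverse complement). `min_codons` counts the coding
    codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

Calling `find_orfs("CCATGAAATTTGGGTAACC", min_codons=3)` reports one ORF on the forward strand, `ATGAAATTTGGGTAA`. With the default threshold of 30 codons the same fragment yields no ORFs, which is the kind of filtering a real tool would tune to its clone lengths.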
R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring.

PubMed

Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès

2016-01-01

Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which are time consuming and prone to identification uncertainties. The use of DNA barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are assigned to diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematics supported by INRA (French National Institute for Agricultural Research); see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA barcodes to their taxonomical identifications and is dedicated to identifying barcodes from natural samples. The data come from two sources: a culture collection of freshwater algae maintained at INRA, in which new strains are regularly deposited and barcoded, and the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database because of their efficiency: 18S (18S ribosomal RNA) and rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase). Data are curated using innovative (Declic) and classical bioinformatic tools (BLAST, classical phylogenies) and up-to-date taxonomy (catalogues and peer-reviewed papers). R-Syst::diatom is updated every 6 months. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/. © The Author(s) 2016. Published by Oxford University Press.

R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring

PubMed Central

Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès

2016-01-01

Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which are time consuming and prone to identification uncertainties. The use of DNA barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are assigned to diatom taxa by comparing the sequences to a reference barcoding library using algorithms. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematics supported by INRA (French National Institute for Agricultural Research); see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA barcodes to their taxonomical identifications and is dedicated to identifying barcodes from natural samples. The data come from two sources: a culture collection of freshwater algae maintained at INRA, in which new strains are regularly deposited and barcoded, and the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database because of their efficiency: 18S (18S ribosomal RNA) and rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase). Data are curated using innovative (Declic) and classical bioinformatic tools (BLAST, classical phylogenies) and up-to-date taxonomy (catalogues and peer-reviewed papers). R-Syst::diatom is updated every 6 months. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). We present here the content of the library regarding the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) or ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/ PMID:26989149
The practical evaluation of DNA barcode efficacy.

PubMed

Spouge, John L; Mariño-Ramírez, Leonardo

2012-01-01

This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
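The nearest-neighbor assignment and the averaging of species PCIs described above can be sketched in Python. The abstract does not give the chapter's exact PCI formula, so this sketch rests on stated assumptions: a species PCI is taken here as the fraction of that species' specimens whose nearest neighbors (leaving the specimen itself out) all belong to the same species, ties across species count as failures, and a simple p-distance on pre-aligned, equal-length sequences stands in for a global-alignment distance. The names `p_distance` and `overall_pci` are hypothetical.

```python
from collections import defaultdict


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def overall_pci(specimens):
    """Leave-one-out nearest-neighbor PCI (illustrative definition).

    `specimens` is a list of (species, sequence) pairs with at least two
    entries. Returns (overall PCI, {species: species PCI}); the overall
    PCI is the unweighted mean of the species PCIs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for i, (species, seq) in enumerate(specimens):
        others = [
            (p_distance(seq, other_seq), other_species)
            for j, (other_species, other_seq) in enumerate(specimens)
            if j != i
        ]
        best = min(d for d, _ in others)
        nearest_species = {sp for d, sp in others if d == best}
        totals[species] += 1
        # Correct only if every nearest neighbor is conspecific (assumption).
        if nearest_species == {species}:
            hits[species] += 1
    per_species = {sp: hits[sp] / totals[sp] for sp in totals}
    return sum(per_species.values()) / len(per_species), per_species


# Toy data, invented for illustration only.
toy = [
    ("sp_A", "AAAAAAAA"),
    ("sp_A", "AAAAAAAT"),
    ("sp_B", "CCCCCCCC"),
    ("sp_B", "CCCCCCCA"),
    ("sp_C", "AAAAAATT"),
]
pci, by_species = overall_pci(toy)
print(pci, by_species)
```

Because the overall value averages over species rather than specimens, a species represented by a single specimen (such as `sp_C` in the toy data) pulls the result down as much as a well-sampled species does.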
DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability

PubMed Central

Little, Damon P.

2011-01-01

For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple–sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple–sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment–free sequence identification algorithm–BRONX–that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple–sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user–defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini–barcode queries against a full–length barcode database). BRONX consistently produced better identifications at the genus–level for all query types. PMID:21857897
cDNA cloning of Brassica napus malonyl-CoA:ACP transacylase (MCAT) (fab D) and complementation of an E. coli MCAT mutant.

PubMed

Simon, J W; Slabas, A R

1998-09-18

The GenBank database was searched using the E. coli malonyl CoA:ACP transacylase (MCAT) sequence, for plant protein/cDNA sequences corresponding to MCAT, a component of plant fatty acid synthetase (FAS), for which the plant cDNA has not been isolated. A 272-bp Zea mays EST sequence (GenBank accession number: AA030706) was identified which has strong homology to the E. coli MCAT. A PCR-derived cDNA probe from Zea mays was used to screen a Brassica napus (rape) cDNA library. This resulted in the isolation of a 1200-bp cDNA clone which encodes an open reading frame corresponding to a protein of 351 amino acids. The protein shows 47% homology to the E. coli MCAT amino acid sequence in the coding region for the mature protein. Expression of a plasmid (pMCATrap2) containing the plant cDNA sequence in Fab D89, an E. coli mutant defective in MCAT activity, restores growth, demonstrating functional complementation and direct function of the cloned cDNA. This is the first functional evidence supporting the identification of a plant cDNA for MCAT.
Identification of an expressed gene in Dipylidium caninum.

PubMed

Miranda, Rodrigo R C; Costa-Júnior, Livio M; Campos, Artur K; Santos, Hudson A; Rabelo, Elida M L

2004-10-01

Recombinant DNA studies have been focused on developing vaccines to different cestodes. But few studies involving Dipylidium caninum molecular biology and genes have been done. Only partial sequences of mitochondrial DNA and ribosomal RNA gene are available in databases. Any molecular work with this parasite, including epidemiology, study of drug-resistant strains, and vaccine development, is hampered by the lack of knowledge of its genome. Thus, the knowledge of specific genes of different developmental stages of D. caninum is crucial to locate potential targets to be used as candidates to develop a vaccine and/or new drugs against this parasite. Here we report, for the first time, the sequencing of a fragment of a D. caninum expressed gene.
Multicenter Evaluation of the Vitek MS v3.0 System for the Identification of Filamentous Fungi.

PubMed

Rychert, Jenna; Slechta, E Sue; Barker, Adam P; Miranda, Edwin; Babady, N Esther; Tang, Yi-Wei; Gibas, Connie; Wiederhold, Nathan; Sutton, DeAnna; Hanson, Kimberly E

2018-02-01

Invasive fungal infections are an important cause of morbidity and mortality affecting primarily immunocompromised patients. While fungal identification to the species level is critical to providing appropriate therapy, it can be slow and laborious and often relies on subjective morphological criteria. The use of matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry has the potential to speed up and improve the accuracy of identification. In this multicenter study, we evaluated the accuracy of the Vitek MS v3.0 system in identifying 1,601 clinical mold isolates compared to identification by DNA sequence analysis and supported by morphological and phenotypic testing. Among the 1,519 isolates representing organisms in the v3.0 database, 91% ( n = 1,387) were correctly identified to the species level. An additional 27 isolates (2%) were correctly identified to the genus level. Fifteen isolates were incorrectly identified, due to either a single incorrect identification ( n = 13) or multiple identifications from different genera ( n = 2). In those cases, when a single identification was provided that was not correct, the misidentification was within the same genus. The Vitek MS v3.0 was unable to identify 91 (6%) isolates, despite repeat testing. These isolates were distributed among all the genera. When considering all isolates tested, even those that were not represented in the database, the Vitek MS v3.0 provided a single correct identification 98% of the time. These findings demonstrate that the Vitek MS v3.0 system is highly accurate for the identification of common molds encountered in the clinical mycology laboratory. Copyright © 2018 American Society for Microbiology.
Novel primers for complete mitochondrial cytochrome b gene sequencing in mammals

USGS Publications Warehouse

Naidu, Ashwin; Fitak, Robert R.; Munguia-Vega, Adrian; Culver, Melanie

2011-01-01

Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.
Fish species identification using PCR-RFLP analysis and lab-on-a-chip capillary electrophoresis: application to detect white fish species in food products and an interlaboratory study.

PubMed

Dooley, John J; Sage, Helen D; Clarke, Marie-Anne L; Brown, Helen M; Garrett, Stephen D

2005-05-04

Identification of 10 white fish species associated with U.K. food products was achieved using PCR-RFLP of the mitochondrial cytochrome b gene. Use of lab-on-a-chip capillary electrophoresis for end-point analysis enabled accurate sizing of DNA fragments and identification of fish species at a level of 5% (w/w) in a fish admixture. One restriction enzyme, DdeI, allowed discrimination of eight species. When combined with NlaIII and HaeIII, specific profiles for all 10 species were generated. The method was applied to a range of products and subjected to an interlaboratory study carried out by five U.K. food control laboratories. One hundred percent correct identification of single species samples and six of nine admixture samples was achieved by all laboratories. The results indicated that fish species identification could be carried out using a database of PCR-RFLP profiles without the need for reference materials.
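A PCR-RFLP profile of the kind stored in such a database can be approximated in silico by cutting an amplicon at each enzyme's recognition site and recording the fragment lengths. The sketch below is an illustration only, not the authors' workflow. It uses the standard recognition sites of the three enzymes named above (DdeI C^TNAG, NlaIII CATG^, HaeIII GG^CC); the function names and any sequences passed in are hypothetical.

```python
import re

# Recognition pattern and top-strand cut offset within the site.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "DdeI": ("CT[ACGT]AG", 1),   # C^TNAG
    "NlaIII": ("CATG", 4),       # CATG^
    "HaeIII": ("GGCC", 2),       # GG^CC
}


def digest(seq, enzyme):
    """Return fragment lengths (in bp, 5' to 3') after a complete digest of a linear sequence."""
    pattern, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # A lookahead is used so that overlapping sites, if any, are all reported.
    cuts = [m.start() + offset for m in re.finditer(f"(?={pattern})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def rflp_profile(seq, enzymes=("DdeI", "NlaIII", "HaeIII")):
    """Return {enzyme: fragment lengths sorted largest first} for one amplicon."""
    return {name: sorted(digest(seq, name), reverse=True) for name in enzymes}
```

Comparing an observed set of fragment sizes with stored profiles then reduces to a dictionary lookup, although real electrophoresis data would need a sizing tolerance that this sketch does not model.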
Development of a Genome-Proxy Microarray for Profiling Marine Microbial Communities and its Application to a Time Series in Monterey Bay, California

DTIC Science & Technology

2008-09-01

community representation. 12 survey a complex microbial community. Community DNA or rRNA extracted from a sample may require amplification before...restricted to cultivated clades, since not only do many clades have sufficient database representation due to 16S environmental surveys, but such...well developed for standard and comprehensive surveys. Depending on the population being targeted and the identification method, FCM can be a
RISSC: a novel database for ribosomal 16S–23S RNA genes spacer regions

PubMed Central

García-Martínez, Jesús; Bescós, Ignacio; Rodríguez-Sala, Jesús Javier; Rodríguez-Valera, Francisco

2001-01-01

A novel database, under the acronym RISSC (Ribosomal Intergenic Spacer Sequence Collection), has been created. It compiles more than 1600 entries of edited DNA sequence data from the 16S–23S ribosomal spacers present in most prokaryotes and organelles (e.g. mitochondria and chloroplasts) and is accessible through the Internet (http://ulises.umh.es/RISSC), where systematic searches for specific words can be conducted, as well as BLAST-type sequence searches. Additionally, a characteristic feature of this region, the presence/absence and nature of tRNA genes within the spacer, is included in all the entries, even when not previously indicated in the original database. All these combined features could provide a useful documentation tool for studies on evolution, identification, typing and strain characterization, among others. PMID:11125084
Pioneer identification of fake tiger claws using morphometric and DNA-based analysis in wildlife forensics in India.

PubMed

Vipin; Sharma, Vinita; Sharma, Chandra Prakash; Kumar, Ved Prakash; Goyal, Surendra Prakash

2016-09-01

The illegal trade in wildlife is a serious threat to the existence of wild animals throughout the world. The short supply and high demand for wildlife articles have caused an influx of many different forms of fake wildlife articles into this trade. The task of identifying the materials used in making such articles poses challenges in wildlife forensics as different approaches are required for species identification. Claws constitute 3.8% of the illegal animal parts (n=2899) received at the Wildlife Institute of India (WII) for species identification. We describe the identification of seized suspected tiger claws (n=18) using a combined approach of morphometric and DNA-based analysis. The differential keratin density, determined using X-ray radiographs, indicated that none of the 18 claws were of any large cat but were fake. We determined three claw measurements, viz. ac (from the external coronary dermo-epidermal interface to the epidermis of the skin fold connecting the palmar flanges of the coronary horn), bc (from the claw tip to the epidermis of the skin fold connecting the palmar flanges of the coronary horn) and the ratio bc/ac, for all the seized (n=18), tiger (n=23) and leopard (n=49) claws. Univariate and multivariate statistical analyses were performed using SPSS. A scatter plot generated using canonical discriminant function analysis revealed that of the 18 seized claws, 14 claws formed a cluster separate from the clusters of the tiger and leopard claws, whereas the remaining four claws were within the leopard cluster. Because a discrepancy was observed between the X-ray images and the measurements of these four claws, one of the claws that clustered with the leopard claws was chosen randomly and DNA analysis was carried out using the cyt b (137bp) and 16S rRNA (410bp) genes. A BLAST search and comparison with the reference database at WII indicated that the keratin material of the claw was derived from Bos taurus (cattle).
This is a pioneering discovery, and we suggest that a hierarchical combination of techniques be used for identifying claws involved in wildlife offences, i.e. that an X-ray, morphometric and DNA-based analysis be carried out, to ascertain whether the claws are of tigers or leopards. To identify species in the illegal wildlife trade, morphometric and genetic reference databases should be developed. Morphological features as well as DNA profiles need to be used for better implementation of the Wildlife (Protection) Act, 1972 of India and other laws/treaties in South-east Asia. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
TFBSshape: a motif database for DNA shape features of transcription factor binding sites.

PubMed

Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo

2014-01-01

Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.
TFBSshape: a motif database for DNA shape features of transcription factor binding sites

PubMed Central

Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo

2014-01-01

Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955
Utility of 16S rDNA Sequencing for Identification of Rare Pathogenic Bacteria.

PubMed

Loong, Shih Keng; Khor, Chee Sieng; Jafar, Faizatul Lela; AbuBakar, Sazaly

2016-11-01

Phenotypic identification systems are established methods for laboratory identification of bacteria causing human infections. Here, the utility of phenotypic identification systems was compared against the 16S rDNA identification method on clinical isolates obtained during a 5-year study period, with special emphasis on isolates that gave unsatisfactory identification. One hundred and eighty-seven clinical bacterial isolates were tested with commercial phenotypic identification systems and 16S rDNA sequencing. Isolate identities determined using phenotypic identification systems and 16S rDNA sequencing were compared for similarity at genus and species level, with 16S rDNA sequencing as the reference method. Phenotypic identification systems identified ~46% (86/187) of the isolates with identity similar to that identified using 16S rDNA sequencing. Approximately 39% (73/187) and ~15% (28/187) of the isolates showed different genus identity and could not be identified using the phenotypic identification systems, respectively. Both methods succeeded in determining the species identities of 55 isolates; however, only ~69% (38/55) of the isolates matched at species level. 16S rDNA sequencing could not determine the species of ~20% (37/187) of the isolates. 16S rDNA sequencing is more useful than phenotypic identification systems for identifying rare and difficult-to-identify bacterial species. The method, however, has limitations for species-level identification of some bacteria, highlighting the need for better bacterial pathogen identification tools. © 2016 Wiley Periodicals, Inc.
DNA fingerprinting of Chinese melon provides evidentiary support of seed quality appraisal.

PubMed

Gao, Peng; Ma, Hongyan; Luan, Feishi; Song, Haibin

2012-01-01

Melon, Cucumis melo L., is an important vegetable crop worldwide. At present, homonyms and synonyms occur in the melon seed markets of China, which could cause variety authenticity issues influencing the process of melon breeding, production, marketing and other aspects. Molecular markers, especially microsatellites or simple sequence repeats (SSRs), are playing increasingly important roles in cultivar identification. The aim of this study was to construct a DNA fingerprinting database of major melon cultivars, which could provide a basis for establishing a technical standard system for purity and authenticity identification of melon seeds. In this study, to develop the core set of SSR markers, 470 polymorphic SSRs were selected as candidate markers from 1219 SSRs using 20 representative melon varieties (lines). Eighteen SSR markers, evenly distributed across the genome and with the highest polymorphism information content (PIC), were identified as the core marker set for melon DNA fingerprinting analysis. Fingerprint codes for 471 melon varieties (lines) were established. Fifty-one materials were classified into 17 groups based on sharing the same fingerprint code, and field trait surveys showed that the plants within each group were synonyms because of the same or similar field characters. Furthermore, DNA fingerprinting quick response (QR) codes of the 471 melon varieties (lines) were constructed. Owing to its fast readability and large storage capacity, QR coding of melon DNA fingerprints favors convenient reading and commercial application.
DNA Fingerprinting of Chinese Melon Provides Evidentiary Support of Seed Quality Appraisal

PubMed Central

Gao, Peng; Ma, Hongyan; Luan, Feishi; Song, Haibin

2012-01-01

Melon, Cucumis melo L., is an important vegetable crop worldwide. At present, homonyms and synonyms occur in the melon seed markets of China, which could cause variety authenticity issues influencing the process of melon breeding, production, marketing and other aspects. Molecular markers, especially microsatellites or simple sequence repeats (SSRs), are playing increasingly important roles in cultivar identification. The aim of this study was to construct a DNA fingerprinting database of major melon cultivars, which could provide a basis for establishing a technical standard system for purity and authenticity identification of melon seeds. In this study, to develop the core set of SSR markers, 470 polymorphic SSRs were selected as candidate markers from 1219 SSRs using 20 representative melon varieties (lines). Eighteen SSR markers, evenly distributed across the genome and with the highest polymorphism information content (PIC), were identified as the core marker set for melon DNA fingerprinting analysis. Fingerprint codes for 471 melon varieties (lines) were established. Fifty-one materials were classified into 17 groups based on sharing the same fingerprint code, and field trait surveys showed that the plants within each group were synonyms because of the same or similar field characters. Furthermore, DNA fingerprinting quick response (QR) codes of the 471 melon varieties (lines) were constructed. Owing to its fast readability and large storage capacity, QR coding of melon DNA fingerprints favors convenient reading and commercial application. PMID:23285039
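Two computations mentioned in this abstract lend themselves to a short sketch: ranking markers by polymorphism information content, and grouping varieties that share a fingerprint code. The code below assumes the commonly used Botstein formulation, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2pᵢ²pⱼ², which the abstract does not state explicitly; the function names and data layout are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content of one marker from its allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


def shared_fingerprints(fingerprints):
    """Group variety names that carry an identical fingerprint code.

    `fingerprints` maps a variety name to a sequence of per-marker allele calls.
    Only groups with more than one member (candidate synonyms) are returned.
    """
    groups = defaultdict(list)
    for name, code in fingerprints.items():
        groups[tuple(code)].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)
```

For a marker with two equally frequent alleles, `pic([0.5, 0.5])` gives 0.375, the maximum for a biallelic locus; SSRs with many alleles can score considerably higher, which is one reason they suit cultivar fingerprinting.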
DNA barcoding commercially important fish species of Turkey.

PubMed

Keskın, Emre; Atar, Hasan H

2013-09-01

DNA barcoding was used in the identification of 89 commercially important freshwater and marine fish species found in Turkish ichthyofauna. A total of 1765 DNA barcodes using a 654-bp-long fragment of the mitochondrial cytochrome c oxidase subunit I gene were generated for these species. These species belong to 70 genera, 40 families and 19 orders from class Actinopterygii, and all were associated with a distinct DNA barcode. Nine and 12 of the COI barcode clusters represent the first species records submitted to the BOLD and GenBank databases, respectively. All COI barcodes (except sequences of first species records) were matched with reference sequences of expected species, according to morphological identification. Average nucleotide frequencies of the data set were calculated as T = 29.7%, C = 28.2%, A = 23.6% and G = 18.6%. Average pairwise genetic distances among individuals were estimated as 0.32%, 9.62%, 17.90% and 22.40% for conspecific, congeneric, confamilial and within order, respectively. Kimura 2-parameter genetic distance values were found to increase with taxonomic level. For most of the species analysed in our data set, there is a barcoding gap, and an overlap in the barcoding gap exists for only two genera. Neighbour-joining trees were drawn based on DNA barcodes and all the specimens clustered in agreement with their taxonomic classification at species level. Results of this study supported DNA barcoding as an efficient molecular tool for a better monitoring, conservation and management of fisheries. © 2013 John Wiley & Sons Ltd.
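The Kimura 2-parameter distance and the nucleotide frequencies reported above can be computed from aligned sequences as in the following sketch. It uses the standard K2P formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions; the function names are hypothetical and the input sequences are assumed to be pre-aligned and of equal length.

```python
import math
from collections import Counter

PURINES = {"A", "G"}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Sites with gaps or ambiguity codes in either sequence are ignored.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    transitions = sum(1 for x, y in pairs if x != y and (x in PURINES) == (y in PURINES))
    transversions = sum(1 for x, y in pairs if x != y and (x in PURINES) != (y in PURINES))
    p, q = transitions / n, transversions / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def base_frequencies(sequences):
    """Pooled percentage of each of A, C, G and T across a collection of sequences."""
    counts = Counter(ch for seq in sequences for ch in seq.upper() if ch in "ACGT")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ACGT"}
```

Averaging `k2p_distance` over all pairs within a species, genus, family or order would give summary values of the kind quoted in the abstract, although the sketch does not reproduce the authors' exact software or settings.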
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

PubMed

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Monitoring and Surveillance of Marine Invasive Species in Californian Waters by DNA Barcoding: Methodological and Analytical Solutions

NASA Astrophysics Data System (ADS)

Campbell, T. L.; Geller, J. B.; Heller, P.; Ruiz, G.; Chang, A.; McCann, L.; Ceballos, L.; Marraffini, M.; Ashton, G.; Larson, K.; Havard, S.; Meagher, K.; Wheelock, M.; Drake, C.; Rhett, G.

2016-02-01

The Ballast Water Management Act, the Marine Invasive Species Act, and the Coastal Ecosystem Protection Act require the California Department of Fish and Wildlife to monitor and evaluate the extent of biological invasions in the state's marine and estuarine waters. This has been performed statewide, using a variety of methodologies. Conventional sample collection and processing is laborious, slow and costly, and may require considerable taxonomic expertise requiring detailed time-consuming microscopic study of multiple specimens. These factors limit the volume of biomass that can be searched for introduced species. New technologies continue to reduce the cost and increase the throughput of genetic analyses, which become efficient alternatives to traditional morphological analysis for identification, monitoring and surveillance of marine invasive species. Using next-generation sequencing of mitochondrial Cytochrome c oxidase subunit I (COI) and nuclear large subunit ribosomal RNA (LSU), we analyzed over 15,000 individual marine invertebrates collected in Californian waters. We have created sequence databases of California native and non-native species to assist in molecular identification and surveillance in North American waters. Metagenetics, the next-generation sequencing of environmental samples with comparison to DNA sequence databases, is a faster and cost-effective alternative to individual sample analysis. We have sequenced from biomass collected from whole settlement plates and plankton in California harbors, and used our introduced species database to create species lists. We can combine these species lists for individual marinas with collected environmental data, such as temperature, salinity, and dissolved oxygen to understand the ecology of marine invasions. Here we discuss high throughput sampling, sequencing, and COASTLINE, our data analysis answer to challenges working with hundreds of millions of sequencing reads from tens of thousands of specimens.
Complementary DNA sequencing and identification of mRNAs from the venomous gland of Agkistrodon piscivorus leucostoma.

PubMed

Jia, Ying; Cantu, Bruno A; Sánchez, Elda E; Pérez, John C

2008-06-15

To advance our knowledge of snake venom composition and of the transcripts expressed in the venom gland at the molecular level, we constructed a cDNA library from the venom gland of Agkistrodon piscivorus leucostoma to generate an expressed sequence tag (EST) database. From the 2112 randomly sequenced independent clones, we obtained ESTs for 1309 (62%) cDNAs, which showed significant deduced amino acid sequence similarity (scores >80) to previously characterized proteins in the National Center for Biotechnology Information (NCBI) database. Ribosomal proteins make up 47 clones (2%), and the remaining 756 (36%) cDNAs either are of unknown identity or show BLASTX sequence identity scores of <80 with known GenBank accessions. The most highly expressed gene, encoding phospholipase A(2) (PLA(2)) and accounting for 35% of A. p. leucostoma venom gland cDNAs, was identified and further confirmed by applying crude venom to sodium dodecyl sulfate/polyacrylamide gel electrophoresis (SDS-PAGE) and by protein sequencing. A total of 180 representative genes were obtained from the sequence assemblies and deposited in the EST database. Clones showing sequence identity to disintegrins, thrombin-like enzymes, hemorrhagic toxins, fibrinogen clotting inhibitors and plasminogen activators were also identified in our EST database. These data can be used to develop a research program that will help us identify genes encoding proteins that are of medical importance or proteins involved in the mechanisms of the toxin venom.

[Applications of DNA identification technology in protection of wild animals].

PubMed

Ni, Ping-Ya; Pei, Li; Ge, Wen-Dong; Zhang, Ying; Yang, Xue-Ying; Xu, Xiao-Yu; Tu, Zheng

2011-12-01

With the development of biotechnology, forensic DNA identification technology has been used more and more widely in the protection of wild animals. This review introduces the global status of wildlife crime and the relevant protection of wildlife, and outlines the practical applications of forensic DNA identification technology with regard to species identification, determination of geographic origin, individual identification and paternity identification. It focuses on the techniques commonly used in DNA typing and their merits and demerits, as well as the problems and prospects of forensic DNA technology for wildlife conservation.
Use of DNA Barcodes to Identify Invasive Armyworm Spodoptera Species in Florida

PubMed Central

Nagoshi, Rodney N.; Brambila, Julieta; Meagher, Robert L.

2011-01-01

A critical component for sustaining adequate food production is the protection of local agriculture from invasive pest insects. Essential to this goal is the ability to accurately distinguish foreign from closely related domestic species, a process that has traditionally required identification using diagnostic morphological “keys” that can be both subtle and labor-intensive. This is the case for the Lepidopteran group of insects represented by Spodoptera, a genus of Noctuidae “armyworm” moths that includes several important agricultural pests. Two of the most destructive species, Spodoptera littoralis (Boisduval) (Lepidoptera: Noctuidae) and S. litura (F.) are not yet established in North America. To facilitate the monitoring for these pests, the feasibility of using DNA barcoding methodology for distinguishing between domestic and foreign Spodoptera species was tested. A DNA barcoding database was derived for a subset of Spodoptera species native to Florida, with an emphasis on those attracted to pheromone blends developed for S. litura or S. littoralis. These were then compared to the barcode sequences of S. litura collected from Taiwan and S. littoralis from Portugal. Consistent discrimination of the different species was obtained with phenetic relationships produced that were generally in agreement with phylogenetic studies using morphological characteristics. The data presented here indicate that DNA barcoding has the potential to be an efficient and accurate supplement to morphological methods for the identification of invasive Spodoptera pests in North America. PMID:22239735
Use of DNA barcodes to identify invasive armyworm Spodoptera species in Florida.

PubMed

Nagoshi, Rodney N; Brambila, Julieta; Meagher, Robert L

2011-01-01

A critical component for sustaining adequate food production is the protection of local agriculture from invasive pest insects. Essential to this goal is the ability to accurately distinguish foreign from closely related domestic species, a process that has traditionally required identification using diagnostic morphological "keys" that can be both subtle and labor-intensive. This is the case for the Lepidopteran group of insects represented by Spodoptera, a genus of Noctuidae "armyworm" moths that includes several important agricultural pests. Two of the most destructive species, Spodoptera littoralis (Boisduval) (Lepidoptera: Noctuidae) and S. litura (F.) are not yet established in North America. To facilitate the monitoring for these pests, the feasibility of using DNA barcoding methodology for distinguishing between domestic and foreign Spodoptera species was tested. A DNA barcoding database was derived for a subset of Spodoptera species native to Florida, with an emphasis on those attracted to pheromone blends developed for S. litura or S. littoralis. These were then compared to the barcode sequences of S. litura collected from Taiwan and S. littoralis from Portugal. Consistent discrimination of the different species was obtained with phenetic relationships produced that were generally in agreement with phylogenetic studies using morphological characteristics. The data presented here indicate that DNA barcoding has the potential to be an efficient and accurate supplement to morphological methods for the identification of invasive Spodoptera pests in North America.
Molecular Identification of Commercialized Medicinal Plants in Southern Morocco

PubMed Central

Krüger, Åsa; Rydberg, Anders; Abbad, Abdelaziz; Björk, Lars; Martin, Gary

2012-01-01

Background: Medicinal plant trade is important for local livelihoods. However, many medicinal plants are difficult to identify when they are sold as roots, powders or bark. DNA barcoding involves using a short, agreed-upon region of a genome as a unique identifier for species, ideally as a global standard. Research Question: What is the functionality, efficacy and accuracy of the use of barcoding for identifying root material, using medicinal plant roots sold by herbalists in Marrakech, Morocco, as a test dataset? Methodology: In total, 111 root samples were sequenced for four proposed barcode regions: rpoC1, psbA-trnH, matK and ITS. Sequences were searched against a tailored reference database of Moroccan medicinal plants and their closest relatives using BLAST and Blastclust, and through inference of RAxML phylograms of the aligned market and reference samples. Principal Findings: Sequencing success was high for rpoC1, psbA-trnH, and ITS, but low for matK. Searches using rpoC1 alone resulted in a number of ambiguous identifications, indicating insufficient DNA variation for accurate species-level identification. Combining rpoC1, psbA-trnH and ITS allowed the majority of the market samples to be identified to genus level. For a minority of the market samples, the barcoding identification differed significantly from previous hypotheses based on the vernacular names. Conclusions/Significance: Endemic plant species are commercialized in Marrakech. Adulteration is common and this may indicate that the products are becoming locally endangered. Nevertheless, the majority of the traded roots belong to species that are common and not known to be endangered. A significant conclusion from our results is that unknown samples are more difficult to identify than earlier suggested, especially if the reference sequences were obtained from different populations.
A global barcoding database should therefore contain sequences from different populations of the same species to assure the reference sequences characterize the species throughout its distributional range. PMID:22761800
DNA barcoding the native flowering plants and conifers of Wales.

PubMed

de Vere, Natasha; Rich, Tim C G; Ford, Col R; Trinder, Sarah A; Long, Charlotte; Moore, Chris W; Satterthwaite, Danielle; Davies, Helena; Allainguillaume, Joel; Ronca, Sandra; Tatarinova, Tatiana; Garbett, Hannah; Walker, Kevin; Wilkinson, Mike J

2012-01-01

We present the first national DNA barcode resource that covers the native flowering plants and conifers for the nation of Wales (1143 species). Using the plant DNA barcode markers rbcL and matK, we have assembled 97.7% coverage for rbcL, 90.2% for matK, and a dual-locus barcode for 89.7% of the native Welsh flora. We have sampled multiple individuals for each species, resulting in 3304 rbcL and 2419 matK sequences. The majority of our samples (85%) are from DNA extracted from herbarium specimens. Recoverability of DNA barcodes is lower using herbarium specimens, compared to freshly collected material, mostly due to lower amplification success, but this is balanced by the increased efficiency of sampling species that have already been collected, identified, and verified by taxonomic experts. The effectiveness of the DNA barcodes for identification (level of discrimination) is assessed using four approaches: the presence of a barcode gap (using pairwise and multiple alignments), formation of monophyletic groups using Neighbour-Joining trees, and sequence similarity in BLASTn searches. These approaches yield similar results, providing relative discrimination levels of 69.4 to 74.9% of all species and 98.6 to 99.8% of genera using both markers. Species discrimination can be further improved using spatially explicit sampling. Mean species discrimination using barcode gap analysis (with a multiple alignment) is 81.6% within 10×10 km squares and 93.3% for 2×2 km squares. Our database of DNA barcodes for Welsh native flowering plants and conifers represents the most complete coverage of any national flora, and offers a valuable platform for a wide range of applications that require accurate species identification.
DNA Barcoding the Native Flowering Plants and Conifers of Wales

PubMed Central

de Vere, Natasha; Rich, Tim C. G.; Ford, Col R.; Trinder, Sarah A.; Long, Charlotte; Moore, Chris W.; Satterthwaite, Danielle; Davies, Helena; Allainguillaume, Joel; Ronca, Sandra; Tatarinova, Tatiana; Garbett, Hannah; Walker, Kevin; Wilkinson, Mike J.

2012-01-01

We present the first national DNA barcode resource that covers the native flowering plants and conifers for the nation of Wales (1143 species). Using the plant DNA barcode markers rbcL and matK, we have assembled 97.7% coverage for rbcL, 90.2% for matK, and a dual-locus barcode for 89.7% of the native Welsh flora. We have sampled multiple individuals for each species, resulting in 3304 rbcL and 2419 matK sequences. The majority of our samples (85%) are from DNA extracted from herbarium specimens. Recoverability of DNA barcodes is lower using herbarium specimens, compared to freshly collected material, mostly due to lower amplification success, but this is balanced by the increased efficiency of sampling species that have already been collected, identified, and verified by taxonomic experts. The effectiveness of the DNA barcodes for identification (level of discrimination) is assessed using four approaches: the presence of a barcode gap (using pairwise and multiple alignments), formation of monophyletic groups using Neighbour-Joining trees, and sequence similarity in BLASTn searches. These approaches yield similar results, providing relative discrimination levels of 69.4 to 74.9% of all species and 98.6 to 99.8% of genera using both markers. Species discrimination can be further improved using spatially explicit sampling. Mean species discrimination using barcode gap analysis (with a multiple alignment) is 81.6% within 10×10 km squares and 93.3% for 2×2 km squares. Our database of DNA barcodes for Welsh native flowering plants and conifers represents the most complete coverage of any national flora, and offers a valuable platform for a wide range of applications that require accurate species identification. PMID:22701588
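The barcode gap criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes pre-aligned sequences of equal length and uses an uncorrected p-distance; the function names and the toy data are hypothetical and are not taken from the study, which may have used a different distance measure.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def barcode_gap(records):
    """records: list of (species, aligned_sequence) tuples.

    Returns {species: (max_intraspecific, min_interspecific, has_gap)}.
    A species shows a barcode gap when its closest non-conspecific sequence
    is further away than its most divergent conspecific pair.
    """
    result = {}
    for sp in {s for s, _ in records}:
        own = [seq for s, seq in records if s == sp]
        others = [seq for s, seq in records if s != sp]
        # Singletons have no conspecific pair, so their spread defaults to 0.
        max_intra = max(
            (p_distance(a, b) for a, b in combinations(own, 2)), default=0.0
        )
        min_inter = min(
            (p_distance(a, b) for a in own for b in others), default=None
        )
        has_gap = min_inter is not None and min_inter > max_intra
        result[sp] = (max_intra, min_inter, has_gap)
    return result


def discrimination_rate(records):
    """Fraction of species that show a barcode gap."""
    gaps = barcode_gap(records)
    return sum(has_gap for _, _, has_gap in gaps.values()) / len(gaps)


# Toy alignment (hypothetical data): species C shares a haplotype with A.
toy = [
    ("A", "ACGTACGT"),
    ("A", "ACGTACGA"),
    ("B", "TTGTACGT"),
    ("B", "TTGTACGT"),
    ("C", "ACGTACGA"),
]
```

In this toy set only species B is separated from all others, so `discrimination_rate(toy)` is one third. Restricting `records` to samples from one grid square before calling the function would mimic the spatially explicit analysis described above.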
PFR²: a curated database of planktonic foraminifera 18S ribosomal DNA as a resource for studies of plankton ecology, biogeography and evolution.

PubMed

Morard, Raphaël; Darling, Kate F; Mahé, Frédéric; Audic, Stéphane; Ujiié, Yurika; Weiner, Agnes K M; André, Aurore; Seears, Heidi A; Wade, Christopher M; Quillévéré, Frédéric; Douady, Christophe J; Escarguel, Gilles; de Garidel-Thoron, Thibault; Siccha, Michael; Kucera, Michal; de Vargas, Colomban

2015-11-01

Planktonic foraminifera (Rhizaria) are ubiquitous marine pelagic protists producing calcareous shells with conspicuous morphology. They play an important role in the marine carbon cycle, and their exceptional fossil record serves as the basis for biochronostratigraphy and past climate reconstructions. A major worldwide sampling effort over the last two decades has resulted in the establishment of multiple large collections of cryopreserved individual planktonic foraminifera samples. Thousands of 18S rDNA partial sequences have been generated, representing all major known morphological taxa across their worldwide oceanic range. This comprehensive data coverage provides an opportunity to assess patterns of molecular ecology and evolution in a holistic way for an entire group of planktonic protists. We combined all available published and unpublished genetic data to build PFR(2), the Planktonic foraminifera Ribosomal Reference database. The first version of the database includes 3322 reference 18S rDNA sequences belonging to 32 of the 47 known morphospecies of extant planktonic foraminifera, collected from 460 oceanic stations. All sequences have been rigorously taxonomically curated using a six-rank annotation system fully resolved to the morphological species level and linked to a series of metadata. The PFR(2) website, available at http://pfr2.sb-roscoff.fr, allows downloading the entire database or specific sections, as well as the identification of new planktonic foraminiferal sequences. Its novel, fully documented curation process integrates advances in morphological and molecular taxonomy. It allows for an increase in its taxonomic resolution and assures that integrity is maintained by including a complete contingency tracking of annotations and assuring that the annotations remain internally consistent. © 2015 John Wiley & Sons Ltd.
DNA Barcoding Reveals Limited Accuracy of Identifications Based on Folk Taxonomy

PubMed Central

Martin, Gary; Abbad, Abdelaziz; Kool, Anneleen

2014-01-01

Background The trade of plant roots as traditional medicine is an important source of income for many people around the world. Destructive harvesting practices threaten the existence of some plant species. Harvesters of medicinal roots identify the collected species according to their own folk taxonomies, but once the dried or powdered roots enter the chain of commercialization, accurate identification becomes more challenging. Methodology A survey of morphological diversity among four root products traded in the medina of Marrakech was conducted. Fifty-one root samples were selected for molecular identification using DNA barcoding using three markers, trnH-psbA, rpoC1, and ITS. Sequences were searched using BLAST against a tailored reference database of Moroccan medicinal plants and their closest relatives submitted to NCBI GenBank. Principal Findings Combining psbA-trnH, rpoC1, and ITS allowed the majority of the market samples to be identified to species level. Few of the species level barcoding identifications matched the scientific names given in the literature, including the most authoritative and widely cited pharmacopeia. Conclusions/Significance The four root complexes selected from the medicinal plant products traded in Marrakech all comprise more than one species, but not those previously asserted. The findings have major implications for the monitoring of trade in endangered plant species as morphology-based species identifications alone may not be accurate. As a result, trade in certain species may be overestimated, whereas the commercialization of other species may not be recorded at all. PMID:24416210
TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

PubMed

Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

2018-04-11

Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
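The kind of tunable filtering this abstract describes (island length, SNP count, population frequency, low-variability flanks) might be sketched as a simple window scan. This is an illustration only and not the published TIA implementation: the `Snp` record, the function name, and every default threshold below are assumptions made for the example.

```python
from bisect import bisect_left
from typing import NamedTuple


class Snp(NamedTuple):
    pos: int    # chromosome coordinate
    maf: float  # minor allele frequency in the reference population


def find_snp_islands(snps, island_len=150, flank_len=25,
                     min_maf=0.2, min_snps=3, max_flank_snps=0):
    """Return (start, end, n_informative) for candidate islands.

    A window [start, end) qualifies when it holds at least `min_snps` SNPs
    with MAF >= `min_maf`, and the flanks on each side (where primers would
    sit) contain at most `max_flank_snps` SNPs of any frequency in total.
    All thresholds are hypothetical, user-tunable values.
    """
    snps = sorted(snps)
    positions = [s.pos for s in snps]

    def in_range(lo, hi):
        # All SNPs with lo <= pos < hi, found by binary search.
        return snps[bisect_left(positions, lo):bisect_left(positions, hi)]

    islands = []
    for s in snps:
        if s.maf < min_maf:
            continue  # only anchor windows on informative SNPs
        start, end = s.pos, s.pos + island_len
        informative = sum(x.maf >= min_maf for x in in_range(start, end))
        flank_hits = (len(in_range(start - flank_len, start))
                      + len(in_range(end, end + flank_len)))
        if informative >= min_snps and flank_hits <= max_flank_snps:
            islands.append((start, end, informative))
    return islands


# Hypothetical input: three common SNPs close together, one isolated SNP.
example = [Snp(100, 0.30), Snp(150, 0.40), Snp(200, 0.25), Snp(1000, 0.50)]
```

With the defaults, `find_snp_islands(example)` reports the single window starting at position 100. Because windows are anchored on every informative SNP, overlapping candidates can be reported and would need merging in a fuller treatment.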
DNA barcodes for bio-surveillance: regulated and economically important arthropod plant pests.

PubMed

Ashfaq, Muhammad; Hebert, Paul D N

2016-11-01

Many of the arthropod species that are important pests of agriculture and forestry are impossible to discriminate morphologically throughout all of their life stages. Some cannot be differentiated at any life stage. Over the past decade, DNA barcoding has gained increasing adoption as a tool to both identify known species and to reveal cryptic taxa. Although there has not been a focused effort to develop a barcode library for them, reference sequences are now available for 77% of the 409 species of arthropods documented on major pest databases. Aside from developing the reference library needed to guide specimen identifications, past barcode studies have revealed that a significant fraction of arthropod pests are a complex of allied taxa. Because of their importance as pests and disease vectors impacting global agriculture and forestry, DNA barcode results on these arthropods have significant implications for quarantine detection, regulation, and management. The current review discusses these implications in light of the presence of cryptic species in plant pests exposed by DNA barcoding.
Molecular application for identification of polycyclic aromatic hydrocarbons degrading bacteria (PAHD) species isolated from oil polluted soil in Dammam, Saudi Arabia.

PubMed

Ibrahim, Mohamed M; Al-Turki, Ameena; Al-Sewedi, Dona; Arif, Ibrahim A; El-Gaaly, Gehan A

2015-09-01

Soil contamination with petroleum hydrocarbon products such as diesel and engine oil is becoming one of the major environmental problems. This study describes polycyclic aromatic hydrocarbon-degrading bacteria (PAHD) isolated from long-standing petrol polluted soil from the eastern region, Dammam, Saudi Arabia. The isolated strains were first categorized by morphology and by physiological and biochemical tests. Thereafter, a technique based on sequence analysis of the 16S rDNA gene was used. DNA was isolated from the bacterial strains and used as the template for PCR. Strains were identified by 16S rDNA sequence analysis: the amplified samples were sequenced automatically and the results were matched against open databases. Among the isolated bacterial strains, S1 was identified as Staphylococcus aureus and strain S1 as Corynebacterium amycolatum.
Identification of DNA primase inhibitors via a combined fragment-based and virtual screening

NASA Astrophysics Data System (ADS)

Ilic, Stefan; Akabayov, Sabine R.; Arthanari, Haribabu; Wagner, Gerhard; Richardson, Charles C.; Akabayov, Barak

2016-11-01

The structural differences between bacterial and human primases render the former an excellent target for drug design. Here we describe a technique for selecting small molecule inhibitors of the activity of T7 DNA primase, an ideal model for bacterial primases due to their common structural and functional features. Using NMR screening, fragment molecules that bind T7 primase were identified and then exploited in virtual filtration to select larger molecules from the ZINC database. The molecules were docked to the primase active site using the available primase crystal structure and ranked based on their predicted binding energies to identify the best candidates for functional and structural investigations. Biochemical assays revealed that some of the molecules inhibit T7 primase-dependent DNA replication. The binding mechanism was delineated via NMR spectroscopy. Our approach, which combines fragment based and virtual screening, is rapid and cost effective and can be applied to other targets.
[Identification of genes that are specifically/preferentially expressed in developing cotton fibers by mRNA fluorescence differential display (FDD)].

PubMed

Sun, Jie; Li, Yuan-Li; Wang, Ruo-Hai; Xia, Gui-Xian

2004-01-01

Fluorescence differential display (FDD) technique was used to identify genes that are specifically or preferentially expressed in different developmental stages of cotton fiber cells. One hundred and nine differentially displayed cDNA fragments were isolated using 9, 21 and 27 DPA (days postanthesis) fibers as experimental materials. By a combination of two rounds of reverse Northern hybridization and Northern blot analyses, a number of such cDNA fragments were proved to represent fiber-specific/preferential genes. Sequencing determination and database searching indicated that most of these genes are novel. This work is an important step towards cloning the full-length cDNAs and characterizing the cellular functions of aforementioned genes in fiber development.
9 CFR 55.25 - Animal identification.

Code of Federal Regulations, 2014 CFR

2014-01-01

... CWD National Database or in an approved State database. The second animal identification must be... CWD National Database or in an approved State database. The means of animal identification must be...
9 CFR 55.25 - Animal identification.

Code of Federal Regulations, 2013 CFR

2013-01-01

... CWD National Database or in an approved State database. The second animal identification must be... CWD National Database or in an approved State database. The means of animal identification must be...
TCOF1 mutation database: novel mutation in the alternatively spliced exon 6A and update in mutation nomenclature.

PubMed

Splendore, Alessandra; Fanganiello, Roberto D; Masotti, Cibele; Morganti, Lucas S C; Passos-Bueno, M Rita

2005-05-01

Recently, a novel exon was described in TCOF1 that, although alternatively spliced, is included in the major protein isoform. In addition, most published mutations in this gene do not conform to current mutation nomenclature guidelines. Given these observations, we developed an online database of TCOF1 mutations in which all the reported mutations are renamed according to standard recommendations and in reference to the genomic and novel cDNA reference sequences (www.genoma.ib.usp.br/TCOF1_database). We also report in this work: 1) results of the first screening for large deletions in TCOF1 by Southern blot in patients without mutation detected by direct sequencing; 2) the identification of the first pathogenic mutation in the newly described exon 6A; and 3) statistical analysis of pathogenic mutations and polymorphism distribution throughout the gene.
RatMap--rat genome tools and data.

PubMed

Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M; Ståhl, Fredrik

2005-01-01

The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB-Genetics at Goteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided.
RatMap—rat genome tools and data

PubMed Central

Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M.; Ståhl, Fredrik

2005-01-01

The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB–Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided. PMID:15608244
Cloning and identification of a cDNA that encodes a novel human protein with thrombospondin type I repeat domain, hPWTSR.

PubMed

Chen, Jin-Zhong; Wang, Shu; Tang, Rong; Yang, Quan-Sheng; Zhao, Enpeng; Chao, Yaoqiong; Ying, Kang; Xie, Yi; Mao, Yu-Min

2002-09-01

A cDNA was isolated from the fetal brain cDNA library by high throughput cDNA sequencing. The 2390 bp cDNA, with an open reading frame (ORF) of 816 bp, encodes a putative protein of 272 amino acids with a thrombospondin type I repeat (TSR) domain and a cysteine-rich region at the N-terminus, and was therefore named hPWTSR. Northern blotting detected two bands of about 3 kb and 4 kb, respectively, which were expressed in human adult tissues with different intensities. The expression pattern was verified by RT-PCR, which revealed that the transcripts were also expressed ubiquitously in fetal tissues and human tumor tissues. However, the transcript was detected neither in ovarian carcinoma GI-102 nor in lung carcinoma LX-1. BLAST analysis against the NCBI database revealed that the new gene contains at least 5 exons and is located on human chromosome 6q22.33. Our results demonstrate that the gene is a novel member of the TSR supergene family.
Half of the European fruit fly species barcoded (Diptera, Tephritidae); a feasibility test for molecular identification

PubMed Central

Smit, John; Reijnen, Bastian; Stokvis, Frank

2013-01-01

Abstract A feasibility test of molecular identification of European fruit flies (Diptera: Tephritidae) based on COI barcode sequences has been executed. A dataset containing 555 sequences of 135 ingroup species from three subfamilies and 42 genera and one single outgroup species has been analysed. 73.3% of all included species could be identified based on their COI barcode gene, based on similarity and distances. The low success rate is caused by singletons as well as some problematic groups: several species groups within the genus Terellia and especially the genus Urophora. With slightly more than 100 sequences – almost 20% of the total – this genus alone constitutes the larger part of the failure for molecular identification for this dataset. Deleting the singletons and Urophora results in a success-rate of 87.1% of all queries and 93.23% of the not discarded queries as correctly identified. Urophora is of special interest due to its economic importance as beneficial species for weed control, therefore it is desirable to have alternative markers for molecular identification. We demonstrate that the success of DNA barcoding for identification purposes strongly depends on the contents of the database used to BLAST against. Especially the necessity of including multiple specimens per species of geographically distinct populations and different ecologies for the understanding of the intra- versus interspecific variation is demonstrated. Furthermore thresholds and the distinction between true and false positives and negatives should not only be used to increase the reliability of the success of molecular identification but also to point out problematic groups, which should then be flagged in the reference database suggesting alternative methods for identification. PMID:24453563

Identification of bacteria isolated from veterinary clinical specimens using MALDI-TOF MS.

PubMed

Pavlovic, Melanie; Wudy, Corinna; Zeller-Peronnet, Veronique; Maggipinto, Marzena; Zimmermann, Pia; Straubinger, Alix; Iwobi, Azuka; Märtlbauer, Erwin; Busch, Ulrich; Huber, Ingrid

2015-01-01

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has recently emerged as a rapid and accurate identification method for bacterial species. Although it has been successfully applied for the identification of human pathogens, it has so far not been well evaluated for routine identification of veterinary bacterial isolates. This study was performed to compare and evaluate the performance of MALDI-TOF MS based identification of veterinary bacterial isolates with commercially available conventional test systems. Discrepancies of both methods were resolved by sequencing 16S rDNA and, if necessary, the infB gene for Actinobacillus isolates. A total of 375 consecutively isolated veterinary samples were collected. Among the 357 isolates (95.2%) correctly identified at the genus level by MALDI-TOF MS, 338 of them (90.1% of the total isolates) were also correctly identified at the species level. Conventional methods offered correct species identification for 319 isolates (85.1%). MALDI-TOF identification therefore offered more accurate identification of veterinary bacterial isolates. An update of the in-house mass spectra database with additional reference spectra clearly improved the identification results. In conclusion, the presented data suggest that MALDI-TOF MS is an appropriate platform for classification and identification of veterinary bacterial isolates.
Update of the Diatom EST Database: a new tool for digital transcriptomics

PubMed Central

Maheswari, Uma; Mock, Thomas; Armbrust, E. Virginia; Bowler, Chris

2009-01-01

The Diatom Expressed Sequence Tag (EST) Database was constructed to provide integral access to ESTs from these ecologically and evolutionarily interesting microalgae. It has now been updated with 130 000 Phaeodactylum tricornutum ESTs from 16 cDNA libraries and 77 000 Thalassiosira pseudonana ESTs from seven libraries, derived from cells grown in different nutrient and stress regimes. The updated relational database incorporates results from statistical analyses such as log-likelihood ratios and hierarchical clustering, which help to identify differentially expressed genes under different conditions, and allow similarities in gene expression in different libraries to be investigated in a functional context. The database also incorporates links to the recently sequenced genomes of P. tricornutum and T. pseudonana, enabling an easy cross-talk between the expression pattern of diatom orthologs and the genome browsers. These improvements will facilitate exploration of diatom responses to conditions of ecological relevance and will aid gene function identification of diatom-specific genes and in silico gene prediction in this largely unexplored class of eukaryotes. The updated Diatom EST Database is available at http://www.biologie.ens.fr/diatomics/EST3. PMID:19029140
Towards computational improvement of DNA database indexing and short DNA query searching.

PubMed

Stojanov, Done; Koceski, Sašo; Mileva, Aleksandra; Koceska, Nataša; Bande, Cveta Martinovska

2014-09-03

In order to facilitate and speed up the search of massive DNA databases, the database is indexed at the beginning, employing a mapping function. By searching through the indexed data structure, exact query hits can be identified. If the database is searched against an annotated DNA query, such as a known promoter consensus sequence, then the starting locations and the number of potential genes can be determined. This is particularly relevant if unannotated DNA sequences have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure with millions of entries is a time-demanding process. In this paper, we propose a fast DNA database indexing and searching approach, identifying all query hits in the database, without having to examine all entries in the indexed data structure, limiting the maximum length of a query that can be searched against the database. By applying the proposed indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, under the assumption that there is enough RAM to store the indexed data structure. Analysing the methodology proposed by Reneker, we observed that hits at starting positions [Formula: see text] are not reported, if the database is searched against a query shorter than [Formula: see text] nucleotides, such that [Formula: see text] is the length of the DNA database words being mapped and [Formula: see text] is the length of the query. A solution of this drawback is also presented.
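The general idea of indexing a DNA database with a mapping function and then looking up exact query hits can be sketched as follows. This is a generic k-mer index using a two-bit-per-base mapping, offered as an assumption-laden illustration; it is not the indexing equation proposed in the paper, and the function names are hypothetical. Queries shorter than the indexed word length are deliberately rejected rather than handled.

```python
from collections import defaultdict

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(word):
    """Map a DNA word over A, C, G, T to an integer, two bits per base."""
    value = 0
    for base in word:
        value = (value << 2) | CODE[base]
    return value


def build_index(db, k):
    """Index every k-letter word of `db` by its integer code.

    Words containing symbols other than A, C, G, T (for example N) are
    skipped. Returns {code: [start positions]}.
    """
    index = defaultdict(list)
    for i in range(len(db) - k + 1):
        word = db[i:i + k]
        if all(base in CODE for base in word):
            index[encode(word)].append(i)
    return index


def search(db, index, k, query):
    """Return all start positions of exact matches of `query` in `db`.

    The first k bases act as a seed that is looked up in the index; each
    candidate is then verified against the full query. The query is assumed
    to contain only A, C, G, T and to be at least k bases long.
    """
    if len(query) < k:
        raise ValueError("query shorter than the indexed word length")
    seed = encode(query[:k])
    return [i for i in index.get(seed, []) if db.startswith(query, i)]


# Hypothetical toy database containing a TATAAT-like motif twice.
db = "TATAATGCGTATAATCC"
index = build_index(db, 4)
hits = search(db, index, 4, "TATAAT")
```

Only positions listed under the seed's code are examined, so the search never walks the whole index; the trade-off is the lower bound on query length imposed by `k`.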
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE Office of Scientific and Technical Information (OSTI.GOV)

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE PAGES

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
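For readers unfamiliar with position weight matrices, the following is a minimal sketch of how a log-odds PWM can be derived from aligned binding sites and used to score a candidate site. It is not the database's own code: the function names, the example sites, the pseudocount, and the uniform 0.25 background are all illustrative assumptions.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned sites.

    pseudocount and the uniform background are assumptions chosen
    for illustration, not values taken from the abstract above.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        column = {
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        }
        pwm.append(column)
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds for a sequence of matching length."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, seq))

# Hypothetical aligned binding sites, used only to exercise the functions.
example_sites = ["TTGACA", "TTGACA", "TTGATA", "TTTACA"]
example_pwm = build_pwm(example_sites)
```

A sequence close to the consensus of the example sites should score higher than an unrelated one, for instance `score(example_pwm, "TTGACA")` versus `score(example_pwm, "CCCCCC")`.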
Search for novel remedies to augment radiation resistance of inhabitants of Fukushima and Chernobyl disasters: identifying DNA repair protein XRCC4 inhibitors.

PubMed

Sun, Mao-Feng; Chen, Hsin-Yi; Tsai, Fuu-Jen; Lui, Shu-Hui; Chen, Chih-Yi; Chen, Calvin Yu-Chian

2011-10-01

Two nuclear plant disasters occurring within a span of 25 years threaten health and genome integrity both in Fukushima and Chernobyl. The search for remedies capable of enhancing DNA repair efficiency and radiation resistance in humans appears to be an urgent problem. XRCC4 is an important enhancer in promoting the repair pathway triggered by DNA double-strand breaks (DSBs). In the context of radiation therapy, active XRCC4 could reduce the DSB-mediated apoptotic effect on cancer cells. Hence, developing XRCC4 inhibitors could possibly enhance radiotherapy outcomes. In this study, we screened a traditional Chinese medicine (TCM) database, TCM Database@Taiwan, and identified three potent inhibitor agents against XRCC4. Through molecular dynamics simulation, we determined that the protein-ligand interactions were focused at Lys188 on chain A and Lys187 on chain B. Intriguingly, the hydrogen bonds for all three ligands fluctuated frequently but were held at close approximation. The pi-cation interactions and ionic interactions mediated by o-hydroxyphenyl and carboxyl functional groups, respectively, were demonstrated to play critical roles in stabilizing binding conformations. Based on these results, we report the identification of potential radiotherapy enhancers from TCM. We further characterized the key binding elements for inhibiting XRCC4 activity.
Zooplankton community analysis in the Changjiang River estuary by single-gene-targeted metagenomics

NASA Astrophysics Data System (ADS)

Cheng, Fangping; Wang, Minxiao; Li, Chaolun; Sun, Song

2014-07-01

DNA barcoding provides accurate identification of zooplankton species through all life stages. Single-gene-targeted metagenomic analysis based on DNA barcode databases can facilitate long-term monitoring of zooplankton communities. With the help of the available zooplankton databases, the zooplankton community of the Changjiang (Yangtze) River estuary was studied using a single-gene-targeted metagenomic method to estimate the species richness of this community. A total of 856 mitochondrial cytochrome oxidase subunit 1 (cox1) gene sequences were determined. The environmental barcodes were clustered into 70 molecular operational taxonomic units (MOTUs). Forty-two MOTUs matched barcoded marine organisms with more than 90% similarity and were assigned to either the species level (similarity > 96%) or the genus level (similarity < 96%). Sibling species could also be distinguished. Many species that were overlooked by morphological methods were identified by molecular methods, especially gelatinous zooplankton and merozooplankton that were likely sampled at different life history phases. Zooplankton community structures differed significantly among all of the samples. The MOTU spatial distributions were influenced by the ecological habits of the corresponding species. In conclusion, single-gene-targeted metagenomic analysis is a useful tool for zooplankton studies, with which specimens from all life history stages can be identified quickly and effectively with a comprehensive database.
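The similarity cut-offs reported in this abstract can be expressed as a small decision rule. The sketch below is only an illustration of those thresholds: the function name is hypothetical, and because the abstract does not say how a similarity of exactly 96% was handled, the code assumes it falls to the genus level.

```python
def assign_rank(similarity_pct):
    """Map a best-hit similarity (in percent) to a taxonomic rank.

    Thresholds follow the abstract: more than 90% counts as a match,
    and more than 96% as a species-level match. Treating exactly 96%
    as genus level is an assumption, since the abstract leaves it open.
    """
    if similarity_pct > 96.0:
        return "species"
    if similarity_pct > 90.0:
        return "genus"
    return "unassigned"

# Hypothetical best-hit similarities for three MOTUs.
example_ranks = [assign_rank(s) for s in (98.5, 93.0, 85.0)]
```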
Nucleotide sequencing and identification of some wild mushrooms.

PubMed

Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

2013-01-01

The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from the Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of the mushrooms. The sequences were aligned using the ClustalW software program. The aligned sequences revealed identity (homology percentage from the GenBank database) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although only 4 of the 8 mushrooms could be identified to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree was constructed using the Neighbor-Joining method, showing the interrelationships among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching the GenBank database, aiding molecular taxonomy and facilitating their domestication and characterization for human benefit.
Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

PubMed

Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

2015-10-19

Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes are linked to physiological processes, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.
False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing

PubMed Central

2014-01-01

Background: Identification of historic pathogens is challenging since false positives and negatives are a serious risk. Environmental non-pathogenic contaminants are ubiquitous. Furthermore, public genetic databases contain limited information regarding these species. High-throughput sequencing may help reliably detect and identify historic pathogens. Results: We shotgun-sequenced eight 16th-century Mixtec individuals from the site of Teposcolula Yucundaa (Oaxaca, Mexico) who are reported to have died from the huey cocoliztli (‘Great Pestilence’ in Nahuatl), an unknown disease that decimated native Mexican populations during the Spanish colonial period, in order to identify the pathogen. Comparison of these sequences with those deriving from the surrounding soil and from four precontact individuals from the site found a wide variety of contaminant organisms that confounded analyses. Without the comparative sequence data from the precontact individuals and soil, false positives for Yersinia pestis and rickettsiosis could have been reported. Conclusions: False positives and negatives remain problematic in ancient DNA analyses despite the application of high-throughput sequencing. Our results suggest that several studies claiming the discovery of ancient pathogens may need further verification. Additionally, true single molecule sequencing’s short read lengths, inability to sequence through DNA lesions, and limited ancient-DNA-specific technical development hinder its application to palaeopathology. PMID:24568097
Species From Feces: Order-Wide Identification of Chiroptera From Guano and Other Non-Invasive Genetic Samples

PubMed Central

Williamson, Charles H. D.; Sanchez, Daniel E.; Sobek, Colin J.; Chambers, Carol L.

2016-01-01

Bat guano is a relatively untapped reservoir of information, having great utility as a DNA source because it is often available at roosts even when bats are not and is an easy type of sample to collect from a difficult-to-study mammalian order. Recent advances from microbial community studies in primer design, sequencing, and analysis enable fast, accurate, and cost-effective species identification. Here, we borrow from this discipline to develop an order-wide DNA mini-barcode assay (Species from Feces) based on a segment of the mitochondrial gene cytochrome c oxidase I (COI). The assay works effectively with fecal DNA and is conveniently transferable to low-cost, high-throughput Illumina MiSeq technology that also allows simultaneous pairing with other markers. Our PCR primers target a region of COI that is highly discriminatory among Chiroptera (92% species-level identification of barcoded species), and are sufficiently degenerate to allow hybridization across diverse bat taxa. We successfully validated our system with 54 bat species across both suborders. Despite abundant arthropod prey DNA in guano, our primers were highly specific to bats; no arthropod DNA was detected in thousands of feces run on Sanger and Illumina platforms. The assay is extendable to fecal pellets of unknown age as well as individual and pooled guano, to allow for individual (using singular fecal pellets) and community (using combined pellets collected from across long-term roost sites) analyses. We developed a searchable database (http://nau.edu/CEFNS/Forestry/Research/Bats/Search-Tool/) that allows users to determine the discriminatory capability of our markers for bat species of interest. Our assay has applications worldwide for examining disease impacts on vulnerable species, determining species assemblages within roosts, and assessing the presence of bat species that are vulnerable or facing extinction. 
The development and analytical pathways are rapid, reliable, and inexpensive, and can be applied to ecology and conservation studies of other taxa. PMID:27654850
Species From Feces: Order-Wide Identification of Chiroptera From Guano and Other Non-Invasive Genetic Samples.

PubMed

Walker, Faith M; Williamson, Charles H D; Sanchez, Daniel E; Sobek, Colin J; Chambers, Carol L

Bat guano is a relatively untapped reservoir of information, having great utility as a DNA source because it is often available at roosts even when bats are not and is an easy type of sample to collect from a difficult-to-study mammalian order. Recent advances from microbial community studies in primer design, sequencing, and analysis enable fast, accurate, and cost-effective species identification. Here, we borrow from this discipline to develop an order-wide DNA mini-barcode assay (Species from Feces) based on a segment of the mitochondrial gene cytochrome c oxidase I (COI). The assay works effectively with fecal DNA and is conveniently transferable to low-cost, high-throughput Illumina MiSeq technology that also allows simultaneous pairing with other markers. Our PCR primers target a region of COI that is highly discriminatory among Chiroptera (92% species-level identification of barcoded species), and are sufficiently degenerate to allow hybridization across diverse bat taxa. We successfully validated our system with 54 bat species across both suborders. Despite abundant arthropod prey DNA in guano, our primers were highly specific to bats; no arthropod DNA was detected in thousands of feces run on Sanger and Illumina platforms. The assay is extendable to fecal pellets of unknown age as well as individual and pooled guano, to allow for individual (using singular fecal pellets) and community (using combined pellets collected from across long-term roost sites) analyses. We developed a searchable database (http://nau.edu/CEFNS/Forestry/Research/Bats/Search-Tool/) that allows users to determine the discriminatory capability of our markers for bat species of interest. Our assay has applications worldwide for examining disease impacts on vulnerable species, determining species assemblages within roosts, and assessing the presence of bat species that are vulnerable or facing extinction. 
The development and analytical pathways are rapid, reliable, and inexpensive, and can be applied to ecology and conservation studies of other taxa.
21 CFR 830.350 - Correction of information submitted to the Global Unique Device Identification Database.

Code of Federal Regulations, 2014 CFR

2014-04-01

... Unique Device Identification Database. 830.350 Section 830.350 Food and Drugs FOOD AND DRUG... Global Unique Device Identification Database § 830.350 Correction of information submitted to the Global Unique Device Identification Database. (a) If FDA becomes aware that any information submitted to the...
Implementation options for DNA-based identification into ecological status assessment under the European Water Framework Directive.

PubMed

Hering, Daniel; Borja, Angel; Jones, J Iwan; Pont, Didier; Boets, Pieter; Bouchez, Agnes; Bruce, Kat; Drakare, Stina; Hänfling, Bernd; Kahlert, Maria; Leese, Florian; Meissner, Kristian; Mergen, Patricia; Reyjol, Yorick; Segurado, Pedro; Vogler, Alfried; Kelly, Martyn

2018-07-01

Assessment of ecological status for the European Water Framework Directive (WFD) is based on "Biological Quality Elements" (BQEs), namely phytoplankton, benthic flora, benthic invertebrates and fish. Morphological identification of these organisms is a time-consuming and expensive procedure. Here, we assess the options for complementing and, perhaps, replacing morphological identification with procedures using eDNA, metabarcoding or similar approaches. We rate the applicability of DNA-based identification for the individual BQEs and water categories (rivers, lakes, transitional and coastal waters) against eleven criteria, summarised under the headlines representativeness (for example suitability of current sampling methods for DNA-based identification, errors from DNA-based species detection), sensitivity (for example capability to detect sensitive taxa, unassigned reads), precision of DNA-based identification (knowledge about uncertainty), comparability with conventional approaches (for example sensitivity of metrics to differences in DNA-based identification), cost effectiveness and environmental impact. Overall, suitability of DNA-based identification is particularly high for fish, as eDNA is a well-suited sampling approach which can replace expensive and potentially harmful methods such as gill-netting, trawling or electrofishing. Furthermore, there are attempts to replace absolute by relative abundance in metric calculations. For invertebrates and phytobenthos, the main challenges include the modification of indices and completing barcode libraries. For phytoplankton, the barcode libraries are even more problematic, due to the high taxonomic diversity in plankton samples. If current assessment concepts are kept, DNA-based identification is least appropriate for macrophytes (rivers, lakes) and angiosperms/macroalgae (transitional and coastal waters), which are surveyed rather than sampled. 
We discuss general implications of implementing DNA-based identification into standard ecological assessment, in particular considering any adaptations to the WFD that may be required to facilitate the transition to molecular data. Copyright © 2018 Elsevier Ltd. All rights reserved.
DNA barcode variability and host plant usage of fruit flies (Diptera: Tephritidae) in Thailand.

PubMed

Kunprom, Chonticha; Pramual, Pairot

2016-10-01

The objectives of this study were to examine the genetic variation in fruit flies (Diptera: Tephritidae) in Thailand and to test the efficiency of the mitochondrial cytochrome c oxidase subunit I (COI) barcoding region for species-level identification. Twelve fruit fly species were collected from 24 host plant species of 13 families. The number of host plant species for each fruit fly species ranged between 1 and 11, with Bactrocera correcta found in the most diverse host plants. A total of 123 COI sequences were obtained from these fruit fly species. Sequences from the NCBI database were also included, for a total of 17 species analyzed. DNA barcoding identification analysis based on the best close match method revealed a good performance, with 94.4% of specimens correctly identified. However, many specimens (3.6%) had ambiguous identification, mostly due to intra- and interspecific overlap between members of the B. dorsalis complex. A phylogenetic tree based on the mitochondrial barcode sequences indicated that all species, except for the members of the B. dorsalis complex, were monophyletic with strong support. Our work supports recent calls for synonymization of these species. Divergent lineages were observed within B. correcta and B. tuberculata, and this suggested that these species need further taxonomic reexamination.
Barcode Identifiers as a Practical Tool for Reliable Species Assignment of Medically Important Black Yeast Species

PubMed Central

Heinrichs, Guido; de Hoog, G. Sybren

2012-01-01

Herpotrichiellaceous black yeasts and relatives comprise severe pathogens flanked by nonpathogenic environmental siblings. Reliable identification by conventional methods is notoriously difficult. Molecular identification is hampered by the sequence variability in the internal transcribed spacer (ITS) domain caused by difficult-to-sequence homopolymeric regions and by poor taxonomic attribution of sequences deposited in GenBank. Here, we present a potential solution using short barcode identifiers (27 to 50 bp) based on ITS2 ribosomal DNA (rDNA), which allows unambiguous definition of species-specific fragments. Starting from proven sequences of ex-type and authentic strains, we were able to describe 103 identifiers. Multiple BLAST searches of these proposed barcode identifiers in GenBank revealed uniqueness for 100 taxonomic entities, whereas the three remaining identifiers each matched with two entities, but the species of these identifiers could easily be discriminated by differences in the remaining ITS regions. Using the proposed barcode identifiers, a 4.1-fold increase of 100% matches in GenBank was achieved in comparison to the classical approach using the complete ITS sequences. The proposed barcode identifiers will be made accessible for the diagnostic laboratory in a permanently updated online database, thereby providing a highly practical, reliable, and cost-effective tool for identification of clinically important black yeasts and relatives. PMID:22785187
Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry: a new possibility for the identification and typing of anaerobic bacteria.

PubMed

Nagy, Elizabeth

2014-01-01

Anaerobic bacteria predominate in the normal flora of humans and are important, often life-threatening pathogens in mixed infections originating from the indigenous microbiota. The isolation and identification of anaerobes by phenotypic and DNA-based molecular methods at a species level is time-consuming and laborious. Following the successful adaptation of the matrix-assisted laser desorption/ionization time-of-flight mass spectrometry for the routine laboratory identification of bacteria, the extensive development of a database has been initiated to use this method for the identification of anaerobic bacteria. Not only frequently isolated anaerobic species, but also newly recognized and taxonomically rearranged genera and species can be identified using direct smear samples or whole-cell protein extraction, and even phylogenetically closely related species can be identified correctly by means of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Typing of anaerobic bacteria on a subspecies level, determination of antibiotic resistance and direct identification of blood culture isolates will revolutionize anaerobe bacteriology in the near future.
DNA barcoding as a tool for coral reef conservation

NASA Astrophysics Data System (ADS)

Neigel, J.; Domingo, A.; Stake, J.

2007-09-01

DNA Barcoding (DBC) is a method for taxonomic identification of animals that is based entirely on the 5' portion of the mitochondrial gene cytochrome oxidase subunit I (COI-5). It can be especially useful for identification of larval forms or incomplete specimens lacking diagnostic morphological characters. DBC can also facilitate the discovery of species and help in defining "molecular taxonomic units" in problematic groups. However, DBC is not a panacea for coral reef taxonomy. In two of the most ecologically important groups on coral reefs, the Anthozoa and Porifera, COI-5 sequences have diverged too little to be diagnostic for all species. Other problems for DBC include paraphyly in mitochondrial gene trees and lack of differentiation between hybrids and their maternal ancestors. DBC also depends on the availability of databases of COI-5 sequences, which are still in early stages of development. A global effort to barcode all fish species has demonstrated the importance of large-scale coordination and is yielding promising results. Whether or not COI-5 by itself is sufficient for species assignments has become a contentious question; it is generally advantageous to use sequences from multiple loci.
A database of annotated tentative orthologs from crop abiotic stress transcripts.

PubMed

Balaji, Jayashree; Crouch, Jonathan H; Petite, Prasad V N S; Hoisington, David A

2006-10-07

A minimal requirement to initiate a comparative genomics study on plant responses to abiotic stresses is a dataset of orthologous sequences. The availability of a large amount of sequence information, including that derived from stress cDNA libraries, allows for the identification of stress-related genes and orthologs associated with the stress response. Orthologous sequences serve as tools to explore genes and their relationships across species. For this purpose, ESTs from stress cDNA libraries across 16 crop species, including 6 important cereal crops and 10 dicots, were systematically collated and subjected to bioinformatics analyses such as clustering, grouping of tentative orthologous sets, identification of protein motifs/patterns in the predicted protein sequence, and annotation with stress conditions, tissue/library source and putative function. All data are available to the scientific community at http://intranet.icrisat.org/gt1/tog/homepage.htm. We believe that the availability of annotated plant abiotic stress ortholog sets will be a valuable resource for researchers studying the biology of environmental stresses in plant systems, molecular evolution and genomics.
Molecular identification of hard ticks (Ixodes sp.) infesting rodents in Selangor, Malaysia

NASA Astrophysics Data System (ADS)

Ishak, Siti Nabilah; Shiang, Lim Fang; Taib, Farah Shafawati Mohd; Jing, Khoo Jing; Nor, Shukor Md; Yusof, Muhammad Afif; Sah, Shahrul Anuar Mohd; Sitam, Frankie Thomas; Japning, Jeffrine Rovie Ryan

2018-04-01

This study aims to identify hard ticks (Ixodes sp.) infesting rodents at three different sites in Selangor, Malaysia using a molecular approach. A total of 11 individual ticks infesting four different host species (Rattus tiomanicus, Rattus rattus, Maxomys surifer and Sundamys muelleri) were examined based on their morphological features, followed by molecular identification using the mitochondrial 16S rDNA gene. Confirmation of the species identity was accomplished using the BLAST program. Clustering analysis based on 16S rDNA sequences was carried out by constructing Neighbour-joining (NJ) and Maximum parsimony (MP) trees using MEGA 7 to clarify the genetic identity of Ixodes sp. Based on morphological features, the individual ticks could be identified only to genus level, as most of the samples were fully engorged, damaged and lacked morphological characters. However, molecular analysis of the samples revealed 99% similarity with Ixodes granulatus from the GenBank database. Thus, the results of this study showed that all these ticks (Ixodes granulatus) were genetically affiliated to a monophyletic group with highly homogeneous sequences.

MICA: desktop software for comprehensive searching of DNA databases

PubMed Central

Stokes, William A; Glick, Benjamin S

2006-01-01

Background: Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results: MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion: MICA is suitable as a search engine for desktop DNA analysis software. PMID:17018144
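The general idea of k-mer indexing for exact-match search can be sketched as follows. This is not MICA's algorithm: it uses an ordinary in-memory dictionary rather than compact arrays, handles only nondegenerate queries, and all names and the example sequence are hypothetical.

```python
from collections import defaultdict

def build_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def find_exact(seq, index, k, query):
    """Return start positions of exact matches of query in seq.

    The first k-mer of the query selects candidate positions from the
    index; each candidate is then verified against the full query.
    """
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    candidates = index.get(query[:k], [])
    return [pos for pos in candidates if seq.startswith(query, pos)]

# Hypothetical toy sequence; real databases are many orders larger.
example_seq = "ACGTACGTTACG"
example_index = build_index(example_seq, 3)
example_hits = find_exact(example_seq, example_index, 3, "ACGT")
```

Only the positions listed under the query's first k-mer are inspected, which is why a search needs to touch only a small part of the index.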
Molecular Characterization and Analysis of 16S Ribosomal DNA in Some Isolates of Demodex folliculorum

PubMed Central

DANESHPARVAR, Afrooz; MOWLAVI, Gholamreza; MIRJALALI, Hamed; HAJJARAN, Homa; MOBEDI, Iraj; NADDAF, Saeed Reza; SHIDFAR, Mohammadreza; SADAT MAKKI, Mahsa

2017-01-01

Background: Demodicosis is one of the most prevalent skin diseases resulting from infestation by Demodex mites. This parasite usually inhabits the follicular infundibulum or sebaceous duct and is transmitted through close contact with an infested host. Methods: This study was carried out from September 2014 to January 2016 at Tehran University of Medical Sciences, Tehran, Iran. DNA extraction and amplification of 16S ribosomal RNA was performed on four isolates, already obtained from four different patients and identified morphologically through clearing with 10% potassium hydroxide (KOH) and microscopical examination. Amplified fragments from the isolates were compared with the GenBank database, and phylogenetic analysis was carried out using MEGA6 software. Results: A 390 bp fragment of 16S rDNA was obtained in all isolates, and analysis of the generated sequences showed high similarity with those previously submitted to GenBank. Intra-species similarity and distance were 99.983% and 0.017, respectively, for the studied isolates. Multiple alignments of the isolates showed single nucleotide polymorphisms (SNPs) in the 16S rRNA fragment. Phylogenetic analysis revealed that all 4 isolates clustered with other D. folliculorum sequences recovered from the GenBank database. Our accession numbers KF875587 and KF875589 were more similar to each other than to the two other studied isolates. Conclusion: Mitochondrial 16S rDNA is one of the most suitable molecular barcodes for identification of D. folliculorum, and this fragment can be used for intra-species characterization of the mite that most commonly infests humans. PMID:28761482
MethBank 3.0: a database of DNA methylomes across a variety of species.

PubMed

Li, Rujiao; Liang, Fang; Li, Mengwei; Zou, Dong; Sun, Shixiang; Zhao, Yongbing; Zhao, Wenming; Bao, Yiming; Xiao, Jingfa; Zhang, Zhang

2018-01-04

MethBank (http://bigd.big.ac.cn/methbank) is a database that integrates high-quality DNA methylomes across a variety of species and provides an interactive browser for visualization of methylation data. Here, we present an updated implementation of MethBank (version 3.0) by incorporating more DNA methylomes from multiple species and equipping with more enhanced functionalities for data annotation and more friendly web interfaces for data presentation, search and visualization. MethBank 3.0 features large-scale integration of high-quality methylomes, involving 34 consensus reference methylomes derived from a large number of human samples, 336 single-base resolution methylomes from different developmental stages and/or tissues of five plants, and 18 single-base resolution methylomes from gametes and early embryos at multiple stages of two animals. Additionally, it is enhanced by improving the functionalities for data annotation, which accordingly enables systematic identification of methylation sites closely associated with age, sites with constant methylation levels across different ages, differentially methylated promoters, age-specific differentially methylated cytosines/regions, and methylated CpG islands. Moreover, MethBank provides tools to estimate human methylation age online and to identify differentially methylated promoters, respectively. Taken together, MethBank is upgraded with significant improvements and advances over the previous version, which is of great help for deciphering DNA methylation regulatory mechanisms for epigenetic studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
DNA barcodes of the native ray-finned fishes in Taiwan.

PubMed

Chang, Chia-Hao; Shao, Kwang-Tsao; Lin, Han-Yang; Chiu, Yung-Chieh; Lee, Mao-Ying; Liu, Shih-Hui; Lin, Pai-Lei

2017-07-01

Species identification based on the DNA sequence of a fragment of the cytochrome c oxidase subunit I gene in the mitochondrial genome, DNA barcoding, is widely applied to assist in sustainable exploitation of fish resources and the protection of fish biodiversity. The aim of this study was to establish a reliable barcoding reference database of the native ray-finned fishes in Taiwan. A total of 2993 individuals, belonging to 1245 species within 637 genera, 184 families and 29 orders of ray-finned fishes and representing approximately 40% of the recorded ray-finned fishes in Taiwan, were PCR amplified at the barcode region and bidirectionally sequenced. The mean length of the 2993 barcodes is 549 bp. Mean congeneric K2P distance (15.24%) is approximately 10-fold higher than the mean conspecific one (1.51%), but approximately 1.4-fold less than the mean genetic distance between families (20.80%). The Barcode Index Number (BIN) discordance report shows that 2993 specimens represent 1275 BINs and, among them, 86 BINs are singletons, 570 BINs are taxonomically concordant, and the other 619 BINs are taxonomically discordant. Barcode gap analysis also revealed that more than 90% of the collected fishes in this study can be discriminated by DNA barcoding. Overall, the barcoding reference database established by this study reveals the need for taxonomic revisions and voucher specimen rechecks, in addition to assisting in the management of Taiwan's fish resources and diversity. © 2016 John Wiley & Sons Ltd.
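The K2P (Kimura 2-parameter) distances quoted in the abstract above can be illustrated with a short sketch. It uses the standard formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name and the toy sequences are illustrative assumptions and are not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError for saturated pairs where the argument is <= 0.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Hypothetical 10-site alignment with a single A<->G transition.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Averaging such pairwise values within species, within genera and between families would give summary figures of the kind reported above; the abstract's ratio of 15.24% to 1.51% is about 10.1, in line with its "approximately 10-fold" statement.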
An expanded mammal mitogenome dataset from Southeast Asia

PubMed Central

Ramos-Madrigal, Jazmín; Peñaloza, Fernando; Liu, Shanlin; Mikkel-Holger, S. Sinding; Riddhi, P. Patel; Martins, Renata; Lenz, Dorina; Fickel, Jörns; Roos, Christian; Shamsir, Mohd Shahir; Azman, Mohammad Shahfiz; Burton, K. Lim; Stephen, J. Rossiter; Wilting, Andreas

2017-01-01

Southeast (SE) Asia is one of the most biodiverse regions in the world, and it holds approximately 20% of all mammal species. Despite this, the majority of SE Asia's genetic diversity is still poorly characterized. The growing interest in using environmental DNA to assess and monitor SE Asian species, in particular threatened mammals, has created the urgent need to expand the available reference database of mitochondrial barcode and complete mitogenome sequences. We have partially addressed this need by generating 72 new mitogenome sequences reconstructed from DNA isolated from a range of historical and modern tissue samples. Approximately 55 gigabases of raw sequence were generated. From these data, we assembled 72 complete mitogenome sequences, with an average depth of coverage of ×102.9 and ×55.2 for modern samples and historical samples, respectively. This dataset represents 52 species, of which 30 species had no previous mitogenome data available. The mitogenomes were geotagged to their sampling location, where known, to display a detailed geographical distribution of the species. Our new database of 52 taxa will strongly enhance the utility of environmental DNA approaches for monitoring mammals in SE Asia, as it greatly increases the likelihood that metabarcoding sequencing reads can be assigned to reference sequences. This magnifies the confidence in species detections and thus allows more robust surveys and monitoring programmes of SE Asia's threatened mammal biodiversity. The extensive collections of historical samples from SE Asia in western and SE Asian museums should serve as additional valuable material to further enrich this reference database. PMID:28873965
An expanded mammal mitogenome dataset from Southeast Asia.

PubMed

Mohd Salleh, Faezah; Ramos-Madrigal, Jazmín; Peñaloza, Fernando; Liu, Shanlin; Mikkel-Holger, S Sinding; Riddhi, P Patel; Martins, Renata; Lenz, Dorina; Fickel, Jörns; Roos, Christian; Shamsir, Mohd Shahir; Azman, Mohammad Shahfiz; Burton, K Lim; Stephen, J Rossiter; Wilting, Andreas; Gilbert, M Thomas P

2017-08-01

Southeast (SE) Asia is one of the most biodiverse regions in the world, and it holds approximately 20% of all mammal species. Despite this, the majority of SE Asia's genetic diversity is still poorly characterized. The growing interest in using environmental DNA to assess and monitor SE Asian species, in particular threatened mammals, has created the urgent need to expand the available reference database of mitochondrial barcode and complete mitogenome sequences. We have partially addressed this need by generating 72 new mitogenome sequences reconstructed from DNA isolated from a range of historical and modern tissue samples. Approximately 55 gigabases of raw sequence were generated. From these data, we assembled 72 complete mitogenome sequences, with an average depth of coverage of ×102.9 and ×55.2 for modern samples and historical samples, respectively. This dataset represents 52 species, of which 30 species had no previous mitogenome data available. The mitogenomes were geotagged to their sampling location, where known, to display a detailed geographical distribution of the species. Our new database of 52 taxa will strongly enhance the utility of environmental DNA approaches for monitoring mammals in SE Asia, as it greatly increases the likelihood that metabarcoding sequencing reads can be assigned to reference sequences. This magnifies the confidence in species detections and thus allows more robust surveys and monitoring programmes of SE Asia's threatened mammal biodiversity. The extensive collections of historical samples from SE Asia in western and SE Asian museums should serve as additional valuable material to further enrich this reference database. © The Author 2017. Published by Oxford University Press.
Identification of a Divergent Environmental DNA Sequence Clade Using the Phylogeny of Gregarine Parasites (Apicomplexa) from Crustacean Hosts

PubMed Central

Rueckert, Sonja; Simdyanov, Timur G.; Aleoshin, Vladimir V.; Leander, Brian S.

2011-01-01

Background Environmental SSU rDNA surveys have significantly improved our understanding of microeukaryotic diversity. Many of the sequences acquired using this approach are closely related to lineages previously characterized at both morphological and molecular levels, making interpretation of these data relatively straightforward. Some sequences, by contrast, appear to be phylogenetic orphans and are sometimes inferred to represent “novel lineages” of unknown cellular identity. Consequently, interpretation of environmental DNA surveys of cellular diversity relies on an adequately comprehensive database of DNA sequences derived from identified species. Several major taxa of microeukaryotes, however, are still very poorly represented in these databases, and this is especially true for diverse groups of single-celled parasites, such as gregarine apicomplexans. Methodology/Principal Findings This study attempts to address this paucity of DNA sequence data by characterizing four different gregarine species, isolated from the intestines of crustaceans, at both morphological and molecular levels: Thiriotia pugettiae sp. n. from the graceful kelp crab (Pugettia gracilis), Cephaloidophora cf. communis from two different species of barnacles (Balanus glandula and B. balanus), Heliospora cf. longissima from two different species of freshwater amphipods (Eulimnogammarus verrucosus and E. vittatus), and Heliospora caprellae comb. n. from a skeleton shrimp (Caprella alaskana). SSU rDNA sequences were acquired from isolates of these gregarine species and added to a global apicomplexan alignment containing all major groups of gregarines characterized so far. Molecular phylogenetic analyses of these data demonstrated that all of the gregarines collected from crustacean hosts formed a very strongly supported clade with 48 previously unidentified environmental DNA sequences.
Conclusions/Significance This expanded molecular phylogenetic context enabled us to establish a major clade of intestinal gregarine parasites and infer the cellular identities of several previously unidentified environmental SSU rDNA sequences, including several sequences that have formerly been discussed broadly in the literature as a suspected “novel” lineage of eukaryotes. PMID:21483868
RICD: a rice indica cDNA database resource for rice functional genomics.

PubMed

Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin

2008-11-26

The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
Differentiation of clinically relevant Mucorales Rhizopus microsporus and R. arrhizus by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS).

PubMed

Dolatabadi, Somayeh; Kolecka, Anna; Versteeg, Matthijs; de Hoog, Sybren G; Boekhout, Teun

2015-07-01

This study addresses the usefulness of matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) MS for reliable identification of the two most frequently occurring clinical species of Rhizopus, namely Rhizopus arrhizus with its two varieties, arrhizus and delemar, and Rhizopus microsporus. The test set comprised 38 isolates of clinical and environmental origin previously identified by internal transcribed spacer (ITS) sequencing of rDNA. Multi-locus sequence data targeting three gene markers (ITS, ACT, TEF) showed two monophyletic clades for Rhizopus arrhizus and Rhizopus microsporus (bootstrap values of 99%). Cluster analysis confirmed the presence of two distinct clades within Rhizopus arrhizus representing its varieties arrhizus and delemar. The MALDI Biotyper 3.0 Microflex LT platform (Bruker Daltonics) was used to confirm the distinction between Rhizopus arrhizus and Rhizopus microsporus and the presence of two varieties within the species Rhizopus arrhizus. An in-house database of 30 reference main spectra (MSPs) was initially tested for correctness using commercially available databases of Bruker Daltonics. By challenging the database with the same strains from which the in-house database was created, automatic identification runs confirmed that MALDI-TOF MS is able to recognize the strains at the variety level. Based on principal component analysis, two MSP dendrograms were created and showed concordance with the multi-locus tree; thus, MALDI-TOF MS is a useful tool for diagnostics of mucoralean species.
The Italian Twin Project: from the personal identification number to a national twin registry.

PubMed

Stazi, Maria Antonietta; Cotichini, Rodolfo; Patriarca, Valeria; Brescianini, Sonia; Fagnani, Corrado; D'Ippolito, Cristina; Cannoni, Stefania; Ristori, Giovanni; Salvetti, Marco

2002-10-01

The unique opportunity given by the "fiscal code", an alphanumeric identification with demographic information on any single person residing in Italy, introduced in 1976 by the Ministry of Finance, allowed a database of all potential Italian twins to be created. This database contains up to now name, surname, date and place of birth and home address of about 1,300,000 "possible twins". Even though we estimated an excess of 40% of pseudo-twins, this still is the world's largest twin population ever collected. The database of possible twins is currently used in population-based studies on multiple sclerosis, Alzheimer's disease, celiac disease, and type 1 diabetes. A system is currently being developed for linking the database with data from mortality and cancer registries. In 2001, the Italian Government, through the Ministry of Health, financed a broad national research program on twin studies, including the establishment of a national twin registry. Among all the possible twins, a sample of 500,000 individuals are going to be contacted and we expect to enrol around 120,000 real twin pairs in a formal Twin Registry. According to available financial resources, a sub sample of the enrolled population will be asked to donate DNA. A biological bank from twins will be then implemented, guaranteeing information on future etiological questions regarding genetic and modifiable factors for physical impairment and disability, cancers, cardiovascular diseases and other age related chronic illnesses.
MitBASE : a comprehensive and integrated mitochondrial DNA database. The present status

PubMed Central

Attimonelli, M.; Altamura, N.; Benne, R.; Brennicke, A.; Cooper, J. M.; D’Elia, D.; Montalvo, A. de; Pinto, B. de; De Robertis, M.; Golik, P.; Knoop, V.; Lanave, C.; Lazowska, J.; Licciulli, F.; Malladi, B. S.; Memeo, F.; Monnerot, M.; Pasimeni, R.; Pilbout, S.; Schapira, A. H. V.; Sloof, P.; Saccone, C.

2000-01-01

MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces cerevisiae. MitBASE reports all available information from different organisms and from intraspecies variants and mutants. Data have been drawn from the primary databases and from the literature; value-adding information has been structured, e.g., editing information on protist mtDNA genomes, pathological information for human mtDNA variants, etc. The different databases, some of which are structured using commercial packages (Microsoft Access, File Maker Pro) while others use a flat-file format, have been integrated under ORACLE. Ad hoc retrieval systems have been devised for some of the above-listed databases, taking into account their peculiarities. The database is resident at the EBI and is available at the following site: http://www3.ebi.ac.uk/Research/Mitbase/mitbase.pl . The impact of this project is intended for both basic and applied research. The study of mitochondrial genetic diseases and mitochondrial DNA intraspecies diversity are key topics in several biotechnological fields. The database has been funded within the EU Biotechnology programme. PMID:10592207
Dictionary-driven prokaryotic gene finding.

PubMed

Shibuya, Tetsuo; Rigoutsos, Isidore

2002-06-15

Gene identification, also known as gene finding or gene recognition, is among the important problems of molecular biology that have been receiving increasing attention with the advent of large scale sequencing projects. Previous strategies for solving this problem can be categorized into essentially two schools of thought: one school employs sequence composition statistics, whereas the other relies on database similarity searches. In this paper, we propose a new gene identification scheme that combines the best characteristics from each of these two schools. In particular, our method determines gene candidates among the ORFs that can be identified in a given DNA strand through the use of the Bio-Dictionary, a database of patterns that covers essentially all of the currently available sample of the natural protein sequence space. Our approach relies entirely on the use of redundant patterns as the agents on which the presence or absence of genes is predicated and does not employ any additional evidence, e.g. ribosome-binding site signals. The Bio-Dictionary Gene Finder (BDGF), the algorithm's implementation, is a single computational engine able to handle the gene identification task across distinct archaeal and bacterial genomes. The engine exhibits performance that is characterized by simultaneous very high values of sensitivity and specificity, and a high percentage of correctly predicted start sites. Using a collection of patterns derived from an old (June 2000) release of the Swiss-Prot/TrEMBL database that contained 451 602 proteins and fragments, we demonstrate our method's generality and capabilities through an extensive analysis of 17 complete archaeal and bacterial genomes. Examples of previously unreported genes are also shown and discussed in detail.
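The abstract above starts from "the ORFs that can be identified in a given DNA strand". As a minimal sketch of that first step only, the following enumerates ATG-to-stop open reading frames in the three forward frames of one strand. It is not the BDGF algorithm: the Bio-Dictionary pattern scoring is not reproduced, and the function name, the `min_codons` threshold and the toy sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the given strand.

    start is the 0-based index of the ATG, end is one past the stop codon,
    and min_codons counts the codons before the stop (including the ATG).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next candidate in this frame
    return orfs

# Hypothetical toy sequence with one ORF in frame 2.
print(find_orfs("CCATGAAACCCGGGTAGCC"))
```

In a dictionary-driven approach, each candidate returned here would then be translated and checked against the pattern collection; the reverse strand can be handled by running the same function on the reverse complement.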
COGcollator: a web server for analysis of distant relationships between homologous protein families.

PubMed

Dibrova, Daria V; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Mulkidjanian, Armen Y

2017-11-29

The Clusters of Orthologous Groups (COGs) of proteins systematize evolutionary related proteins into specific groups with similar functions. However, the available databases do not provide means to assess the extent of similarity between the COGs. We intended to provide a method for identification and visualization of evolutionary relationships between the COGs, as well as a respective web server. Here we introduce the COGcollator, a web tool for identification of evolutionarily related COGs and their further analysis. We demonstrate the utility of this tool by identifying the COGs that contain distant homologs of (i) the catalytic subunit of bacterial rotary membrane ATP synthases and (ii) the DNA/RNA helicases of the superfamily 1. This article was reviewed by Drs. Igor N. Berezovsky, Igor Zhulin and Yuri Wolf.
Detection and Identification of Gastrointestinal Lactobacillus Species by Using Denaturing Gradient Gel Electrophoresis and Species-Specific PCR Primers

PubMed Central

Walter, J.; Tannock, G. W.; Tilsala-Timisjarvi, A.; Rodtong, S.; Loach, D. M.; Munro, K.; Alatossava, T.

2000-01-01

Denaturing gradient gel electrophoresis (DGGE) of DNA fragments obtained by PCR amplification of the V2-V3 region of the 16S rRNA gene was used to detect the presence of Lactobacillus species in the stomach contents of mice. Lactobacillus isolates cultured from human and porcine gastrointestinal samples were identified to the species level by using a combination of DGGE and species-specific PCR primers that targeted 16S-23S rRNA intergenic spacer region or 16S rRNA gene sequences. The identifications obtained by this approach were confirmed by sequencing the V2-V3 region of the 16S rRNA gene and by a BLAST search of the GenBank database. PMID:10618239
Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites

PubMed Central

Debnath, Manish; Kharumnuid, Graciously; Thongnibah, Welfrank; Tandon, Veena

2016-01-01

Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php PMID:27285615
Northeast India Helminth Parasite Information Database (NEIHPID): Knowledge Base for Helminth Parasites.

PubMed

Biswal, Devendra Kumar; Debnath, Manish; Kharumnuid, Graciously; Thongnibah, Welfrank; Tandon, Veena

2016-01-01

Most metazoan parasites that invade vertebrate hosts belong to three phyla: Platyhelminthes, Nematoda and Acanthocephala. Many of the parasitic members of these phyla are collectively known as helminths and are causative agents of many debilitating, deforming and lethal diseases of humans and animals. The North-East India Helminth Parasite Information Database (NEIHPID) project aimed to document and characterise the spectrum of helminth parasites in the north-eastern region of India, providing host, geographical distribution, diagnostic characters and image data. The morphology-based taxonomic data are supplemented with information on DNA sequences of nuclear, ribosomal and mitochondrial gene marker regions that aid in parasite identification. In addition, the database contains raw next generation sequencing (NGS) data for 3 foodborne trematode parasites, with more to follow. The database will also provide study material for students interested in parasite biology. Users can search the database at various taxonomic levels (phylum, class, order, superfamily, family, genus, and species), or by host, habitat and geographical location. Specimen collection locations are noted as co-ordinates in a MySQL database and can be viewed on Google maps, using Google Maps JavaScript API v3. The NEIHPID database has been made freely available at http://nepiac.nehu.ac.in/index.php.
Short Tandem Repeat DNA Internet Database

National Institute of Standards and Technology Data Gateway

SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access) Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.
High-throughput STR analysis for DNA database using direct PCR.

PubMed

Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan

2013-07-01

Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of direct PCR procedures was compared with that of conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of DNA extraction required due to direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement conventional PCR system in a time- and cost-saving manner. © 2013 American Academy of Forensic Sciences Published 2013. This article is a U.S. Government work and is in the public domain in the U.S.A.
The Protein-DNA Interface database

PubMed Central

2010-01-01

The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes. PMID:20482798

The Protein-DNA Interface database.

PubMed

Norambuena, Tomás; Melo, Francisco

2010-05-18

The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes.
Evaluation of DNA mixtures from database search.

PubMed

Chung, Yuk-Ka; Hu, Yue-Qing; Fung, Wing K

2010-03-01

With the aim of bridging the gap between DNA mixture analysis and DNA database search, a novel approach is proposed to evaluate the forensic evidence of DNA mixtures when the suspect is identified by the search of a database of DNA profiles. General formulae are developed for the calculation of the likelihood ratio for a two-person mixture under general situations including multiple matches and imperfect evidence. The influence of the prior probabilities on the weight of evidence under the scenario of multiple matches is demonstrated by a numerical example based on Hong Kong data. Our approach is shown to be capable of presenting the forensic evidence of DNA mixtures in a comprehensive way when the suspect is identified through database search.
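For orientation, the quantities named in the abstract above can be written in their generic textbook form. Here E denotes the observed mixture evidence, H_p the proposition that the suspect found by the database search is a contributor, and H_d the proposition that the suspect is not. These symbols are assumptions introduced for illustration, and the expressions are the general definitions, not the specific formulae developed in the paper.

```latex
\[
\mathrm{LR} = \frac{P(E \mid H_p)}{P(E \mid H_d)},
\qquad
\frac{P(H_p \mid E)}{P(H_d \mid E)}
  = \mathrm{LR} \times \frac{P(H_p)}{P(H_d)}
\]
```

The second identity shows why prior probabilities matter: the same likelihood ratio yields different posterior odds when the prior odds assigned to the matching individuals differ, which is the effect the abstract reports for the multiple-match scenario.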
78 FR 58545 - Global Unique Device Identification Database; Draft Guidance for Industry; Availability

Federal Register 2010, 2011, 2012, 2013, 2014

2013-09-24

...] Global Unique Device Identification Database; Draft Guidance for Industry; Availability AGENCY: Food and... the availability of the draft guidance entitled ``Global Unique Device Identification Database (GUDID... manufacturer) will interface with the GUDID, as well as information on the database elements that must be...
PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

PubMed

Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

2016-01-01

Simple Sequence Repeats or microsatellites are resourceful molecular genetic markers. There are only few reports of SSR identification and development in pineapple. Complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple which will help in deciphering genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from genome sequence, 45 from chloroplast sequence, 249 in mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purpose at http://app.bioelm.com/ with a mapping tool which can develop circular maps of selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
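SSR identification of the kind described above is commonly done by scanning for perfect tandem repeats of short motifs. The sketch below shows one way to do this with regular expressions. The minimum-repeat thresholds, function names and toy sequence are assumptions for illustration; the abstract does not state which tool or thresholds were used for PineElm_SSRdb.

```python
import re

# Assumed thresholds (motif length -> minimum number of repeats); illustrative only.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif
                   for k in range(1, n))

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # avoid reporting AA-type motifs twice
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Hypothetical toy sequence containing (AT)7 and (AAG)5.
toy = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T"
print(find_ssrs(toy))
```

The per-source counts given in the abstract are consistent with its total: 356385 + 45 + 249 + 2832 = 359511.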
A tuberculosis biomarker database: the key to novel TB diagnostics.

PubMed

Yerlikaya, Seda; Broger, Tobias; MacLean, Emily; Pai, Madhukar; Denkinger, Claudia M

2017-03-01

New diagnostic innovations for tuberculosis (TB), including point-of-care solutions, are critical to reach the goals of the End TB Strategy. However, despite decades of research, numerous reports on new biomarker candidates, and significant investment, no well-performing, simple and rapid TB diagnostic test is yet available on the market, and the search for accurate, non-DNA biomarkers remains a priority. To help overcome this 'biomarker pipeline problem', FIND and partners are working on the development of a well-curated and user-friendly TB biomarker database. The web-based database will enable the dynamic tracking of evidence surrounding biomarker candidates in relation to target product profiles (TPPs) for needed TB diagnostics. It will be able to accommodate raw datasets and facilitate the verification of promising biomarker candidates and the identification of novel biomarker combinations. As such, the database will simplify data and knowledge sharing, empower collaboration, help in the coordination of efforts and allocation of resources, streamline the verification and validation of biomarker candidates, and ultimately lead to an accelerated translation into clinically useful tools. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
[Cloning and sequencing of KIR2DL1 framework gene cDNA and identification of a novel allele].

PubMed

Sun, Ge; Wang, Chang; Zhen, Jianxin; Zhang, Guobin; Xu, Yunping; Deng, Zhihui

2016-10-01

To develop an assay for cDNA cloning and haplotype sequencing of the KIR2DL1 framework gene and to determine the genotype of an ethnic Han individual from southern China. Total RNA was isolated from a peripheral blood sample, and complementary DNA (cDNA) was synthesized by RT-PCR. The entire coding sequence of the KIR2DL1 framework gene was amplified with a pair of KIR2DL1-specific PCR primers. The PCR products, with a length of approximately 1.2 kb, were then subjected to cloning and haplotype sequencing. A specific target fragment of the KIR2DL1 framework gene was obtained. Following allele separation, a wild-type KIR2DL1*00302 allele and a novel variant allele, KIR2DL1*031, were identified. Sequence alignment with KIR2DL1 alleles from the IPD-KIR Database showed that the novel allele KIR2DL1*031 differs from the closest allele, KIR2DL1*00302, by a non-synonymous mutation at CDS nt 188A>G (codon 42 GAG>GGG) in exon 4, which causes the amino acid change Glu42Gly. The sequence of the novel allele KIR2DL1*031 was submitted to GenBank under the accession number KP025960 and to the IPD-KIR Database under the submission number IWS40001982. The name KIR2DL1*031 has been officially assigned by the World Health Organization (WHO) Nomenclature Committee. An assay for cDNA cloning and haplotype sequencing of KIR2DL1 has been established, which has broad applications in KIR studies at the allelic level.
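The coordinates in the abstract above (CDS nt 188A>G, codon 42 GAG>GGG, Glu42Gly) can be checked with a little arithmetic. The sketch below maps a CDS nucleotide position to a codon number and applies the substitution. The 21-codon leader offset is an assumption introduced here to reconcile nt 188 with codon 42; the abstract does not state it. The function names are likewise hypothetical.

```python
# Only the two codons named in the abstract are needed for this check.
CODON_TO_AA = {"GAG": "Glu", "GGG": "Gly"}

def locate_in_cds(nt_position, leader_codons=0):
    """Map a 1-based CDS nucleotide position to (codon number, position in codon).

    leader_codons is subtracted so that numbering can start at the mature
    protein; the value used below (21) is an assumption, not from the abstract.
    """
    codon_index = (nt_position - 1) // 3 + 1
    pos_in_codon = (nt_position - 1) % 3 + 1
    return codon_index - leader_codons, pos_in_codon

def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]

codon_number, pos = locate_in_cds(188, leader_codons=21)
variant = substitute("GAG", pos, "G")
print(codon_number, pos, variant, CODON_TO_AA["GAG"], CODON_TO_AA[variant])
```

Under that assumption, nucleotide 188 falls at the second position of codon 42, and changing A to G there turns GAG (Glu) into GGG (Gly), matching the reported Glu42Gly change.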
Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures

PubMed Central

Pride, David T; Schoenfeld, Thomas

2008-01-01

Background: Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant GenBank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. Results: From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw, a finding consistent with GenBank BLAST results. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs is predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. Conclusion: That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis. PMID:18798991
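The abstract does not describe how GSPC is computed, so the following is only a generic sketch of the underlying idea: a fragment is summarized by its oligonucleotide frequency vector and assigned to the reference whose vector is closest. The use of tetranucleotides (k = 4), Euclidean distance, the toy sequences, and all function names are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter
from itertools import product


def signature(seq, k=4):
    """Return the normalized k-mer frequency vector of a DNA sequence."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # ignores k-mers with N etc.
    if total == 0:
        raise ValueError("sequence too short or contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(fragment, references, k=4):
    """Return the name of the reference whose signature is nearest."""
    frag_sig = signature(fragment, k)
    return min(
        references,
        key=lambda name: euclidean(frag_sig, signature(references[name], k)),
    )


# Toy, made-up reference "genomes"; a real database would hold signatures
# for sequenced Bacteria, Archaea, and viruses.
references = {
    "AT_rich_reference": "ATATTAATAT" * 20,
    "GC_rich_reference": "GCGCCGGCGC" * 20,
}
best = classify("ATTATAATATTA" * 5, references)
print(best)
```

Because the comparison relies on composition rather than alignment, a fragment can be assigned even when no homolog exists in the database, which is the property the abstract highlights as complementary to BLAST.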
Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures.

PubMed

Pride, David T; Schoenfeld, Thomas

2008-09-17

Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant GenBank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw, a finding consistent with GenBank BLAST results. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses.
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis.
A “Rosetta Stone” for metazoan zooplankton: DNA barcode analysis of species diversity of the Sargasso Sea (Northwest Atlantic Ocean)

NASA Astrophysics Data System (ADS)

Bucklin, Ann; Ortman, Brian D.; Jennings, Robert M.; Nigro, Lisa M.; Sweetman, Christopher J.; Copley, Nancy J.; Sutton, Tracey; Wiebe, Peter H.

2010-12-01

Species diversity of the metazoan holozooplankton assemblage of the Sargasso Sea, Northwest Atlantic Ocean, was examined through coordinated morphological taxonomic identification of species and DNA sequencing of a ~650 base-pair region of mitochondrial cytochrome oxidase I (mtCOI) as a DNA barcode (i.e., short sequence for species recognition and discrimination). Zooplankton collections were made from the surface to 5,000 meters during April 2006 on the R/V R.H. Brown. Samples were examined by a ship-board team of morphological taxonomists; DNA barcoding was carried out in both ship-board and land-based DNA sequencing laboratories. DNA barcodes were determined for a total of 297 individuals of 175 holozooplankton species in four phyla: Cnidaria (Hydromedusae, 4 species; Siphonophora, 47); Arthropoda (Amphipoda, 10; Copepoda, 34; Decapoda, 9; Euphausiacea, 10; Mysidacea, 1; Ostracoda, 27); Mollusca (Cephalopoda, 8; Heteropoda, 6; Pteropoda, 15); and Chaetognatha (4). Thirty species of fish (Teleostei) were also barcoded. For all seven zooplankton groups for which sufficient data were available, Kimura-2-Parameter genetic distances were significantly lower between individuals of the same species (mean=0.0114; S.D. 0.0117) than between individuals of different species within the same group (mean=0.3166; S.D. 0.0378). This difference, known as the barcode gap, ensures that mtCOI sequences are reliable characters for species identification for the oceanic holozooplankton assemblage. In addition, DNA barcodes allow recognition of new or undescribed species, reveal cryptic species within known taxa, and inform phylogeographic and population genetic studies of geographic variation. The growing database of "gold standard" DNA barcodes serves as a Rosetta Stone for marine zooplankton, providing the key for decoding species diversity by linking species names, morphology, and DNA sequence variation.
In light of the pivotal position of zooplankton in ocean food webs, their usefulness as rapid responders to environmental change, and the increasing scarcity of taxonomists, the use of DNA barcodes is an important and useful approach for rapid analysis of species diversity and distribution in the pelagic community.
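The Kimura-2-Parameter distances mentioned above follow a standard closed form, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The sketch below implements that textbook formula for a pair of pre-aligned sequences; the function name, the handling of gaps and ambiguous bases (skipped), and the toy sequences are choices made here and are not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    Raises ValueError (from math.log or math.sqrt) if the sequences are
    too divergent for the correction to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A->G) across ten aligned sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(d, 4))
```

Computing this value for every within-species and between-species pair gives the two distributions whose separation the abstract calls the barcode gap.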
Introduction of a novel 18S rDNA gene arrangement along with distinct ITS region in the saline water microalga Dunaliella

PubMed Central

2010-01-01

Comparison of 18S rDNA gene sequences is a very promising method for identification and classification of living organisms. Molecular identification and discrimination of different Dunaliella species were carried out based on the size of the 18S rDNA gene and the number and position of introns in the gene. Three types of 18S rDNA structure have already been reported: a gene of ~1770 bp lacking any intron, a gene of ~2170 bp containing one intron near the 5' terminus, and a gene of ~2570 bp harbouring two introns near the 5' and 3' termini. Here we report a new 18S rDNA gene arrangement, in terms of intron localization and nucleotide sequence, in a Dunaliella isolated from Iranian salt lakes (ABRIINW-M1/2). PCR amplification with genus-specific primers produced a ~2170 bp DNA band, similar to that of the D. salina 18S rDNA gene containing only one intron near the 5' terminus. However, the sequence of the gene revealed the lack of any intron near the 5' terminus in our isolate. Furthermore, another alteration was observed due to the presence of a 440 bp DNA fragment near the 3' terminus. Accordingly, the 18S rDNA gene of the isolate is clearly different from those of D. salina and any other Dunaliella species reported so far. Moreover, analysis of the ITS region sequence showed that this region differs from those of previously reported species. The 18S rDNA and ITS sequences of our isolate were submitted to the NCBI database under accession numbers EU678868 and EU927373, respectively. The optimum growth rate of this isolate occurred at a salinity of 1 M NaCl. The maximum carotenoid content under the stress conditions of intense light (400 μmol photons m⁻² s⁻¹), high salinity (4 M NaCl) and nitrate and phosphate deficiency reached 240 ng/cell after 15 days. PMID:20377865
"Would you accept having your DNA profile inserted in the National Forensic DNA database? Why?" Results of a questionnaire applied in Portugal.

PubMed

Machado, Helena; Silva, Susana

2014-01-01

The creation and expansion of forensic DNA databases might involve potential threats to the protection of a range of human rights. At the same time, such databases have social benefits. Based on data collected through an online questionnaire applied to 628 individuals in Portugal, this paper aims to analyze the citizens' willingness to voluntarily donate a sample for profiling and inclusion in the National Forensic DNA Database and the views underpinning such a decision. Nearly one-quarter of the respondents would indicate 'no', and this negative response increased significantly with age and education. The overriding willingness to accept the inclusion of the individual genetic profile indicates an acknowledgement of the investigative potential of forensic DNA technologies and a relegation of civil liberties and human rights to the background, owing to the perceived benefits of protecting both society and the individual from crime. This rationale is mostly expressed by the idea that all citizens should contribute to the expansion of the National Forensic DNA Database for reasons that range from the more abstract assumption that donating a sample for profiling would be helpful in fighting crime to the more concrete suggestion that everyone (criminals and non-criminals) should be in the database. The concerns with the risks of accepting the donation of a sample for genetic profiling and inclusion in the National Forensic DNA Database are mostly related to lack of control and insufficient or unclear regulations concerning safeguarding individuals' data and supervising the access and uses of genetic data. By providing an empirically-grounded understanding of the attitudes regarding willingness to voluntarily donate a sample for profiling and inclusion in a National Forensic DNA Database, this study also considers the citizens' perceived benefits and risks of operating forensic DNA databases.
These collective views might be useful for the formation of international common ethical standards for the development and governance of DNA databases in a framework in which the citizens' perspectives are taken into consideration. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
JICST Factual Database JICST DNA Database

NASA Astrophysics Data System (ADS)

Shirokizawa, Yoshiko; Abe, Atsushi

The Japan Information Center of Science and Technology (JICST) started an online DNA database service in October 1988. This database is composed of the EMBL Nucleotide Sequence Library and the Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval sessions are presented.
Dfam: a database of repetitive DNA based on profile hidden Markov models.

PubMed

Wheeler, Travis J; Clements, Jody; Eddy, Sean R; Hubley, Robert; Jones, Thomas A; Jurka, Jerzy; Smit, Arian F A; Finn, Robert D

2013-01-01

We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.
Finding needles in haystacks: linking scientific names, reference specimens and molecular data for Fungi.

PubMed

Schoch, Conrad L; Robbertse, Barbara; Robert, Vincent; Vu, Duong; Cardinali, Gianluigi; Irinyi, Laszlo; Meyer, Wieland; Nilsson, R Henrik; Hughes, Karen; Miller, Andrew N; Kirk, Paul M; Abarenkov, Kessy; Aime, M Catherine; Ariyawansa, Hiran A; Bidartondo, Martin; Boekhout, Teun; Buyck, Bart; Cai, Qing; Chen, Jie; Crespo, Ana; Crous, Pedro W; Damm, Ulrike; De Beer, Z Wilhelm; Dentinger, Bryn T M; Divakar, Pradeep K; Dueñas, Margarita; Feau, Nicolas; Fliegerova, Katerina; García, Miguel A; Ge, Zai-Wei; Griffith, Gareth W; Groenewald, Johannes Z; Groenewald, Marizeth; Grube, Martin; Gryzenhout, Marieka; Gueidan, Cécile; Guo, Liangdong; Hambleton, Sarah; Hamelin, Richard; Hansen, Karen; Hofstetter, Valérie; Hong, Seung-Beom; Houbraken, Jos; Hyde, Kevin D; Inderbitzin, Patrik; Johnston, Peter R; Karunarathna, Samantha C; Kõljalg, Urmas; Kovács, Gábor M; Kraichak, Ekaphan; Krizsan, Krisztina; Kurtzman, Cletus P; Larsson, Karl-Henrik; Leavitt, Steven; Letcher, Peter M; Liimatainen, Kare; Liu, Jian-Kui; Lodge, D Jean; Luangsa-ard, Janet Jennifer; Lumbsch, H Thorsten; Maharachchikumbura, Sajeewa S N; Manamgoda, Dimuthu; Martín, María P; Minnis, Andrew M; Moncalvo, Jean-Marc; Mulè, Giuseppina; Nakasone, Karen K; Niskanen, Tuula; Olariaga, Ibai; Papp, Tamás; Petkovits, Tamás; Pino-Bodas, Raquel; Powell, Martha J; Raja, Huzefa A; Redecker, Dirk; Sarmiento-Ramirez, J M; Seifert, Keith A; Shrestha, Bhushan; Stenroos, Soili; Stielow, Benjamin; Suh, Sung-Oui; Tanaka, Kazuaki; Tedersoo, Leho; Telleria, M Teresa; Udayanga, Dhanushka; Untereiner, Wendy A; Diéguez Uribeondo, Javier; Subbarao, Krishna V; Vágvölgyi, Csaba; Visagie, Cobus; Voigt, Kerstin; Walker, Donald M; Weir, Bevan S; Weiß, Michael; Wijayawardene, Nalin N; Wingfield, Michael J; Xu, J P; Yang, Zhu L; Zhang, Ning; Zhuang, Wen-Ying; Federhen, Scott

2014-01-01

DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput sequencing require fast and effective methods for en masse species assignments. In this article, we focus on selecting and re-annotating a set of marker reference sequences that represent each currently accepted order of Fungi. The particular focus is on sequences from the internal transcribed spacer region in the nuclear ribosomal cistron, derived from type specimens and/or ex-type cultures. Re-annotated and verified sequences were deposited in a curated public database at the National Center for Biotechnology Information (NCBI), namely the RefSeq Targeted Loci (RTL) database, and will be visible during routine sequence similarity searches with NR_prefixed accession numbers. A set of standards and protocols is proposed to improve the data quality of new sequences, and we suggest how type and other reference sequences can be used to improve identification of Fungi. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353. Published by Oxford University Press 2013. This work is written by US Government employees and is in the public domain in the US.
Forensically Relevant Blow Flies in Lebanon Survey and Identification Using Molecular Markers (Diptera: Calliphoridae).

PubMed

Shayya, Salman; Debruyne, Régis; Nel, André; Azar, Dany

2018-05-12

Calliphoridae are among the first insects associated with decomposing animal remains. We have collected 1,841 specimens of three calliphorid genera: Calliphora, Lucilia, and Chrysomya, from different Lebanese localities as a first step in implementing a database of insects of forensic relevance for the country. Blow flies are crucial for the estimation of the postmortem interval. DNA-based identification is a rapid and accurate method, often used for morphologically similar species, especially for immatures or incomplete specimens. In this study, we test the suitability of three genetic markers to identify adults and immature stages of calliphorids, viz., mitochondrial cytochrome c oxidase subunit I (COI) barcode, a region including partial sequences of mitochondrial Cyt-b-tRNAser-ND1, and second internal transcribed spacer (ITS2) region of nuclear ribosomal DNA. Forty Lebanese specimens of various developmental stages (egg, larva, wandering third instar, pupa, newly emerged adult, and mature adult) were identified among the three calliphorid genera: Calliphora, Lucilia, and Chrysomya, and compared with published sequences to confirm their species assignment. Phylogenetic analyses showed the robustness of ITS2 and COI to identify calliphorids at species level. Nevertheless, ITS2 failed to discriminate Lucilia caesar (Linnaeus) (Diptera, Calliphoridae) from Lucilia illustris (Meigen) (Diptera, Calliphoridae), and COI had a similar issue with Lucilia sericata (Meigen) (Diptera, Calliphoridae) and Lucilia cuprina (Wiedemann) (Diptera, Calliphoridae). Thus, these two markers are complementary. This work contributes new nucleotide sequences for Lebanon. It is a first step in implementing a molecular database of forensically relevant insects for the country.
Human Provenancing: It's Elemental…

NASA Astrophysics Data System (ADS)

Meier-Augenstein, Wolfram; Kemp

2009-04-01

Forensic science already uses a variety of methods, often in combination, to determine a deceased person's identity if neither personal effects nor next of kin (or close friends) can positively identify the victim. While disciplines such as forensic anthropology are able to work from a blank canvas, as it were, and can provide information on age, gender and ethnic grouping, techniques such as DNA profiling rely on finding a match either in a database or in a comparative sample presumed to be an ante-mortem sample of the victim or from a putative relation. Chances for either to succeed would be greatly enhanced if information gained from a forensic anthropological examination and, circumstances permitting, a facial reconstruction could be linked to another technique that can work from a blank canvas or at least does not require comparison to a subject-specific database. With the help of isotope ratio mass spectrometry even the very atoms from which a body is made can be used to say something about a person that will help to focus human identification using traditional techniques such as DNA, fingerprints and odontology. Stable isotope fingerprinting works on the basis that almost all chemical elements, and in particular the so-called light elements such as carbon (C) that comprise most of the human body, occur naturally in different forms, namely isotopes. 2H isotope abundance values recorded by the human body through food and drink ultimately reflect the averaged isotopic composition of precipitation or ground water. Stable isotope analysis of 2H isotopic composition in different human tissues such as hair, nails, bone and teeth enables us to construct a time-resolved isotopic profile or 'fingerprint' that may not necessarily permit direct identification of a murder victim or mass disaster victim but, in conjunction with forensic anthropological information, will provide sufficient intelligence to construct a profile for intelligence-led identification stating where a victim was from (point of origin), how old they were, what their 'lifestyle' was and even if and where they had recently travelled. Data from several criminal investigations are presented to illustrate the potential and limitations of stable isotope analysis of human tissue in aid of victim identification.
The Species Dilemma of Northeast Indian Mahseer (Actinopterygii: Cyprinidae): DNA Barcoding in Clarifying the Riddle

PubMed Central

Laskar, Boni A.; Bhattacharjee, Maloyjo J.; Dhar, Bishal; Mahadani, Pradosh; Kundu, Shantanu; Ghosh, Sankar K.

2013-01-01

Background: The taxonomic validity of the Northeast Indian endemic mahseer species Tor progeneius and Neolissochilus hexastichus has been argued repeatedly. This is mainly due to disagreements in recognizing the species based on morphological characters. Consequently, both species have been concealed for many decades. DNA barcoding has become a promising and independent technique for accurate species-level identification. Therefore, use of this technique in association with traditional morphotaxonomic description can resolve the species dilemma of this important group of sport fishes. Methodology/Principal Findings: Altogether, 28 mahseer specimens including paratypes were studied from different locations in Northeast India, and 24 morphometric characters were measured in every specimen. Principal Component Analysis of the morphometric data revealed five distinct groups of samples that were taxonomically categorized into 4 species, viz., Tor putitora, T. progeneius, Neolissochilus hexagonolepis and N. hexastichus. Analysis of a dataset of 76 DNA barcode sequences of different mahseer species showed that the queries of T. putitora and N. hexagonolepis clustered cohesively with the respective conspecific database sequences, with a maximum K2P divergence of 0.8%. The closest congeneric divergence was 3 times higher than the mean conspecific divergence and was taken as the barcode gap. The maximum divergence among the samples of T. progeneius and T. putitora was 0.8%, well below the barcode gap, indicating that they are synonymous. The query sequences of N. hexastichus invariably formed a discrete, congeneric clade with the database sequences and maintained the interspecific divergence that supported its distinct species status. Notably, N. hexastichus was encountered at a single site and seemed to be under threat. Conclusion: This study substantiated the identification of N. hexastichus as a true species, and tentatively regarded T. progeneius as a synonym of T. putitora. It would guide conservationists to initiate priority conservation of N. hexastichus and T. putitora. PMID:23341979
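The barcode-gap reasoning in this abstract (closest congeneric divergence about 3 times the mean conspecific divergence) can be expressed as a small check over labelled pairwise distances. The sketch below is a hedged illustration: the species labels and distance values are invented, the factor of 3 is used as a default only because the abstract reports that ratio, and the function name is hypothetical.

```python
from statistics import mean


def summarize_barcode_gap(pairs, ratio=3.0):
    """Summarize within- vs between-species divergence.

    pairs: iterable of (species_a, species_b, distance) tuples, one per
    pairwise sequence comparison. Returns a dict with the mean and maximum
    conspecific distance, the minimum interspecific distance, and whether
    the minimum interspecific distance is at least `ratio` times the mean
    conspecific distance.
    """
    within = [d for a, b, d in pairs if a == b]
    between = [d for a, b, d in pairs if a != b]
    if not within or not between:
        raise ValueError("need both conspecific and interspecific pairs")
    mean_within = mean(within)
    min_between = min(between)
    return {
        "mean_within": mean_within,
        "max_within": max(within),
        "min_between": min_between,
        "gap_present": min_between >= ratio * mean_within,
    }


# Invented distances for two hypothetical species, "sp1" and "sp2".
toy_pairs = [
    ("sp1", "sp1", 0.004),
    ("sp1", "sp1", 0.008),
    ("sp2", "sp2", 0.006),
    ("sp1", "sp2", 0.030),
]
summary = summarize_barcode_gap(toy_pairs)
print(summary)
```

Two named taxa whose mutual distances stay inside the conspecific range would fail this check, which mirrors the argument made above for treating T. progeneius as a synonym of T. putitora.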
Identification of Aspergillus sections Flavi, Nigri, and Fumigati and their differentiation using specific primers.

PubMed

Ashtiani, Nafiseh Mohebbi; Kachuei, Reza; Yalfani, Roozbeh; Harchegani, Asghar Beigi; Nosratabadi, Mohsen

2017-06-01

Aspergillus species are important in medicine, agriculture and various industries. The sections Fumigati, Flavi, and Nigri are the most important members of the Aspergillus genus. This study intended to identify and separate these three Aspergillus sections and to differentiate among them using specific primers. A bioinformatics study was initially performed to analyse the sequences of five genes, namely, beta-tubulin, calmodulin, the pre-rRNA processing protein Tsr1, the DNA-replication licensing factor Mcm7, and RNA polymerase II second largest subunit (RPB2) in the three Aspergillus sections using MEGA6 software and the NCBI database. Primers were designed to select genes for each of the Aspergillus sections being analysed. A total of 134 environmental and clinical Aspergillus isolates were obtained, purified and initially identified by colony morphology. Subsequently, DNA was extracted using the phenol-chloroform method, specific primers were synthesized, PCR was performed for DNA from all isolates, and the results were compared to morphological characteristics. Of the 134 isolates tested, 56 were Nigri, 32 were Fumigati, 32 were Flavi, and the rest (14 isolates) belonged to other sections. The beta-tubulin and calmodulin genes were found to be the most suitable for differentiating among these three groups; the beta-tubulin gene was used for molecular identification of Aspergillus section Fumigati, and the calmodulin gene for identifying sections Flavi and Nigri.
The Polish Genetic Database of Victims of Totalitarianisms.

PubMed

Ossowski, A; Kuś, M; Kupiec, T; Bykowska, M; Zielińska, G; Jasiński, M E; March, A L

2016-01-01

This paper describes the creation of the Polish Genetic Database of Victims of Totalitarianism and the first research conducted under this project. On September 28th 2012, the Pomeranian Medical University in Szczecin and the Institute of National Remembrance-Commission for Prosecution of Crimes against the Polish Nation agreed to support the creation of the Polish Genetic Database of Victims of Totalitarianism (PBGOT, www.pbgot.pl). The purpose was to employ state-of-the-art methods of forensic genetics to identify the remains of unidentified victims of Communist and Nazi totalitarian regimes. The database was designed to serve as a central repository of genetic information of the victim's DNA and that of the victim's nearest living relatives, with the goal of making a positive identification of the victim. Along the way, PBGOT encountered several challenges. First, extracting usable DNA samples from the remains of individuals who had been buried for over half a century required forensic geneticists to create special procedures and protocols. Second, obtaining genetic reference material and historical information from the victim's closest relatives was both problematic and urgent. The victim's nearest living relatives were part of a dying generation, and the opportunity to obtain the best genetic and historical information about the victims would soon die with them. For this undertaking, PBGOT assembled a team of historians, archaeologists, forensic anthropologists, and forensic geneticists from several European research institutions. The field work was divided into five broad categories: (1) exhumation of victim remains and storing their biological material for later genetic testing; (2) researching archives and historical data for a more complete profile of those killed or missing and the families that lost them; (3) locating the victim's nearest relatives to obtain genetic reference samples (swabs); (4) entering the genetic data from both victims and family members into a common database; (5) making a conclusive, final identification of the victim. PBGOT's first project was to identify victims of the Communist regime buried in hidden mass graves in the Powązki Military Cemetery in Warsaw. Throughout 2012 and 2013, PBGOT carried out archaeological exhumations in the Powązki Military Cemetery that resulted in the recovery of the skeletal remains of 194 victims in several mass graves. Of the 194 sets of remains, more than 50 victims have been successfully matched and identified through genetic evidence. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Forensic botany II, DNA barcode for land plants: Which markers after the international agreement?

PubMed

Ferri, G; Corradini, B; Ferrari, F; Santunione, A L; Palazzoli, F; Alu', M

2015-03-01

The ambitious idea of using a short piece of DNA for large-scale species identification (DNA barcoding) is already a powerful tool for scientists, and the application of this standard technique seems promising in a range of fields including forensic genetics. While DNA barcoding enjoyed remarkable success for animal identification through cytochrome c oxidase I (COI) analysis, attempts to identify a single barcode for plants remained a vain hope for a long time. From the beginning, the Consortium for the Barcode of Life (CBOL) showed a lack of agreement on a core plant barcode, reflecting the diversity of viewpoints. Different research groups advocated various markers with divergent sets of criteria until the recent publication by the CBOL-Plant Working Group. After a four-year effort, in 2009 the international team agreed on standard markers, promoting a multilocus solution (rbcL and matK) with 70-75% discrimination to the species level. In 2009 our group first proposed the broad application of DNA barcoding principles as a tool for identification of trace botanical evidence through the analysis of two chloroplast loci (trnH-psbA and trnL-trnF) in plant species belonging to local flora. Difficulties and drawbacks that were encountered included poor coverage of species in specific databases and the lack of authenticated reference sequences for the selected markers. Successful preliminary results were obtained, providing an approach to progressively identify unknown plant specimens to a given taxonomic rank, usable by any non-specialist botanist or in case of a shortage of taxonomic expertise. We now consider it mandatory to update and compare our previous findings with the newly selected plastid markers (matK+rbcL), taking into account forensic requirements.
Features of all the four loci (the two previously analyzed trnH-psbA+trnL-trnF and matK+rbcL) were compared singly and in multilocus solutions to assess the most suitable combination for forensic botany. Based on obtained results, we recommend the adoption of a two-locus combination with rbcL+trnH-psbA plastid markers, which currently best satisfies forensic needs for botanical species identification. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

Validation of a for anaerobic bacteria optimized MALDI-TOF MS biotyper database: The ENRIA project.

PubMed

Veloo, A C M; Jean-Pierre, H; Justesen, U S; Morris, T; Urban, E; Wybo, I; Kostrzewa, M; Friedrich, A W

2018-03-12

Within the ENRIA project, several 'expertise laboratories' collaborated in order to optimize the identification of clinical anaerobic isolates by using a widely available platform, the Biotyper Matrix Assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS). Main Spectral Profiles (MSPs) of well characterized anaerobic strains were added to one of the latest updates of the Biotyper database, db6903 (V6 database), for common use. MSPs of anaerobic strains nominated for addition to the Biotyper database are included in this validation. In this study, we validated the optimized database (db5989 [V5 database] + ENRIA MSPs) using 6309 anaerobic isolates. Using the V5 database, 71.1% of the isolates could be identified with high confidence, 16.9% with low confidence and 12.0% could not be identified. Including the MSPs added to the V6 database and all MSPs created within the ENRIA project, the number of strains identified with high confidence increased to 74.8% and 79.2%, respectively. Strains that could not be identified using MALDI-TOF MS decreased to 10.4% and 7.3%, respectively. The observed increase in high confidence identifications differed per genus. For Bilophila wadsworthia, Prevotella spp., gram-positive anaerobic cocci and other less commonly encountered species, more strains were identified with higher confidence. A subset of the non-identified strains (42.1%) were identified using 16S rDNA gene sequencing. The obtained identities demonstrated that strains could not be identified either due to the generation of spectra of insufficient quality or due to the fact that no MSP of the encountered species was present in the database. Undoubtedly, the ENRIA project has successfully increased the number of anaerobic isolates that can be identified with high confidence. We therefore recommend further expansion of the database to include less frequently isolated species as this would also allow us to gain valuable insight into the clinical relevance of these less common anaerobic bacteria. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Searching mixed DNA profiles directly against profile databases.

PubMed

Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John

2014-03-01

DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from a single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper, empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
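As a rough illustration of the search step described in this abstract, the sketch below ranks database profiles by their likelihood ratio and keeps those above a threshold as investigative leads. The profile identifiers, LR values, and threshold are hypothetical; computing the LR itself requires a continuous interpretation model that the abstract does not specify, so the LRs are taken here as given inputs.

```python
import math


def investigative_leads(lr_by_profile, log10_threshold):
    """Return (profile_id, log10 LR) pairs at or above the threshold, largest first."""
    leads = []
    for profile_id, lr in lr_by_profile.items():
        if lr <= 0:
            continue  # an LR of zero excludes the profile as a contributor
        log_lr = math.log10(lr)
        if log_lr >= log10_threshold:
            leads.append((profile_id, log_lr))
    return sorted(leads, key=lambda item: item[1], reverse=True)


# Hypothetical LRs for four database profiles compared against one mixture.
example_lrs = {"P001": 3.2e7, "P002": 0.0, "P003": 450.0, "P004": 1.5e4}
leads = investigative_leads(example_lrs, log10_threshold=4.0)
lead_ids = [profile_id for profile_id, _ in leads]
```

What counts as a "sufficiently large" LR is exactly the empirical question the paper addresses, so the threshold is left as a parameter rather than fixed.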
Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives

PubMed Central

Marjanović, Damir; Konjhodžić, Rijad; Butorac, Sara Sanela; Drobnič, Katja; Merkaš, Siniša; Lauc, Gordan; Primorac, Damir; Anđelinović, Šimun; Milosavljević, Mladen; Karan, Željko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vučetić Dragović, Anđelka; Kovačević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan

2011-01-01

The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensics DNA use. The state of forensic DNA analysis was also determined in the neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a ‘regional supplement’ to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations. PMID:21674821
Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives.

PubMed

Marjanović, Damir; Konjhodzić, Rijad; Butorac, Sara Sanela; Drobnic, Katja; Merkas, Sinisa; Lauc, Gordan; Primorac, Damir; Andjelinović, Simun; Milosavljević, Mladen; Karan, Zeljko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vucetić Dragović, Andjelka; Kovacević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan

2011-06-01

The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensics DNA use. The state of forensic DNA analysis was also determined in the neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a 'regional supplement' to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations.
28 CFR 28.11 - Definitions.

Code of Federal Regulations, 2010 CFR

2010-07-01

... Administration DEPARTMENT OF JUSTICE DNA IDENTIFICATION SYSTEM DNA Sample Collection, Analysis, and Indexing § 28.11 Definitions. DNA analysis means analysis of the deoxyribonucleic acid (DNA) identification information in a bodily sample. DNA sample means a tissue, fluid, or other bodily sample of an individual on...
28 CFR 28.11 - Definitions.

Code of Federal Regulations, 2011 CFR

2011-07-01

... Administration DEPARTMENT OF JUSTICE DNA IDENTIFICATION SYSTEM DNA Sample Collection, Analysis, and Indexing § 28.11 Definitions. DNA analysis means analysis of the deoxyribonucleic acid (DNA) identification information in a bodily sample. DNA sample means a tissue, fluid, or other bodily sample of an individual on...
28 CFR 28.11 - Definitions.

Code of Federal Regulations, 2013 CFR

2013-07-01

... Administration DEPARTMENT OF JUSTICE DNA IDENTIFICATION SYSTEM DNA Sample Collection, Analysis, and Indexing § 28.11 Definitions. DNA analysis means analysis of the deoxyribonucleic acid (DNA) identification information in a bodily sample. DNA sample means a tissue, fluid, or other bodily sample of an individual on...
28 CFR 28.11 - Definitions.

Code of Federal Regulations, 2012 CFR

2012-07-01

... Administration DEPARTMENT OF JUSTICE DNA IDENTIFICATION SYSTEM DNA Sample Collection, Analysis, and Indexing § 28.11 Definitions. DNA analysis means analysis of the deoxyribonucleic acid (DNA) identification information in a bodily sample. DNA sample means a tissue, fluid, or other bodily sample of an individual on...
28 CFR 28.11 - Definitions.

Code of Federal Regulations, 2014 CFR

2014-07-01

... Administration DEPARTMENT OF JUSTICE DNA IDENTIFICATION SYSTEM DNA Sample Collection, Analysis, and Indexing § 28.11 Definitions. DNA analysis means analysis of the deoxyribonucleic acid (DNA) identification information in a bodily sample. DNA sample means a tissue, fluid, or other bodily sample of an individual on...
Fastidious Gram-Negatives: Identification by the Vitek 2 Neisseria-Haemophilus Card and by Partial 16S rRNA Gene Sequencing Analysis.

PubMed

Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita

2010-12-31

Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification.
Fastidious Gram-Negatives: Identification by the Vitek 2 Neisseria-Haemophilus Card and by Partial 16S rRNA Gene Sequencing Analysis

PubMed Central

Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita

2010-01-01

Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave 'low discrimination', seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification. PMID:21347215
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

PubMed Central

Borodovsky, M; Rudd, K E; Koonin, E V

1994-01-01

The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
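The percentages quoted in this abstract follow directly from the reported counts. The short check below recomputes them; the variable names are illustrative only.

```python
# Counts as reported in the abstract.
genemark_orfs = 354   # putative expressed ORFs predicted by GeneMark
blast_orfs = 208      # ORFs with significant BLASTX/TBLASTN similarity
supported_by_both = 182
conserved_families = 73


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


share_of_genemark = pct(supported_by_both, genemark_orfs)   # 51.4
share_of_blast = pct(supported_by_both, blast_orfs)         # 87.5
share_conserved = pct(conserved_families, genemark_orfs)    # 20.6

# GeneMark predictions that lack a significant BLAST hit.
genemark_only = genemark_orfs - supported_by_both           # 172
```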
Species composition of the genus Saprolegnia in fin fish aquaculture environments, as determined by nucleotide sequence analysis of the nuclear rDNA ITS regions.

PubMed

de la Bastide, Paul Y; Leung, Wai Lam; Hintz, William E

2015-01-01

The ITS region of the rDNA gene was compared for Saprolegnia spp. in order to improve our understanding of nucleotide sequence variability within and between species of this genus, determine species composition in Canadian fin fish aquaculture facilities, and to assess the utility of ITS sequence variability in genetic marker development. From a collection of more than 400 field isolates, ITS region nucleotide sequences were studied and it was determined that there was sufficient consistent inter-specific variation to support the designation of species identity based on ITS sequence data. This non-subjective approach to species identification does not rely upon transient morphological features. Phylogenetic analyses comparing our ITS sequences and species designations with data from previous studies generally supported the clade scheme of Diéguez-Uribeondo et al. (2007) and found agreement with the molecular taxonomic cluster system of Sandoval-Sierra et al. (2014). Our Canadian ITS sequence collection will thus contribute to the public database and assist the clarification of Saprolegnia spp. taxonomy. The analysis of ITS region sequence variability facilitated genus- and species-level identification of unknown samples from aquaculture facilities and provided useful information on species composition. A unique ITS-RFLP for the identification of S. parasitica was also described. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
hPDI: a database of experimental human protein-DNA interactions.

PubMed

Xie, Zhi; Hu, Shaohui; Blackshaw, Seth; Zhu, Heng; Qian, Jiang

2010-01-15

The human protein DNA Interactome (hPDI) database holds experimental protein-DNA interaction data for humans identified by protein microarray assays. The unique characteristics of hPDI are that it contains consensus DNA-binding sequences not only for nearly 500 human transcription factors but also for >500 unconventional DNA-binding proteins, which were previously completely uncharacterized. Users can browse, search and download a subset or the entire data via a web interface. This database is freely accessible for any academic purposes. http://bioinfo.wilmer.jhu.edu/PDI/.
Identification of tissue-specific cell death using methylation patterns of circulating DNA

PubMed Central

Lehmann-Werman, Roni; Neiman, Daniel; Zemmour, Hai; Moss, Joshua; Magenheim, Judith; Vaknin-Dembinsky, Adi; Rubertsson, Sten; Nellgård, Bengt; Blennow, Kaj; Zetterberg, Henrik; Spalding, Kirsty; Haller, Michael J.; Wasserfall, Clive H.; Schatz, Desmond A.; Greenbaum, Carla J.; Dorrell, Craig; Grompe, Markus; Zick, Aviad; Hubert, Ayala; Maoz, Myriam; Fendrich, Volker; Bartsch, Detlef K.; Golan, Talia; Ben Sasson, Shmuel A.; Zamir, Gideon; Razin, Aharon; Cedar, Howard; Shapiro, A. M. James; Glaser, Benjamin; Shemer, Ruth; Dor, Yuval

2016-01-01

Minimally invasive detection of cell death could prove an invaluable resource in many physiologic and pathologic situations. Cell-free circulating DNA (cfDNA) released from dying cells is emerging as a diagnostic tool for monitoring cancer dynamics and graft failure. However, existing methods rely on differences in DNA sequences in source tissues, so that cell death cannot be identified in tissues with a normal genome. We developed a method of detecting tissue-specific cell death in humans based on tissue-specific methylation patterns in cfDNA. We interrogated tissue-specific methylome databases to identify cell type-specific DNA methylation signatures and developed a method to detect these signatures in mixed DNA samples. We isolated cfDNA from plasma or serum of donors, treated the cfDNA with bisulfite, PCR-amplified the cfDNA, and sequenced it to quantify cfDNA carrying the methylation markers of the cell type of interest. Pancreatic β-cell DNA was identified in the circulation of patients with recently diagnosed type-1 diabetes and islet-graft recipients; oligodendrocyte DNA was identified in patients with relapsing multiple sclerosis; neuronal/glial DNA was identified in patients after traumatic brain injury or cardiac arrest; and exocrine pancreas DNA was identified in patients with pancreatic cancer or pancreatitis. This proof-of-concept study demonstrates that the tissue origins of cfDNA and thus the rate of death of specific cell types can be determined in humans. The approach can be adapted to identify cfDNA derived from any cell type in the body, offering a minimally invasive window for diagnosing and monitoring a broad spectrum of human pathologies as well as providing a better understanding of normal tissue dynamics. PMID:26976580
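The quantification step can be pictured with a small sketch. Bisulfite treatment converts unmethylated cytosines so that they read as T after PCR, while methylated cytosines still read as C. The function below counts the fraction of reads in which every marker CpG is unmethylated. The read strings, CpG positions, and the "all sites unmethylated" criterion are assumptions made for illustration and are not taken from the paper.

```python
def fraction_unmethylated(reads, cpg_positions):
    """Fraction of informative reads whose marker CpG cytosines all read as T."""
    informative = 0
    unmethylated = 0
    for read in reads:
        calls = [read[pos] for pos in cpg_positions if pos < len(read)]
        # Skip reads that do not cover every site or show an unexpected base.
        if len(calls) != len(cpg_positions) or any(c not in "CT" for c in calls):
            continue
        informative += 1
        if all(c == "T" for c in calls):
            unmethylated += 1
    return unmethylated / informative if informative else 0.0


# Hypothetical bisulfite-converted reads covering two marker CpGs (indices 2 and 6).
reads = ["ATTGATTGA", "ATCGATCGA", "ATTGATCGA", "ATTGATTGA"]
frac = fraction_unmethylated(reads, cpg_positions=[2, 6])
```

For a marker that is methylated rather than unmethylated in the cell type of interest, the same counting would be applied with C in place of T.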
Automated forensic DNA purification optimized for FTA card punches and identifiler STR-based PCR analysis.

PubMed

Tack, Lois C; Thomas, Michelle; Reich, Karl

2007-03-01

Forensic labs globally face the same problem: a growing need to process a greater number and wider variety of samples for DNA analysis. The same forensic lab can be tasked all at once with processing mixed casework samples from crime scenes, convicted offender samples for database entry, and tissue from tsunami victims for identification. Besides flexibility in the robotic system chosen for forensic automation, there is a need, for each sample type, to develop new methodology that is not only faster but also more reliable than past procedures. FTA is a chemical treatment of paper, unique to Whatman Bioscience, and is used for the stabilization and storage of biological samples. Here, the authors describe optimization of the Whatman FTA Purification Kit protocol for use with the AmpFlSTR Identifiler PCR Amplification Kit.
Inclusiveness, Effectiveness and Intrusiveness: Issues in the Developing Uses of DNA Profiling in Support of Criminal Investigations

PubMed Central

2005-01-01

Précis The rapid implementation and continuing expansion of forensic DNA databases around the world has been supported by claims about their effectiveness in criminal investigations and challenged by assertions of the resulting intrusiveness into individual privacy. These two competing perspectives provide the basis for ongoing considerations about the categories of persons who should be subject to nonconsensual DNA sampling and profile retention as well as the uses to which such profiles should be put. This paper uses the example of the current arrangements for forensic DNA databasing in England & Wales to discuss the ways in which the legislative and operational basis for police DNA databasing is reliant upon continuous deliberations over these and other matters by a range of key stakeholders. We also assess the effects of the recent innovative use of DNA databasing for ‘familial searching’ in this jurisdiction in order to show how agreed understandings about the appropriate uses of DNA can become unsettled and reformulated even where their investigative effectiveness is uncontested. We conclude by making some observations about the future of what is recognised to be the largest forensic DNA database in the world. PMID:16240734
DNA barcoding of tuberous Orchidoideae: a resource for identification of orchids used in Salep.

PubMed

Ghorbani, Abdolbaset; Gravendeel, Barbara; Selliah, Sugirthini; Zarré, Shahin; de Boer, Hugo

2017-03-01

Tubers of terrestrial orchids are harvested and traded from the eastern Mediterranean to the Caspian Sea for the traditional product Salep. Overexploitation of wild populations and increased middle-class prosperity have escalated prices for Salep, causing overharvesting, depletion of native populations and providing an incentive to expand harvesting to untapped areas in Iran. Limited morphological distinctiveness among traded Salep tubers renders species identification impossible, making it difficult to establish which species are targeted and affected the most. In this study, a reference database of 490 nrITS, trnL-F spacer and matK sequences of 133 taxa was used to identify 150 individual tubers from 31 batches purchased in 12 cities in Iran to assess species diversity in commerce. The sequence reference database consisted of 211 nrITS, 158 trnL-F and 121 matK sequences, including 238 new sequences from collections made for this study. The markers enabled unambiguous species identification with tree-based methods for nrITS in 67% of the tested tubers, 58% for trnL-F and 59% for matK. Species in the genera Orchis (34%), Anacamptis (27%) and Dactylorhiza (19%) were the most common in Salep. Our study shows that all tuberous orchid species in this area are threatened by this trade, and further stresses the urgency of controlling illegal harvesting and cross-border trade of Salep tubers. © 2016 John Wiley & Sons Ltd.
DNA profiles, computer searches, and the Fourth Amendment.

PubMed

Kimel, Catherine W

2013-01-01

Pursuant to federal statutes and to laws in all fifty states, the United States government has assembled a database containing the DNA profiles of over eleven million citizens. Without judicial authorization, the government searches each of these profiles one-hundred thousand times every day, seeking to link database subjects to crimes they are not suspected of committing. Yet, courts and scholars that have addressed DNA databasing have focused their attention almost exclusively on the constitutionality of the government's seizure of the biological samples from which the profiles are generated. This Note fills a gap in the scholarship by examining the Fourth Amendment problems that arise when the government searches its vast DNA database. This Note argues that each attempt to match two DNA profiles constitutes a Fourth Amendment search because each attempted match infringes upon database subjects' expectations of privacy in their biological relationships and physical movements. The Note further argues that database searches are unreasonable as they are currently conducted, and it suggests an adaptation of computer-search procedures to remedy the constitutional deficiency.
[Medicinal plant DNA marker assisted breeding (Ⅱ) the assistant identification of SNPs assisted identification and breeding research of high yield Perilla frutescens new variety].

PubMed

Shen, Qi; Zhang, Dong; Sun, Wei; Zhang, Yu-Jun; Shang, Zhi-Wei; Chen, Shi-Lin

2017-05-01

Perilla frutescens is one of 60 food and medicine plants in the initial directory announced by the health ministry of China. With the development of the Perilla sector in recent years, the breeding and application of good varieties has become the main bottleneck of its development. This study combined systematic selection with a marker-assisted method to breed Perilla varieties. Through whole genome sequencing and consistency matching, mutation loci were annotated according to the genome data and compared with a database of common Perilla variants; 30 non-synonymous SNPs were finally selected as characteristic markers of Zhongyan Feishu No.1. These SNP markers were used as the selection standard for Perilla varieties. The resulting new Perilla variety, Zhongyan Feishu No.1, can be used for both leaf and seed, has high yield and high resistance, and can also be used as green fertilizer. Zhongyan Feishu No.1 obtained new plant variety identification from Beijing city, with identification number 2016054. Marker-assisted identification can guide the breeding of new plant varieties and can provide a new reference for the breeding of medicinal plants. Copyright© by the Chinese Pharmaceutical Association.

The use of forensic DNA analysis in humanitarian forensic action: The development of a set of international standards.

PubMed

Goodwin, William H

2017-09-01

DNA analysis was first applied to the identification of victims of armed conflicts and other situations of violence (ACOSV) in the mid-1990s, starting in South America and the Balkans. Argentina was the first country to establish a genetic database specifically developed to identify disappeared children. Following on from these programs the early 2000s marked major programs, using a largely DNA-led approach, identifying missing persons in the Balkans and following the attack on the World Trade Center in New York. These two identification programs significantly expanded the magnitude of events to which DNA analysis was used to help provide the identity of missing persons. Guidelines developed by Interpol (2014) [1] related to best practice for identification of human remains following DVI type scenarios have been widely disseminated around the forensic community; in numerous cases these guidelines have been adopted or incorporated into national guidelines/standards/practice. However, given the complexity of many humanitarian contexts in which forensic science is employed there is a lack of internationally accepted guidelines, related to these contexts, for authorities to reference. In response the Argentine government's Human Rights Division in the Ministry of Foreign Affairs and Worship (MREC) proposed that the United Nations (UN) should promote best practice in the use of forensic genetics in humanitarian forensic action: this was adopted by the UN in Resolutions A/HRC/RES/10/26 and A/HRC/RES/15/5. Following on from the adoption of the resolutions MREC has coordinated, with the support of the International Committee of the Red Cross (ICRC), the drafting of a set of guidelines (MREC, ICRC, 2014) [2], with input from national and international agencies. To date the guidelines have been presented to South America's MERCOSUR and the UN and have been disseminated to interested parties. Copyright © 2017 Elsevier B.V. All rights reserved.
FISH-BOL and seafood identification: geographically dispersed case studies reveal systemic market substitution across Canada.

PubMed

Hanner, Robert; Becker, Sven; Ivanova, Natalia V; Steinke, Dirk

2011-10-01

The Fish Barcode of Life campaign involves a broad international collaboration among scientists working to advance the identification of fishes using DNA barcodes. With over 25% of the world's known ichthyofauna currently profiled, forensic identification of seafood products is now feasible and is becoming routine. Driven by growing consumer interest in the food supply, investigative reporters from five different media establishments procured seafood samples (n = 254) from numerous retail establishments located among five Canadian metropolitan areas between 2008 and 2010. The specimens were sent to the Canadian Centre for DNA Barcoding for analysis. By integrating the results from these individual case studies in a summary analysis, we provide a broad perspective on seafood substitution across Canada. Barcodes were recovered from 93% of the samples (n = 236), and identified using the Barcode of Life Data Systems "species identification" engine (www.barcodinglife.org). A 99% sequence similarity threshold was employed as a conservative matching criterion for specimen identification to the species level. Comparing these results against the Canadian Food Inspection Agency's "Fish List", a guideline to interpreting "false, misleading or deceptive" names (as per s 27 of the Fish Inspection regulations), demonstrated that 41% of the samples were mislabeled. Most samples were readily identified; however, this was not true in all cases because some samples had no close match. Others were ambiguous due to limited barcode resolution (or imperfect taxonomy) observed within a few closely related species complexes. The latter cases did not significantly impact the results because even the partial resolution achieved was sufficient to demonstrate mislabeling. This work highlights the functional utility of barcoding for the identification of diverse market samples. It also demonstrates how barcoding serves as a bridge linking scientific nomenclature with approved market names, potentially empowering regulatory bodies to enforce labeling standards. By synchronizing taxonomic effort with sequencing effort and database curation, barcoding provides a molecular identification resource of service to applied forensics.
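A 99% similarity criterion of the kind mentioned in this abstract can be sketched as a percent-identity check over two aligned sequences. This is a simplified stand-in, not the algorithm used by the Barcode of Life Data Systems engine; the function names and the toy 200 bp sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns, skipping columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def species_level_match(query, reference, threshold=99.0):
    """True when the query meets the similarity threshold against the reference."""
    return percent_identity(query, reference) >= threshold


# Toy 200 bp reference with one and with three substitutions in the query.
reference = "ACGT" * 50
query_one_mismatch = "T" + reference[1:]        # 199/200 identical
query_three_mismatches = "TTT" + reference[3:]  # 197/200 identical

close = species_level_match(query_one_mismatch, reference)
distant = species_level_match(query_three_mismatches, reference)

# Recovery rate reported in the abstract: 236 barcodes from 254 samples.
recovery_pct = round(100.0 * 236 / 254)
```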
Biomedical Requirements for High Productivity Computing Systems

DTIC Science & Technology

2005-04-01

server at http://www.ncbi.nlm.nih.gov/BLAST/. There are many variants of BLAST, including: 1. BLASTN - Compares a DNA query to a DNA database. Searches ...database (3 reading frames from each strand of the DNA) searching. 4. TBLASTN - Compares a protein query to a DNA database, in the 6 possible...the molecular during this phase. After eliminating molecules that could not match the query, an atom-by-atom search for the molecules is conducted
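The "3 reading frames from each strand" mentioned in this excerpt give six possible translations of a DNA sequence. The sketch below generates them using the standard genetic code; it illustrates the idea only and is not how BLAST itself is implemented. Function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Reverse complement of an upper- or lower-case DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their protein translations."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
```

A protein query can then be compared against each of the six translated frames of every database sequence, which is the comparison the excerpt attributes to TBLASTN.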
Morphological identification and COI barcodes of adult flies help determine species identities of chironomid larvae (Diptera, Chironomidae).

PubMed

Failla, A J; Vasquez, A A; Hudson, P; Fujimoto, M; Ram, J L

2016-02-01

Establishing reliable methods for the identification of benthic chironomid communities is important due to their significant contribution to biomass, ecology and the aquatic food web. Immature larval specimens are more difficult to identify to species level by traditional morphological methods than their fully developed adult counterparts, and few keys are available to identify the larval species. In order to develop molecular criteria to identify species of chironomid larvae, larval and adult chironomids from Western Lake Erie were subjected to both molecular and morphological taxonomic analysis. Mitochondrial cytochrome c oxidase I (COI) barcode sequences of 33 adults that were identified to species level by morphological methods were grouped with COI sequences of 189 larvae in a neighbor-joining taxon-ID tree. Most of these larvae could be identified only to genus level by morphological taxonomy (only 22 of the 189 sequenced larvae could be identified to species level). The taxon-ID tree of larval sequences had 45 operational taxonomic units (OTUs, defined as clusters with >97% identity or individual sequences differing from nearest neighbors by >3%; supported by analysis of all larval pairwise differences), of which seven could be identified to species or 'species group' level by larval morphology. Reference sequences from the GenBank and BOLD databases assigned six larval OTUs with presumptive species level identifications and confirmed one previously assigned species level identification. Sequences from morphologically identified adults in the present study grouped with and further classified the identity of 13 larval OTUs. The use of morphological identification and subsequent DNA barcoding of adult chironomids proved to be beneficial in revealing possible species level identifications of larval specimens. Sequence data from this study also contribute to currently inadequate public databases relevant to the Great Lakes region, while the neighbor-joining analysis reported here describes the application and confirmation of a useful tool that can accelerate identification and bioassessment of chironomid communities.
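The OTU definition in this abstract (clusters with >97% identity) can be approximated with a simple distance-threshold grouping. The sketch below uses single-linkage clustering on the proportion of differing sites, which is a simplification swapped in for illustration: the study itself worked from a neighbor-joining tree. The specimen labels and sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_otus(seqs, max_distance=0.03):
    """Single-linkage clusters: specimens are joined when distance < max_distance."""
    ids = list(seqs)
    parent = {i: i for i in ids}

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if p_distance(seqs[a], seqs[b]) < max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical 100 bp barcodes: one adult, one near-identical larva, one divergent larva.
base = "ACGT" * 25
specimens = {
    "adult_A": base,
    "larva_1": "T" + base[1:],          # 1% different from adult_A
    "larva_2": "T" * 10 + base[10:],    # 8% different from adult_A
}
otus = cluster_otus(specimens)
```

In this toy case the larva that groups with the morphologically identified adult would inherit a presumptive species label, mirroring how adult barcodes were used to classify larval OTUs.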
Morphological identification and COI barcodes of adult flies help determine species identities of chironomid larvae (Diptera, Chironomidae)

USGS Publications Warehouse

Failla, Andrew Joseph; Vasquez, Adrian Amelio; Hudson, Patrick L.; Fujimoto, Masanori; Ram, Jeffrey L.

2016-01-01

Establishing reliable methods for the identification of benthic chironomid communities is important due to their significant contribution to biomass, ecology and the aquatic food web. Immature larval specimens are more difficult to identify to species level by traditional morphological methods than their fully developed adult counterparts, and few keys are available to identify the larval species. In order to develop molecular criteria to identify species of chironomid larvae, larval and adult chironomids from Western Lake Erie were subjected to both molecular and morphological taxonomic analysis. Mitochondrial cytochrome c oxidase I (COI) barcode sequences of 33 adults that were identified to species level by morphological methods were grouped with COI sequences of 189 larvae in a neighbor-joining taxon-ID tree. Most of these larvae could be identified only to genus level by morphological taxonomy (only 22 of the 189 sequenced larvae could be identified to species level). The taxon-ID tree of larval sequences had 45 operational taxonomic units (OTUs, defined as clusters with >97% identity or individual sequences differing from nearest neighbors by >3%; supported by analysis of all larval pairwise differences), of which seven could be identified to species or ‘species group’ level by larval morphology. Reference sequences from the GenBank and BOLD databases assigned six larval OTUs with presumptive species level identifications and confirmed one previously assigned species level identification. Sequences from morphologically identified adults in the present study grouped with and further classified the identity of 13 larval OTUs. The use of morphological identification and subsequent DNA barcoding of adult chironomids proved to be beneficial in revealing possible species level identifications of larval specimens. 
Sequence data from this study also contribute to currently inadequate public databases relevant to the Great Lakes region, while the neighbor-joining analysis reported here describes the application and confirmation of a useful tool that can accelerate identification and bioassessment of chironomid communities.
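The OTU definition in the abstract above (clusters with >97% identity, i.e. <3% divergence) can be illustrated with a minimal Python sketch. It assumes pre-aligned, equal-length COI sequences and uses single-linkage grouping on uncorrected pairwise differences; the study itself used a neighbor-joining taxon-ID tree, so the function names and the clustering strategy here are illustrative assumptions, not the authors' pipeline.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base.
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def cluster_otus(seqs: dict, max_divergence: float = 0.03) -> list:
    """Group sequence names into OTUs by single linkage (hypothetical helper).

    Two sequences are linked when they differ by less than max_divergence;
    a sequence farther than that from every other one forms its own OTU.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < max_divergence:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())
```

Single linkage can chain distant sequences together through intermediates, which is one reason a tree-based check such as the one described in the abstract is a useful complement.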
The DNA methylation profile of liver tumors in C3H mice and identification of differentially methylated regions involved in the regulation of tumorigenic genes.

PubMed

Matsushita, Junya; Okamura, Kazuyuki; Nakabayashi, Kazuhiko; Suzuki, Takehiro; Horibe, Yu; Kawai, Tomoko; Sakurai, Toshihiro; Yamashita, Satoshi; Higami, Yoshikazu; Ichihara, Gaku; Hata, Kenichiro; Nohara, Keiko

2018-03-22

C3H mice have been frequently used in cancer studies as animal models of spontaneous liver tumors and chemically induced hepatocellular carcinoma (HCC). Epigenetic modifications, including DNA methylation, are among pivotal control mechanisms of gene expression leading to carcinogenesis. Although information on somatic mutations in liver tumors of C3H mice is available, epigenetic aspects are yet to be clarified. We performed next generation sequencing-based analysis of DNA methylation and microarray analysis of gene expression to explore genes regulated by DNA methylation in spontaneous liver tumors of C3H mice. Overlaying these data, we selected cancer-related genes whose expressions are inversely correlated with DNA methylation levels in the associated differentially methylated regions (DMRs) located around transcription start sites (TSSs) (promoter DMRs). We further assessed mutuality of the selected genes for expression and DNA methylation in human HCC using the Cancer Genome Atlas (TCGA) database. We obtained data on genome-wide DNA methylation profiles in the normal and tumor livers of C3H mice. We identified promoter DMRs of genes which are reported to be related to cancer and whose expressions are inversely correlated with the DNA methylation, including Mst1r, Slpi and Extl1. The association between DNA methylation and gene expression was confirmed using a DNA methylation inhibitor 5-aza-2'-deoxycytidine (5-aza-dC) in Hepa1c1c7 cells and Hepa1-6 cells. Overexpression of Mst1r in Hepa1c1c7 cells illuminated a novel downstream pathway via IL-33 upregulation. Database search indicated that gene expressions of Mst1r and Slpi are upregulated and the TSS upstream regions are hypomethylated also in human HCC. These results suggest that DMRs, including those of Mst1r and Slpi, are involved in liver tumorigenesis in C3H mice, and also possibly in human HCC. Our study clarified genome wide DNA methylation landscape of C3H mice. 
The data provide useful information for further epigenetic studies of mouse models of HCC. The present study particularly proposed novel DNA methylation-regulated pathways for Mst1r and Slpi, which may be applied not only to mouse HCC but also to human HCC.
21 CFR 830.320 - Submission of unique device identification information.

Code of Federal Regulations, 2014 CFR

2014-04-01

... Identification Database § 830.320 Submission of unique device identification information. (a) Designation of... Unique Device Identification Database (GUDID) in a format that we can process, review, and archive...
DNA Barcoding of Metazoan Zooplankton Copepods from South Korea

PubMed Central

Ryu, Shi Hyun; Kim, Sang Ki; Lee, Jin Hee; Lim, Young Jin; Lee, Jimin; Jun, Jumin; Kwak, Myounghai; Lee, Young-Sup; Hwang, Jae-Sam; Venmathi Maran, Balu Alagar; Chang, Cheon Young; Kim, Il-Hoi; Hwang, Ui Wook

2016-01-01

Copepods, small aquatic crustaceans, are the most abundant metazoan zooplankton and outnumber every other group of multicellular animals on earth. Despite their ecological and biological importance in aquatic environments, their morphological plasticity, which stems from their varied lifestyles and their exceptional capacity to adapt to a variety of environments, has made species identification challenging, even for expert taxonomists. Molecular approaches to species identification have allowed rapid detection, discrimination, and identification of cryptic or sibling species based on DNA sequence data. We examined sequence variation of a partial mitochondrial cytochrome c oxidase I gene (COI) from 133 copepod individuals collected from the Korean Peninsula, in order to identify and discriminate 94 copepod species covering six copepod orders: Calanoida, Cyclopoida, Harpacticoida, Monstrilloida, Poecilostomatoida and Siphonostomatoida. The results showed a clear gap, with a ca. 20-fold difference between the average within-species sequence divergence (2.42%) and the average between-species sequence divergence (42.79%) in COI, suggesting the utility of this gene in delimiting copepod species. The COI barcoding data showed that each of the 94 copepod species could be clearly distinguished from the others, with only four exceptions: Mesocyclops dissimilis–Mesocyclops pehpeiensis (0.26% K2P distance) and Oithona davisae–Oithona similis (1.1%) in Cyclopoida, Ostrincola japonica–Pseudomyicola spinosus (1.5%) in Poecilostomatoida, and Hatschekia japonica–Caligus quadratus (5.2%) in Siphonostomatoida. These results strongly indicate that COI may be a useful tool for identifying copepod species and represent initial progress toward the construction of a comprehensive DNA barcode database for copepods inhabiting the Korean Peninsula. PMID:27383475
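The K2P (Kimura two-parameter) distances quoted in the abstract above follow the standard formula d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. A minimal Python sketch, assuming two pre-aligned sequences of equal length (the function name is hypothetical and the authors' software is not stated in the abstract):

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

Multiplying the result by 100 gives the percentage form used in the abstract (for example, 0.26%).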
Detection of misidentifications of species from the Burkholderia cepacia complex and description of a new member, the soil bacterium Burkholderia catarinensis sp. nov.

PubMed

Bach, Evelise; Sant'Anna, Fernando Hayashi; Magrich Dos Passos, João Frederico; Balsanelli, Eduardo; de Baura, Valter Antonio; Pedrosa, Fábio de Oliveira; de Souza, Emanuel Maltempi; Passaglia, Luciane Maria Pereira

2017-08-31

The correct identification of bacteria from the Burkholderia cepacia complex (Bcc) is crucial for epidemiological studies and treatment of cystic fibrosis infections. However, genome-based identification tools are revealing many controversial Bcc species assignments. The aim of this work is to re-examine the taxonomic position of the soil bacterium B. cepacia 89 through polyphasic and genomic approaches. recA and 16S rRNA gene sequence analysis positioned strain 89 inside the Bcc group. However, based on the divergence score of seven concatenated allele sequences, and values of average nucleotide identity, and digital DNA:DNA hybridization, our results suggest that strain 89 is different from other Bcc species formerly described. Thus, we propose to classify Burkholderia sp. 89 as the novel species Burkholderia catarinensis sp. nov. with strain 89T (=DSM 103188T = BR 10601T) as the type strain. Moreover, our results call the attention to some probable misidentifications of Bcc genomes at the National Center for Biotechnology Information database. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Comprehensive proteomic analysis of Penicillium verrucosum.

PubMed

Nöbauer, Katharina; Hummel, Karin; Mayrhofer, Corina; Ahrens, Maike; Setyabudi, Francis M C; Schmidt-Heydt, Markus; Eisenacher, Martin; Razzazi-Fazeli, Ebrahim

2017-05-01

Mass spectrometric identification of proteins in species lacking validated sequence information is a major problem in veterinary science. In the present study, we used ochratoxin A producing Penicillium verrucosum to identify and quantitatively analyze proteins of an organism with yet no protein information available. The work presented here aimed to provide a comprehensive protein identification of P. verrucosum using shotgun proteomics. We were able to identify 3631 proteins in an "ab initio" translated database from DNA sequences of P. verrucosum. Additionally, a sequential window acquisition of all theoretical fragment-ion spectra analysis was done to find differentially regulated proteins at two different time points of the growth curve. We compared the proteins at the beginning (day 3) and at the end of the log phase (day 12). © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The collation of forensic DNA case data into a multi-dimensional intelligence database.

PubMed

Walsh, S J; Moss, D S; Kliem, C; Vintiner, G M

2002-01-01

The primary aim of any DNA Database is to link individuals to unsolved offenses and unsolved offenses to each other via DNA profiling. This aim has been successfully realised during the operation of the New Zealand (NZ) DNA Databank over the past five years. The DNA Intelligence Project (DIP), a collaborative project involving NZ forensic and law enforcement agencies, interrogated the forensic case data held on the NZ DNA databank and collated it into a functional intelligence database. This database has been used to identify significant trends which direct Police and forensic personnel towards the most appropriate use of DNA technology. Intelligence is being provided in areas such as the level of usage of DNA techniques in criminal investigation, the relative success of crime scene samples and the geographical distribution of crimes. The DIP has broadened the dimensions of the information offered through the NZ DNA Databank and has furthered the understanding and investigative capability of both Police and forensic scientists. The outcomes of this research fit soundly with the current policies of 'intelligence led policing', which are being adopted by Police jurisdictions locally and overseas.
Potential for DNA-based identification of Great Lakes fauna: Match and mismatch between taxa inventories and DNA barcode libraries

EPA Science Inventory

DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to biotic condition assessment and non-native species early-detection monitoring. However, the abi...
21 CFR 830.340 - Voluntary submission of ancillary device identification information.

Code of Federal Regulations, 2014 CFR

2014-04-01

... Identification Database § 830.340 Voluntary submission of ancillary device identification information. (a) You may not submit any information to the Global Unique Device Identification Database (GUDID) other than...
Dictionary-driven prokaryotic gene finding

PubMed Central

Shibuya, Tetsuo; Rigoutsos, Isidore

2002-01-01

Gene identification, also known as gene finding or gene recognition, is among the important problems of molecular biology that have been receiving increasing attention with the advent of large scale sequencing projects. Previous strategies for solving this problem can be categorized into essentially two schools of thought: one school employs sequence composition statistics, whereas the other relies on database similarity searches. In this paper, we propose a new gene identification scheme that combines the best characteristics from each of these two schools. In particular, our method determines gene candidates among the ORFs that can be identified in a given DNA strand through the use of the Bio-Dictionary, a database of patterns that covers essentially all of the currently available sample of the natural protein sequence space. Our approach relies entirely on the use of redundant patterns as the agents on which the presence or absence of genes is predicated and does not employ any additional evidence, e.g. ribosome-binding site signals. The Bio-Dictionary Gene Finder (BDGF), the algorithm’s implementation, is a single computational engine able to handle the gene identification task across distinct archaeal and bacterial genomes. The engine exhibits performance that is characterized by simultaneous very high values of sensitivity and specificity, and a high percentage of correctly predicted start sites. Using a collection of patterns derived from an old (June 2000) release of the Swiss-Prot/TrEMBL database that contained 451 602 proteins and fragments, we demonstrate our method’s generality and capabilities through an extensive analysis of 17 complete archaeal and bacterial genomes. Examples of previously unreported genes are also shown and discussed in detail. PMID:12060689
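The candidate-generation step mentioned in the abstract above (enumerating the ORFs in a given DNA strand) can be sketched as follows. This is only an illustration of ORF enumeration under simplifying assumptions: a single strand, ATG as the only start codon, and the three standard stop codons. It does not reproduce the Bio-Dictionary pattern matching that BDGF uses to decide which ORFs are genes, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(strand: str, min_codons: int = 2) -> list:
    """Return (start, end) index pairs of ORFs in the three forward frames.

    Each ORF runs from the first ATG after the previous stop to the next
    in-frame stop codon; `end` is exclusive and includes the stop codon.
    `min_codons` counts codons before the stop, start codon included.
    """
    strand = strand.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(strand) - 2, 3):
            codon = strand[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

A real prokaryotic gene finder would also scan the reverse complement and consider alternative start codons such as GTG and TTG.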
Laser mass spectrometry for DNA fingerprinting for forensic applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, C.H.; Tang, K.; Taranenko, N.I.

The application of DNA fingerprinting has become very broad in forensic analysis, patient identification, diagnostic medicine, and wildlife poaching, since every individual's DNA structure is identical within all tissues of their body. DNA fingerprinting was initiated by the use of restriction fragment length polymorphisms (RFLP). In 1987, Nakamura et al. found that a variable number of tandem repeats (VNTR) often occurred in the alleles. The probability of different individuals having the same number of tandem repeats in several different alleles is very low. Thus, the identification of VNTR from genomic DNA became a very reliable method for identification of individuals. DNA fingerprinting is a reliable tool for forensic analysis. In DNA fingerprinting, knowledge of the sequence of tandem repeats and restriction endonuclease sites can provide the basis for identification. The major steps for conventional DNA fingerprinting include (1) specimen processing, (2) amplification of selected DNA segments by PCR, and (3) gel electrophoresis to do the final DNA analysis. In this work we propose to use laser desorption mass spectrometry for fast DNA fingerprinting. The process and advantages are discussed.
Genetic assessment of leech species from yak (Bos grunniens) in the tract of Northeast India.

PubMed

Chatterjee, Nilkantha; Dhar, Bishal; Bhattarcharya, Debasis; Deori, Sourabh; Doley, Juwar; Bam, Joken; Das, Pranab J; Bera, Asit K; Deb, Sitangshu M; Devi, Ningthoujam Neelima; Paul, Rajesh; Malvika, Sorokhaibam; Ghosh, Sankar Kumar

2018-01-01

Yak is an iconic symbol of Tibet and the high altitudes of Northeast India. It is highly cherished for its milk, meat, and skin. However, yaks suffer drastic changes in milk production, weight loss, etc., when infested by parasites. Among them, infestation by leeches is a serious problem in the Himalayan belt of Northeast India. The parasite feeds on blood externally or from body orifices such as the nasopharynx, mouth, and rectum. However, data on the leech species infesting yak in that region are limited because morphological identification is difficult owing to plasticity of the body and changes in shape and surface structure, which warrants molecular characterization of the leeches. This study is expected to aid proper identification of leech species infesting yak track and to help in inventorying leech species in Northeast India. Here, we used a combined approach of molecular markers and morphological parameters to identify leech species infesting yak. The DNA sequences of the COI barcode fragment and of 18S and 28S rDNA were analyzed for species identification. The generated sequences were subjected to similarity matching in a global database and analyzed further through Neighbour-Joining, K2P distance-based, and ML approaches. Among the three markers, only COI was successful in delineating species, whereas 18S and 28S failed to do so. Our study confirmed the presence of species from the genera Hirudinaria, Haemadipsa, and Whitmania, and of one species, Myxobdella annandalae, which has not been previously reported from this region.
Potential for DNA-based identification of Great Lakes fauna: match and mismatch between taxa inventories and DNA barcode libraries.

PubMed

Trebitz, Anett S; Hoffman, Joel C; Grant, George W; Billehus, Tyler M; Pilgrim, Erik M

2015-07-22

DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.
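The coverage figures in the abstract above (for example, 99% of fish species with barcodes) come down to comparing a species inventory against the set of species that have a reference barcode. A minimal Python sketch of that comparison, with a hypothetical function name and input layout that are not taken from the study:

```python
def barcode_coverage(inventory: dict, barcoded: set) -> dict:
    """Percentage of inventoried species per group that have a reference barcode.

    `inventory` maps a group name to the set of species names in that group;
    `barcoded` is the set of species names found in the reference library.
    Groups with no inventoried species are skipped.
    """
    return {
        group: 100.0 * len(species & barcoded) / len(species)
        for group, species in inventory.items()
        if species
    }
```

Name matching is the fragile part in practice: synonyms and spelling variants between an inventory and a barcode library need reconciling before a set intersection like this is meaningful.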
Potential for DNA-based identification of Great Lakes fauna: match and mismatch between taxa inventories and DNA barcode libraries

NASA Astrophysics Data System (ADS)

Trebitz, Anett S.; Hoffman, Joel C.; Grant, George W.; Billehus, Tyler M.; Pilgrim, Erik M.

2015-07-01

DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.
[Current status of DNA databases in the forensic field: new progress, new legal needs].

PubMed

Baeta, Miriam; Martínez-Jarreta, Begoña

2009-01-01

One of the most polemic issues regarding the use of deoxyribonucleic acid (DNA) in the legal sphere, refers to the creation of DNA databases. Until relatively recently, Spain did not have a law to support the establishment of a national DNA profile bank for forensic purposes, and preserve the fundamental rights of subjects whose data are archived therein. The regulatory law of police databases regarding identifiers obtained from DNA approved in 2007, covers this void in the Spanish legislation and responds to the incessant need to adapt the laws to continuous scientific and technological progress.
Disaster victim identification of military aircrew, 1945-2002.

PubMed

Smith, Adrian

2003-11-01

Aviation accident fatalities are characterized by substantial tissue disruption and fragmentation, limiting the usefulness of traditional identification methods. This study examines the success of disaster victim identification (DVI) in military aviation accident fatalities in the Australian Defense Force (ADF). Accident reports and autopsy records of aircrew fatalities during the period 1945-2002 were examined to identify difficulties experienced during the DVI process or injuries that would prevent identification of remains using non-DNA methods. The ADF had 301 aircraft fatalities sustained in 144 accidents during the period 1945-2002. The autopsy reports for 117 fatalities were reviewed (covering 73.7% of aircrew fatalities from 1960-2002). Of the 117 victims, 38 (32.4%) sustained injuries which were severe enough to prevent identification by traditional (non-DNA) comparative scientific DVI techniques of fingerprint and dental analysis. Many of the ADF fatalities who could not be positively identified in the past could be identified today through the use of DNA techniques. Successful DNA identification, however, depends on having a reference DNA profile. This paper recommends the establishment of a DNA repository to store reference blood samples to facilitate the identification of ADF aircrew remains without causing additional distress to family members.

DNA barcoding reveals a cryptic nemertean invasion in Atlantic and Mediterranean waters

NASA Astrophysics Data System (ADS)

Fernández-Álvarez, Fernando Ángel; Machordom, Annie

2013-09-01

For several groups, like nemerteans, morphology-based identification is a hard discipline, but DNA barcoding may help non-experts in the identification process. In this study, DNA barcoding is used to reveal the cryptic invasion of Pacific Cephalothrix cf. simula into Atlantic and Mediterranean coasts. Although DNA barcoding is a promising method for the identification of Nemertea, only 6 % of the known number of nemertean species is currently associated with a correct DNA barcode. Therefore, additional morphological and molecular studies are necessary to advance the utility of DNA barcoding in the characterisation of possible nemertean alien invasions.
Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.

PubMed

Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis

2017-01-01

Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that choice of the database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.
Benefits and challenges to using DNA-based identification methods: An example study of larval fish from nearshore areas of Lake Superior

EPA Science Inventory

DNA-based identification methods could increase the ability of aquatic resource managers to track patterns of invasive species, especially for taxa that are difficult to identify morphologically. Nonetheless, use of DNA-based identification methods in aquatic surveys is still unc...
i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets.

PubMed

Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S

2011-11-30

Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
The Israel DNA database--the establishment of a rapid, semi-automated analysis system.

PubMed

Zamir, Ashira; Dell'Ariccia-Carmon, Aviva; Zaken, Neomi; Oz, Carla

2012-03-01

The Israel Police DNA database, also known as IPDIS (Israel Police DNA Index System), has been operating since February 2007. During that time more than 135,000 reference samples have been uploaded and more than 2000 hits reported. We have developed an effective semi-automated system that includes two automated punchers, three liquid handler robots and four genetic analyzers. An inhouse LIMS program enables full tracking of every sample through the entire process of registration, pre-PCR handling, analysis of profiles, uploading to the database, hit reports and ultimately storage. The LIMS is also responsible for the future tracking of samples and their profiles to be expunged from the database according to the Israeli DNA legislation. The database is administered by an in-house developed software program, where reference and evidentiary profiles are uploaded, stored, searched and matched. The DNA database has proven to be an effective investigative tool which has gained the confidence of the Israeli public and on which the Israel National Police force has grown to rely. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
[Application of mtDNA polymorphism in species identification of sarcosaphagous insects].

PubMed

Li, Xiang; Cai, Ji-feng

2011-04-01

Species identification of sarcosaphagous insects is one of the important steps in forensic research based on the knowledge of entomology. Recent studies reveal that the application of molecular biology, especially the mtDNA sequences analysis, works well in the species identification of sarcosaphagous insects. The molecular biology characteristics, structures, polymorphism of mtDNA of sarcosaphagous insects, and the recent studies in species identification of sarcosaphagous insects are reviewed in this article.
Development of DNA-based Identification methods to track the ...

EPA Pesticide Factsheets

The ability to track the identity and abundance of larval fish, which are ubiquitous during spawning season, may lead to a greater understanding of fish species distributions in Great Lakes nearshore areas including early-detection of invasive fish species before they become established. However, larval fish are notoriously hard to identify using traditional morphological techniques. While DNA-based identification methods could increase the ability of aquatic resource managers to determine larval fish composition, use of these methods in aquatic surveys is still uncommon and presents many challenges. In response to this need, we have been working with the U. S. Fish and Wildlife Service to develop field and laboratory methods to facilitate the identification of larval fish using DNA-meta-barcoding. In 2012, we initiated a pilot-project to develop a workflow for conducting DNA-based identification, and compared the species composition at sites within the St. Louis River Estuary of Lake Superior using traditional identification versus DNA meta-barcoding. In 2013, we extended this research to conduct DNA-identification of fish larvae collected from multiple nearshore areas of the Great Lakes by the USFWS. The species composition of larval fish generally mirrored that of fish species known from the same areas, but was influenced by the timing and intensity of sampling. Results indicate that DNA-based identification needs only very low levels of biomass to detect pre
Molecular species identification with rich floristic sampling: DNA barcoding the pteridophyte flora of Japan.

PubMed

Ebihara, Atsushi; Nitta, Joel H; Ito, Motomi

2010-12-08

DNA barcoding is expected to be an effective identification tool for organisms with heteromorphic generations such as pteridophytes, which possess a morphologically simple gametophyte generation. Although a reference data set including complete coverage of the target local flora/fauna is necessary for accurate identification, DNA barcode studies including such rich taxonomic sampling on a countrywide scale are lacking. The Japanese pteridophyte flora (733 taxa including subspecies and varieties) was used to test the utility of two plastid DNA barcode regions (rbcL and trnH-psbA) with the intention of developing an identification system for native gametophytes. DNA sequences were obtained from each of 689 (94.0%) taxa for rbcL and 617 (84.2%) taxa for trnH-psbA. Mean interspecific divergence values across all taxon pairs (K2P genetic distances) did not reveal a significant difference in rate between trnH-psbA and rbcL, but mean K2P distances of each genus showed significant heterogeneity according to systematic position. The minimum fail rate of taxon discrimination in an identification test using BLAST (12.52%) was obtained when rbcL and trnH-psbA were combined, and became lower in datasets excluding infraspecific taxa or apogamous taxa, or including sexual diploids only. This study demonstrates the overall effectiveness of DNA barcodes for species identification in the Japanese pteridophyte flora. Although this flora is characterized by a high occurrence of apogamous taxa that pose a serious challenge to identification using DNA barcodes, such taxa are limited to a small number of genera, and only minimally detract from the overall success rate. In the case that a query sequence is matched to a known apogamous genus, routine species identification may not be possible. Otherwise, DNA barcoding is a practical tool for identification of most Japanese pteridophytes, and is especially anticipated to be helpful for identification of non-hybridizing gametophytes.
Forensic identification of CITES protected slimming cactus (Hoodia) using DNA barcoding.

PubMed

Gathier, Gerard; van der Niet, Timotheus; Peelen, Tamara; van Vugt, Rogier R; Eurlings, Marcel C M; Gravendeel, Barbara

2013-11-01

Slimming cactus (Hoodia), found only in southwestern Africa, is a well-known herbal product for losing weight. Consequently, Hoodia extracts are sought-after worldwide despite a CITES Appendix II status. The failure to eradicate illegal trade is due to problems with detecting and identifying Hoodia using morphological and chemical characters. Our aim was to evaluate the potential of molecular identification of Hoodia based on DNA barcoding. Screening of nrITS1 and psbA-trnH DNA sequences from 26 accessions of Ceropegieae resulted in successful identification, while conventional chemical profiling using DLI-MS led to inaccurate detection and identification of Hoodia. The presence of Hoodia in herbal products was also successfully established using DNA sequences. A validation procedure of our DNA barcoding protocol demonstrated its robustness to changes in PCR conditions. We conclude that DNA barcoding is an effective tool for Hoodia detection and identification which can contribute to preventing illegal trade. © 2013 American Academy of Forensic Sciences.
Genetic characterization and phylogenetic analysis of Eimeria arloingi in Iranian native kids.

PubMed

Khodakaram-Tafti, A; Hashemnia, M; Razavi, S M; Sharifiyazdi, H; Nazifi, S

2013-09-01

Among the 16 species of Eimeria from goats, Eimeria arloingi and Eimeria ninakohlyakimovae are regarded as the most pathogenic species in the world and cause clinical caprine coccidiosis. E. arloingi is known to be an important cause of coccidiosis in Iranian kids. Molecular analyses of two portions of nuclear ribosomal DNA (internal transcribed spacer 1 (ITS1) and 18S rDNA) were used for the genetic characterization of E. arloingi. Comparison of the sequencing data of E. arloingi obtained in the present study (ITS1: KC507793 and 18S rDNA: KC507792) with other Eimeria species in the GenBank database revealed a particularly close relationship between E. arloingi and Eimeria spp. from cattle and sheep. The phylogram based on the ITS1 sequences shows that E. arloingi, Eimeria bovis, and Eimeria zuernii formed a distinct group separate from the remaining Eimeria spp. in cattle and poultry. In pairwise alignment, the 18S rDNA sequence derived from E. arloingi showed 99% similarity to Eimeria ahsata, with differences observed at only three nucleotides. This study showed that the ITS1 region and the 18S rDNA gene are useful genetic markers for the specific identification and differentiation of Eimeria spp. in ruminants.
iDBPs: a web server for the identification of DNA binding proteins.

PubMed

Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir

2010-03-01

The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. http://idbps.tau.ac.il/
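The area under the ROC curve reported above (0.90) summarizes how well classifier scores rank DNA-binding proteins above non-binders. The following is a minimal sketch of how such a value can be computed from scores, using the rank interpretation of AUC. It is not the iDBPs code; the function name `roc_auc` and the label convention (1 for binder, 0 for non-binder) are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative; ties count as 0.5.
    """
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch; a sort-based rank computation would be the usual choice for large test sets.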
Policy implications for familial searching

PubMed Central

2011-01-01

In the United States, several states have made policy decisions regarding whether and how to use familial searching of the Combined DNA Index System (CODIS) database in criminal investigations. Familial searching pushes DNA typing beyond merely identifying individuals to detecting genetic relatedness, an application previously reserved for missing persons identifications and custody battles. The intentional search of CODIS for partial matches to an item of evidence offers law enforcement agencies a powerful tool for developing investigative leads, apprehending criminals, revitalizing cold cases and exonerating wrongfully convicted individuals. As familial searching involves a range of logistical, social, ethical and legal considerations, states are now grappling with policy options for implementing familial searching to balance crime fighting with its potential impact on society. When developing policies for familial searching, legislators should take into account the impact of familial searching on select populations and the need to minimize personal intrusion on relatives of individuals in the DNA database. This review describes the approaches used to narrow a suspect pool from a partial match search of CODIS and summarizes the economic, ethical, logistical and political challenges of implementing familial searching. We examine particular US state policies and the policy options adopted to address these issues. The aim of this review is to provide objective background information on the controversial approach of familial searching to inform policy decisions in this area. Herein we highlight key policy options and recommendations regarding effective utilization of familial searching that minimize harm to and afford maximum protection of US citizens. PMID:22040348
Policy implications for familial searching.

PubMed

Kim, Joyce; Mammo, Danny; Siegel, Marni B; Katsanis, Sara H

2011-11-01

In the United States, several states have made policy decisions regarding whether and how to use familial searching of the Combined DNA Index System (CODIS) database in criminal investigations. Familial searching pushes DNA typing beyond merely identifying individuals to detecting genetic relatedness, an application previously reserved for missing persons identifications and custody battles. The intentional search of CODIS for partial matches to an item of evidence offers law enforcement agencies a powerful tool for developing investigative leads, apprehending criminals, revitalizing cold cases and exonerating wrongfully convicted individuals. As familial searching involves a range of logistical, social, ethical and legal considerations, states are now grappling with policy options for implementing familial searching to balance crime fighting with its potential impact on society. When developing policies for familial searching, legislators should take into account the impact of familial searching on select populations and the need to minimize personal intrusion on relatives of individuals in the DNA database. This review describes the approaches used to narrow a suspect pool from a partial match search of CODIS and summarizes the economic, ethical, logistical and political challenges of implementing familial searching. We examine particular US state policies and the policy options adopted to address these issues. The aim of this review is to provide objective background information on the controversial approach of familial searching to inform policy decisions in this area. Herein we highlight key policy options and recommendations regarding effective utilization of familial searching that minimize harm to and afford maximum protection of US citizens.
Abnormal DNA methylation may contribute to the progression of osteosarcoma.

PubMed

Chen, Xiao-Gang; Ma, Liang; Xu, Jia-Xin

2018-01-01

The identification of optimal methylation biomarkers to achieve maximum diagnostic ability remains a challenge. The present study aimed to elucidate the potential molecular mechanisms underlying osteosarcoma (OS) using DNA methylation analysis. Based on the GSE36002 dataset obtained from the Gene Expression Omnibus database, differentially methylated genes were extracted between patients with OS and controls using t‑tests. Subsequently, hierarchical clustering was performed to segregate the samples into two distinct clusters, OS and normal. Gene Ontology (GO) and pathway enrichment analyses for differentially methylated genes were performed using the Database for Annotation, Visualization and Integrated Discovery tool. A protein‑protein interaction (PPI) network was established, followed by hub gene identification. Using the cut‑off threshold of ≥0.2 average β‑value difference, 3,725 unique CpGs (2,862 genes) were identified to be differentially methylated between the OS and normal groups. Among these 2,862 genes, 510 genes were differentially hypermethylated and 2,352 were differentially hypomethylated. The differentially hypermethylated genes were primarily involved in 20 GO terms, and the top 3 terms were associated with potassium ion transport. For differentially hypomethylated genes, GO functions principally included passive transmembrane transporter activity, channel activity and metal ion transmembrane transporter activity. In addition, a total of 10 significant pathways were enriched by differentially hypomethylated genes; notably, neuroactive ligand‑receptor interaction was the most significant pathway. Based on a connectivity degree >90, 7 hub genes were selected from the PPI network, including neuromedin U (NMU; degree=103) and NMU receptor 1 (NMUR1; degree=103). Functional terms (potassium ion transport, transmembrane transporter activity, and neuroactive ligand‑receptor interaction) and hub genes (NMU and NMUR1) may serve as potential targets for the treatment and diagnosis of OS.
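The cut-off described in this abstract (an average β-value difference of at least 0.2 between groups) can be illustrated with a short sketch. This is only an assumption-laden outline: the function name `differentially_methylated`, the dictionary layout, and the probe identifiers used in any example are hypothetical, and the t-test step that the study also applied is deliberately omitted.

```python
from statistics import mean


def differentially_methylated(beta_case, beta_control, min_diff=0.2):
    """Return {cpg_id: delta} for CpGs whose mean beta-value differs
    between groups by at least min_diff.

    beta_case and beta_control map a CpG identifier to a list of
    per-sample beta-values (each between 0 and 1). A positive delta
    means hypermethylation in the case group, a negative one
    hypomethylation.
    """
    selected = {}
    # Only CpGs measured in both groups can be compared
    for cpg in beta_case.keys() & beta_control.keys():
        delta = mean(beta_case[cpg]) - mean(beta_control[cpg])
        if abs(delta) >= min_diff:
            selected[cpg] = delta
    return selected
```

Splitting the returned dictionary by the sign of `delta` gives separate hypermethylated and hypomethylated lists, which is how the 510 versus 2,352 gene counts in the abstract would be partitioned once CpGs are mapped to genes.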
Using Informatics-, Bioinformatics- and Genomics-Based Approaches for the Molecular Surveillance and Detection of Biothreat Agents

NASA Astrophysics Data System (ADS)

Seto, Donald

The convergence and wealth of informatics, bioinformatics and genomics methods and associated resources allow a comprehensive and rapid approach for the surveillance and detection of bacterial and viral organisms. Coupled with the continuing race for the fastest, most cost-efficient and highest-quality DNA sequencing technology, that is, "next generation sequencing", the detection of biological threat agents by 'cheaper and faster' means is possible. With the application of improved bioinformatic tools for the understanding of these genomes and for parsing unique pathogen genome signatures, along with 'state-of-the-art' informatics which include faster computational methods, equipment and databases, it is feasible to apply new algorithms to biothreat agent detection. Two such methods are high-throughput DNA sequencing-based and resequencing microarray-based identification. These are illustrated and validated by two examples involving human adenoviruses, both from real-world test beds.
Does filler database size influence identification accuracy?

PubMed

Bergold, Amanda N; Heaton, Paul

2018-06-01

Police departments increasingly use large photo databases to select lineup fillers using facial recognition software, but this technological shift's implications have been largely unexplored in eyewitness research. Database use, particularly if coupled with facial matching software, could enable lineup constructors to increase filler-suspect similarity and thus enhance eyewitness accuracy (Fitzgerald, Oriet, Price, & Charman, 2013). However, with a large pool of potential fillers, such technologies might theoretically produce lineup fillers too similar to the suspect (Fitzgerald, Oriet, & Price, 2015; Luus & Wells, 1991; Wells, Rydell, & Seelau, 1993). This research proposes a new factor, filler database size, as a lineup feature affecting eyewitness accuracy. In a facial recognition experiment, we select lineup fillers in a legally realistic manner using facial matching software applied to filler databases of 5,000, 25,000, and 125,000 photos, and find that larger databases are associated with a higher objective similarity rating between suspects and fillers and lower overall identification accuracy. In target-present lineups, witnesses viewing lineups created from the larger databases were less likely to make correct identifications and more likely to select known innocent fillers. When the target was absent, larger database size was associated with a lower rate of correct rejections and a higher rate of filler identifications. Higher algorithmic similarity ratings were also associated with decreases in eyewitness identification accuracy. The results suggest that using facial matching software to select fillers from large photograph databases may reduce identification accuracy, and provide support for filler database size as a meaningful system variable. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
The real maccoyii: identifying tuna sushi with DNA barcodes--contrasting characteristic attributes and genetic distances.

PubMed

Lowenstein, Jacob H; Amato, George; Kolokotronis, Sergios-Orestis

2009-11-18

The use of DNA barcodes for the identification of described species is one of the least controversial and most promising applications of barcoding. There is no consensus, however, as to what constitutes an appropriate identification standard, and most barcoding efforts simply attempt to pair a query sequence with reference sequences and deem identification successful if it falls within the bounds of some pre-established cutoffs using genetic distance. Since the Renaissance, however, most biological classification schemes have relied on the use of diagnostic characters to identify and place species. Here we developed a cytochrome c oxidase subunit I character-based key for the identification of all tuna species of the genus Thunnus, and compared its performance with distance-based measures for identification of 68 samples of tuna sushi purchased from 31 restaurants in Manhattan (New York City) and Denver, Colorado. Both the character-based key and GenBank BLAST successfully identified 100% of the tuna samples, while the Barcode of Life Database (BOLD), genetic distance thresholds, and neighbor-joining phylogenetic tree building performed poorly in terms of species identification. A piece of tuna sushi has the potential to be an endangered species, a fraud, or a health hazard. All three of these cases were uncovered in this study. Nineteen restaurant establishments were unable to clarify or misrepresented what species they sold. Five out of nine samples sold as a variant of "white tuna" were not albacore (T. alalunga), but escolar (Lepidocybium flavobrunneum), a gempylid species banned for sale in Italy and Japan due to health concerns. Nineteen samples were northern bluefin tuna (T. thynnus) or the critically endangered southern bluefin tuna (T. maccoyii), though nine restaurants that sold these species did not name them on their menus. The Convention on International Trade in Endangered Species (CITES) requires that listed species must be identifiable in trade. This research fulfills this requirement for tuna, and supports the nomination of northern bluefin tuna for CITES listing in 2010.
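The contrast between a character-based key and distance cutoffs can be made concrete with a toy sketch. It assumes one aligned reference sequence per species and treats a character state as diagnostic when exactly one species carries that base at that position; the function names `build_key` and `identify` are hypothetical, and the published key for Thunnus was built from real COI data with its own rules, which are not reproduced here.

```python
def build_key(references):
    """Build a simple diagnostic-character key.

    references maps species name -> aligned sequence (all equal length).
    Returns {species: {position: base}} holding every character state
    that occurs in exactly one species at that position.
    """
    length = len(next(iter(references.values())))
    key = {species: {} for species in references}
    for pos in range(length):
        carriers = {}
        for species, seq in references.items():
            carriers.setdefault(seq[pos], []).append(species)
        for base, species_list in carriers.items():
            if len(species_list) == 1:  # state unique to one species
                key[species_list[0]][pos] = base
    return key


def identify(query, key):
    """Return the species whose diagnostic characters all occur in query.

    Species without any diagnostic character are never reported.
    """
    return [
        species
        for species, chars in key.items()
        if chars and all(query[pos] == base for pos, base in chars.items())
    ]
```

Unlike a distance threshold, this approach either finds the diagnostic states or it does not, so a query cannot be assigned merely because it is "close enough" to a reference. A realistic key would draw on many individuals per species and allow combinations of characters.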
[Principles for molecular identification of traditional Chinese materia medica using DNA barcoding].

PubMed

Chen, Shi-Lin; Yao, Hui; Han, Jian-Ping; Xin, Tian-Yi; Pang, Xiao-Hui; Shi, Lin-Chun; Luo, Kun; Song, Jing-Yuan; Hou, Dian-Yun; Shi, Shang-Mei; Qian, Zhong-Zhi

2013-01-01

Because research on the molecular identification of Chinese Materia Medica (CMM) using DNA barcodes is developing rapidly and gaining wide use, the principle of this method has been approved for listing in the Supplement of the Pharmacopoeia of the People's Republic of China. Based on studies of comprehensive samples, DNA barcoding systems have been established to identify CMM: ITS2 as the core barcode with psbA-trnH as a complementary locus for plant-derived medicines, and COI as the core barcode with ITS2 as a complementary locus for animal-derived medicines. This article introduces the principle of molecular identification of CMM using DNA barcoding and the drafting notes for it, and discusses its prospects for application.
CmMDb: a versatile database for Cucumis melo microsatellite markers and other horticulture crop research.

PubMed

Bhawna; Chaduvula, Pavan K; Bonthala, Venkata S; Manjusha, Verma; Siddiq, Ebrahimali A; Polumetla, Ananda K; Prasad, Gajula M N V

2015-01-01

Cucumis melo L., a member of the Cucurbitaceae family, ranks among the highest-valued horticultural crops cultivated across the globe. Besides its economic and medicinal importance, Cucumis melo L. is a valuable resource and model system for evolutionary studies of the cucurbit family. However, only a limited number of molecular markers have been reported for Cucumis melo L. so far, which limits the pace of functional genomic research in melon and similar horticultural crops. We developed the first whole-genome-based microsatellite DNA marker database of Cucumis melo L., a comprehensive web resource that aids in variety identification and physical mapping in the Cucurbitaceae family. The Cucumis melo L. microsatellite database (CmMDb: http://65.181.125.102/cmmdb2/index.html) encompasses 39,072 SSR markers along with their motif repeat, motif length, motif sequence, marker ID, motif type and chromosomal location. The database features a novel automated primer-design facility to meet the needs of wet-lab researchers. CmMDb is a freely available web resource that helps researchers select the most appropriate markers for marker-assisted selection in melons and improve breeding strategies.
Mugshot Identification Database (MID)

National Institute of Standards and Technology Data Gateway

NIST Mugshot Identification Database (MID) (Web, free access) NIST Special Database 18 is being distributed for use in development and testing of automated mugshot identification systems. The database consists of three CD-ROMs, containing a total of 3248 images of variable size using lossless compression. A newer version of the compression/decompression software on the CDROM can be found at the website http://www.nist.gov/itl/iad/ig/nigos.cfm as part of the NBIS package.

Construction of a cDNA library from female adult of Toxocara canis, and analysis of EST and immune-related genes expressions.

PubMed

Zhou, Rongqiong; Xia, Qingyou; Huang, Hancheng; Lai, Min; Wang, Zhenxin

2011-10-01

Toxocara canis is a widespread intestinal nematode parasite of dogs, which can also cause disease in humans. We employed an expressed sequence tag (EST) strategy to study gene expression related to development, digestion and reproduction in T. canis. ESTs provide a rapid way to identify genes, particularly in organisms for which very little molecular information is available. In this study, a cDNA library was constructed from a female adult of T. canis, and 215 high-quality ESTs from the 5'-ends of the cDNA clones, representing 79 unigenes, were obtained. The titer of the primary cDNA library was 1.83×10^6 pfu/mL with a recombination rate of 99.33%. Most of the sequences ranged from 300 to 900 bp, with an average length of 656 bp. Cluster analysis of these ESTs allowed identification of 79 unique sequences comprising 28 contigs and 51 singletons. BLASTX searches revealed that 18 unigenes (22.78% of the total) or 70 ESTs (32.56% of the total) were novel genes that had no significant matches to any protein sequences in the public databases. The remaining 61 unigenes (77.22% of the total) or 145 ESTs (67.44% of the total) closely matched known genes or sequences deposited in the public databases. These genes were classified into seven groups based on their known or putative biological functions. We also confirmed the gene expression patterns of several immune-related genes using RT-PCR. This work will provide a valuable resource for further investigations of stage-, sex- and tissue-specific gene transcription or expression. Copyright © 2011. Published by Elsevier Inc.
The Web-Based DNA Vaccine Database DNAVaxDB and Its Usage for Rational DNA Vaccine Design.

PubMed

Racz, Rebecca; He, Yongqun

2016-01-01

A DNA vaccine is a vaccine that uses a mammalian expression vector to express one or more protein antigens and is administered in vivo to induce an adaptive immune response. Since the 1990s, a significant amount of research has been performed on DNA vaccines and the mechanisms behind them. To meet the needs of the DNA vaccine research community, we created DNAVaxDB (http://www.violinet.org/dnavaxdb), the first Web-based database and analysis resource of experimentally verified DNA vaccines. All the data in DNAVaxDB, which include plasmids, antigens, vaccines, and sources, are manually curated and experimentally verified. This chapter describes the DNAVaxDB system in detail and shows how the DNA vaccine database, combined with the Vaxign vaccine design tool, can be used for rational design of a DNA vaccine against a pathogen, such as Mycobacterium bovis.
Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

PubMed

Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

2017-06-26

Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region of dog mitochondrial DNA from GenBank in order to screen for and correct the inconsistencies. It also supports users in detecting novel mutation profiles and assigning new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient haplotype information or lacked it entirely and were corrected or modified before being stored in the CHD. The CHD can be accessed at http://chd.vnbiology.com. It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detecting inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and reconciling existing annotation of HV1 582 bp sequences.
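Grouping database entries into distinct sequences and describing each by its mutation profile, as outlined in this abstract, might look like the sketch below. It assumes every entry is already aligned to a reference of the same length and records substitutions only; the function names `mutation_profile` and `group_by_profile` and the accession labels in any example are hypothetical, and the CHD's actual handling of indels and ambiguous bases is not described in the abstract.

```python
from collections import defaultdict


def mutation_profile(reference, sequence):
    """Substitutions relative to the reference as '<pos><ref>><alt>' strings.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequence must be aligned to the reference")
    return tuple(
        f"{i}{ref_base}>{alt_base}"
        for i, (ref_base, alt_base) in enumerate(zip(reference, sequence), start=1)
        if ref_base != alt_base
    )


def group_by_profile(reference, entries):
    """Group accession IDs that share an identical mutation profile.

    entries maps accession ID -> aligned sequence. Returns a dict
    mapping each profile (a tuple of substitutions) to its accessions.
    """
    groups = defaultdict(list)
    for accession, sequence in entries.items():
        groups[mutation_profile(reference, sequence)].append(accession)
    return dict(groups)
```

Each key of the result corresponds to one distinct sequence; comparing those keys against a table of already named haplotypes would separate known haplotypes from new candidates.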
DNA analysis in Disaster Victim Identification.

PubMed

Montelius, Kerstin; Lindblom, Bertil

2012-06-01

DNA profiling and matching is one of the primary methods to identify missing persons in a disaster, as defined by the Interpol Disaster Victim Identification Guide. The process to identify a victim by DNA includes: the collection of the best possible ante-mortem (AM) samples, the choice of post-mortem (PM) samples, DNA-analysis, matching and statistical weighting of the genetic relationship or match. Each disaster has its own scenario, and each scenario defines its own methods for identification of the deceased.
DNA barcode-based molecular identification system for fish species.

PubMed

Kim, Sungmin; Eo, Hae-Seok; Koo, Hyeyoung; Choi, Jun-Kil; Kim, Won

2010-12-01

In this study, we applied DNA barcoding to identify species using short DNA sequence analysis. We examined the utility of DNA barcoding by identifying 53 Korean freshwater fish species, 233 other freshwater fish species, and 1339 saltwater fish species. We successfully developed a web-based molecular identification system for fish (MISF) using a profile hidden Markov model. MISF facilitates efficient and reliable species identification, overcoming the limitations of conventional taxonomic approaches. MISF is freely accessible at http://bioinfosys.snu.ac.kr:8080/MISF/misf.jsp .
Keys and the crisis in taxonomy: extinction or reinvention?

PubMed

Walter, David Evans; Winterton, Shaun

2007-01-01

Dichotomous keys that follow a single pathway of character state choices to an end point have been the primary tools for the identification of unknown organisms for more than two centuries. However, a revolution in computer diagnostics is now under way that may result in the replacement of traditional keys by matrix-based computer interactive keys that have many paths to a correct identification and make extensive use of hypertext to link to images, glossaries, and other support material. Progress is also being made on replacing keys entirely by optical matching of specimens to digital databases and DNA sequences. These new tools may go some way toward alleviating the taxonomic impediment to biodiversity studies and other ecological and evolutionary research, especially with better coordination between those who produce keys and those who use them and by integrating interactive keys into larger biological Web sites.
Cryptic diversity in Australian stick insects (Insecta; Phasmida) uncovered by the DNA barcoding approach.

PubMed

Velonà, A; Brock, P D; Hasenpusch, J; Mantovani, B

2015-05-18

The barcoding approach was applied to analyze 16 Australian morphospecies of the order Phasmida, with the aim to test if it could be suitable as a tool for phasmid species identification and if its discrimination power would allow uncovering of cryptic diversity. Both goals were reached. Eighty-two specimens representing twelve morphospecies (Sipyloidea sp. A, Candovia annulata, Candovia sp. A, Candovia sp. B, Candovia sp. C, Denhama austrocarinata, Xeroderus kirbii, Parapodacanthus hasenpuschorum, Tropidoderus childrenii, Cigarrophasma tessellatum, Acrophylla wuelfingi, Eurycantha calcarata) were correctly recovered as clades through the molecular approach, their sequences forming monophyletic and well-supported clusters. In four instances, Neighbor-Joining tree and barcoding gap analyses supported either a specific (Austrocarausius mercurius, Anchiale briareus) or a subspecific (Anchiale austrotessulata, Extatosoma tiaratum) level of divergence within the analyzed morphospecies. The lack of an appropriate database of homologous coxI sequences prevented more detailed identification of undescribed taxa.
DNA barcoding and real-time PCR detection of Bactrocera xanthodes (Tephritidae: Diptera) complex.

PubMed

Li, D; Waite, D W; Gunawardana, D N; McCarthy, B; Anderson, D; Flynn, A; George, S

2018-05-06

Immature fruit fly stages of the family Tephritidae are commonly intercepted on breadfruit from Pacific countries at the New Zealand border but are unable to be identified to the species level using morphological characters. Subsequent molecular identification showed that they belong to Bactrocera xanthodes, which is part of a species complex that includes Bactrocera paraxanthodes, Bactrocera neoxanthodes and an undescribed species. To establish a more reliable molecular identification system for B. xanthodes, a reference database of DNA barcode sequences for the 5' fragment of the COI gene region was constructed for B. xanthodes from Fiji, Samoa and Tonga. To better understand the species complex, B. neoxanthodes from Vanuatu and B. paraxanthodes from New Caledonia were also barcoded. Using the results of this analysis, real-time TaqMan polymerase chain reaction (PCR) assays for the detection of the B. xanthodes complex and for the three individual species of the complex were developed and validated. The assays showed high specificity for the target species, with no cross-reaction observed for closely related organisms. Each of the real-time PCR assays is sensitive, detecting the target sequences at concentrations as low as ten copies µl⁻¹, and can be used in either singleplex or multiplex format. The real-time PCR assay for B. xanthodes has been successfully applied at the borders in New Zealand, leading to the rapid identification of intercepted Tephritidae eggs and larvae. The developed assays will be useful biosecurity tools for rapid detection of species in the B. xanthodes complex worldwide.
Development of real-time PCR assay for genetic identification of the mottled skate, Beringraja pulchra.

PubMed

Hwang, In Kwan; Lee, Hae Young; Kim, Min-Hee; Jo, Hyun-Su; Choi, Dong-Ho; Kang, Pil-Won; Lee, Yang-Han; Cho, Nam-Soo; Park, Ki-Won; Chae, Ho Zoon

2015-10-01

The mottled skate, Beringraja pulchra, is one of the commercially important fishes in the market today. However, identification methods for B. pulchra have not been well developed. The current study reports a novel real-time PCR method based on TaqMan technology developed for the genetic identification of B. pulchra. The mitochondrial cytochrome oxidase subunit 1 (COI) nucleotide sequences of 29 B. pulchra and 157 skates and rays reported in the GenBank DNA database were comparatively analyzed, and the COI sequences specific to B. pulchra were identified. Based on this information, a system of specific primers and a Minor Groove Binding (MGB) TaqMan probe was designed. The assay successfully discriminated B. pulchra in tests on 29 specimens of B. pulchra and 27 commercial samples with unknown species identity. For B. pulchra DNA, an average Threshold Cycle (Ct) value of 19.1±0.1 was obtained. Among the 27 commercial samples, two showed average Ct values of 19.1±0.0 and 26.7±0.1, respectively, and were confirmed to be B. pulchra by sequencing. The other samples tested showed undetectable or extremely weak signals for the target fragment, which was also consistent with the sequencing results. These results show that the method is a rapid and efficient tool to identify B. pulchra and might prevent fraud or mislabeling during the distribution of B. pulchra products. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Beyond the Colours: Discovering Hidden Diversity in the Nymphalidae of the Yucatan Peninsula in Mexico through DNA Barcoding

PubMed Central

Prado, Blanca R.; Pozo, Carmen; Valdez-Moreno, Martha; Hebert, Paul D. N.

2011-01-01

Background: Recent studies have demonstrated the utility of DNA barcoding in the discovery of overlooked species and in the connection of immature and adult stages. In this study, we use DNA barcoding to examine diversity patterns in 121 species of Nymphalidae from the Yucatan Peninsula in Mexico. Our results suggest the presence of cryptic species in 8 of these 121 taxa. As well, the reference database derived from the analysis of adult specimens allowed the identification of nymphalid caterpillars providing new details on host plant use. Methodology/Principal Findings: We gathered DNA barcode sequences from 857 adult Nymphalidae representing 121 different species. This total includes four species (Adelpha iphiclus, Adelpha malea, Hamadryas iphtime and Taygetis laches) that were initially overlooked because of their close morphological similarity to other species. The barcode results showed that each of the 121 species possessed a diagnostic array of barcode sequences. In addition, there was evidence of cryptic taxa; seven species included two barcode clusters showing more than 2% sequence divergence while one species included three clusters. All 71 nymphalid caterpillars were identified to a species level by their sequence congruence to adult sequences. These caterpillars represented 16 species, and included Hamadryas julitta, an endemic species from the Yucatan Peninsula whose larval stages and host plant (Dalechampia schottii, also endemic to the Yucatan Peninsula) were previously unknown. Conclusions/Significance: This investigation has revealed overlooked species in a well-studied museum collection of nymphalid butterflies and suggests that there is a substantial incidence of cryptic species that await full characterization. The utility of barcoding in the rapid identification of caterpillars also promises to accelerate the assembly of information on life histories, a particularly important advance for hyperdiverse tropical insect assemblages. PMID:22132140
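The idea of barcode clusters separated by more than 2% sequence divergence can be sketched with an uncorrected p-distance and single-linkage grouping. Both choices are assumptions made for illustration: the abstract does not state which distance model or clustering procedure was used, and the function names `p_distance` and `barcode_clusters` are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def barcode_clusters(sequences, threshold=0.02):
    """Single-linkage clusters of specimens.

    sequences maps specimen ID -> aligned barcode. Two specimens are
    linked when their p-distance is <= threshold, and clusters are the
    connected components of those links (tracked with union-find).
    Returns a sorted list of sorted ID lists.
    """
    ids = list(sequences)
    parent = {name: name for name in ids}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if p_distance(sequences[first], sequences[second]) <= threshold:
                parent[find(first)] = find(second)

    clusters = {}
    for name in ids:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

A morphospecies whose specimens fall into two or more clusters under such a rule would be flagged as a candidate for cryptic diversity, pending the fuller characterization the authors call for.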
3DNALandscapes: a database for exploring the conformational features of DNA.

PubMed

Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K

2010-01-01

3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
Assessment of Multi Fragment Melting Analysis System (MFMAS) for the Identification of Food-Borne Yeasts.

PubMed

Kesmen, Zülal; Büyükkiraz, Mine E; Özbekar, Esra; Çelik, Mete; Özkök, F Özge; Kılıç, Özge; Çetin, Bülent; Yetim, Hasan

2018-06-01

Multi Fragment Melting Analysis System (MFMAS) is a novel approach that was developed for the species-level identification of microorganisms. It is a software-assisted system that performs concurrent melting analysis of 8 different DNA fragments to obtain a fingerprint of each strain analyzed. Identification is performed by comparing these fingerprints with the fingerprints of known yeast species recorded in a database to obtain the best possible match. In this study, the applicability of the yeast version of the MFMAS (MFMAS-yeast) was evaluated for the identification of food-associated yeast species. For this purpose, a total of 145 yeast strains originating from foods and beverages and 19 standard yeast strains were tested. The DNAs isolated from these yeast strains were analyzed by the MFMAS, and their species were successfully identified with a similarity rate of 95% or higher. The strains belonged to 43 different yeast species that are widely found in foods. Clear discrimination was also observed among phylogenetically related species. In conclusion, the MFMAS-yeast appears to be a highly promising approach for rapid, accurate, one-step identification of yeasts isolated from food products and/or their processing environments.
DNA typing for the identification of old skeletal remains from Korean War victims.

PubMed

Lee, Hwan Young; Kim, Na Young; Park, Myung Jin; Sim, Jeong Eun; Yang, Woo Ick; Shin, Kyoung-Jin

2010-11-01

The identification of missing casualties of the Korean War (1950-1953) has been performed using mitochondrial DNA (mtDNA) profiles, but recent advances in DNA extraction techniques and approaches using smaller amplicons have significantly increased the possibility of obtaining DNA profiles from highly degraded skeletal remains. Therefore, 21 skeletal remains of Korean War victims and 24 samples from biological relatives of the supposed victims were selected based on circumstantial evidence and/or mtDNA-matching results and were analyzed to confirm the alleged relationship. Cumulative likelihood ratios were obtained from autosomal short tandem repeat, Y-chromosomal STR, and mtDNA-genotyping results, and mainly confirmed the alleged relationship with values over 10⁵. The present analysis emphasizes the value of mini- and Y-STR systems as well as an efficient DNA extraction method in DNA testing for the identification of old skeletal remains. © 2010 American Academy of Forensic Sciences.
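As a rough illustration of how a cumulative likelihood ratio might be assembled from the three marker systems named above, the sketch below multiplies per-system likelihood ratios on a log scale. It assumes the systems can be treated as independent, which is a simplification, and the numeric values are hypothetical rather than taken from the study.

```python
import math


def cumulative_lr(marker_lrs):
    """Combine per-system likelihood ratios, assuming independence.

    Returns (cumulative LR, log10 of the cumulative LR).
    """
    lrs = list(marker_lrs.values())
    if any(lr <= 0 for lr in lrs):
        raise ValueError("likelihood ratios must be positive")
    log10_total = sum(math.log10(lr) for lr in lrs)
    return 10 ** log10_total, log10_total


# Hypothetical per-system likelihood ratios for one alleged relationship.
example = {"autosomal_STR": 2.0e3, "Y_STR": 50.0, "mtDNA": 4.0}
total, log10_total = cumulative_lr(example)
print(f"cumulative LR = {total:.3g} (log10 = {log10_total:.2f})")
print("exceeds 1e5:", total > 1e5)
```

Working in log10 keeps the arithmetic stable when many markers are combined and makes comparison with a threshold such as 10⁵ straightforward.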
Identification of species with DNA-based technology: current progress and challenges.

PubMed

Pereira, Filipe; Carneiro, João; Amorim, António

2008-01-01

One of the grand challenges of modern biology is to develop accurate and reliable technologies for the rapid screening of DNA sequence variation. This topic of research is of prime importance for the detection and identification of species in numerous fields of investigation, such as taxonomy, epidemiology, forensics, archaeology or ecology. Molecular identification is also central to the diagnosis, treatment and control of infections caused by different pathogens. In recent years, a variety of DNA-based approaches have been developed for the identification of individuals in a myriad of taxonomic groups. Here, we provide an overview of the most commonly used assays, with emphasis on those based on DNA hybridization, restriction enzymes, random PCR amplification, species-specific PCR primers and DNA sequencing. A critical evaluation of all methods is presented, focusing on their discriminatory power, reproducibility and user-friendliness. Bearing in mind that the current trend is to develop small-scale devices with a high-throughput capacity, we briefly review recent technological achievements in DNA analysis that offer great potential for the identification of species.
Evaluating the statistical power of DNA-based identification, exemplified by 'The missing grandchildren of Argentina'.

PubMed

Kling, Daniel; Egeland, Thore; Piñero, Mariana Herrera; Vigeland, Magnus Dehli

2017-11-01

Methods and implementations of DNA-based identification are well established in several forensic contexts. However, assessing the statistical power of these methods has been largely overlooked, except in the simplest cases. In this paper we outline general methods for such power evaluation, and apply them to a large set of family reunification cases, where the objective is to decide whether a person of interest (POI) is identical to the missing person (MP) in a family, based on the DNA profile of the POI and available family members. As such, this application closely resembles database searching and disaster victim identification (DVI). If parents or children of the MP are available, they will typically provide sufficient statistical evidence to settle the case. However, if one must resort to more distant relatives, it is not a priori obvious that a reliable conclusion is likely to be reached. In these cases power evaluation can be highly valuable, for instance in the recruitment of additional family members. To assess the power in an identification case, we advocate the combined use of two statistics: the Probability of Exclusion, and the Probability of Exceedance. The former is the probability that the genotypes of a random, unrelated person are incompatible with the available family data. If this is close to 1, it is likely that a conclusion will be achieved regarding general relatedness, but not necessarily the specific relationship. To evaluate the ability to recognize a true match, we use simulations to estimate exceedance probabilities, i.e. the probability that the likelihood ratio will exceed a given threshold, assuming that the POI is indeed the MP. All simulations are done conditionally on available family data. Such conditional simulations have a long history in medical linkage analysis, but to our knowledge this is the first systematic forensic genetics application. 
Also, for forensic markers, mutations cannot be ignored, and therefore current models and implementations must be extended. All the tools are freely available in Familias (http://www.familias.no), empowered by the R library paramlink. The above approach is applied to a large and important data set: 'The missing grandchildren of Argentina'. We evaluate the power of 196 families from the DNA reference databank (Banco Nacional de Datos Genéticos, http://www.bndg.gob.ar). As a result, we show that 58 of the families have poor statistical power and require additional genetic data to enable a positive identification. Copyright © 2017 Elsevier B.V. All rights reserved.
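The idea of an exceedance probability estimated by conditional simulation can be sketched in a deliberately reduced setting. The example below is not the Familias or paramlink implementation: it assumes a single known parent as the only reference relative, independent loci, known allele frequencies, and no mutation (which the abstract notes cannot be ignored in real casework). All names and values are hypothetical.

```python
import random


def lr_parent_child(parent, child, freq):
    """LR for 'child of this parent' versus 'unrelated' at one locus (no mutation)."""
    c, d = child

    def prob_given_transmitted(t):
        # Probability of the child's genotype if the parent passed on allele t.
        if c == d:
            return freq[c] if t == c else 0.0
        return (freq[d] if t == c else 0.0) + (freq[c] if t == d else 0.0)

    p_related = 0.5 * prob_given_transmitted(parent[0]) + 0.5 * prob_given_transmitted(parent[1])
    p_unrelated = freq[c] ** 2 if c == d else 2 * freq[c] * freq[d]
    return p_related / p_unrelated


def exceedance_probability(parents, freqs, threshold, n_sims=10_000, seed=0):
    """Estimate P(LR > threshold) when the person of interest truly is the child.

    Simulation is conditional on the parent's observed genotypes.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        lr = 1.0
        for parent, freq in zip(parents, freqs):
            alleles = list(freq)
            transmitted = rng.choice(parent)
            other = rng.choices(alleles, weights=[freq[a] for a in alleles])[0]
            lr *= lr_parent_child(parent, (transmitted, other), freq)
        hits += lr > threshold
    return hits / n_sims


# Hypothetical three-locus example with identical allele frequencies per locus.
locus_freq = {"a": 0.1, "b": 0.3, "c": 0.6}
parent_genotypes = [("a", "b"), ("b", "c"), ("a", "a")]
print(exceedance_probability(parent_genotypes, [locus_freq] * 3, threshold=10.0))
```

With more distant relatives the likelihood computation requires full pedigree methods, which is where dedicated software becomes necessary.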
6 CFR 37.33 - DMV databases.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 6 Domestic Security 1 2012-01-01 2012-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
6 CFR 37.33 - DMV databases.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 6 Domestic Security 1 2010-01-01 2010-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
6 CFR 37.33 - DMV databases.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 6 Domestic Security 1 2014-01-01 2014-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
6 CFR 37.33 - DMV databases.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 6 Domestic Security 1 2013-01-01 2013-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...
6 CFR 37.33 - DMV databases.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 6 Domestic Security 1 2011-01-01 2011-01-01 false DMV databases. 37.33 Section 37.33 Domestic... IDENTIFICATION CARDS Other Requirements § 37.33 DMV databases. (a) States must maintain a State motor vehicle database that contains, at a minimum— (1) All data fields printed on driver's licenses and identification...

Metabolome searcher: a high throughput tool for metabolite identification and metabolic pathway mapping directly from mass spectrometry and using genome restriction.

PubMed

Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C

2015-02-25

Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. 
This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
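The genome-restricted matching described above can be approximated in a few lines. This is a minimal sketch, not the Metabolome Searcher code: the candidate list stands in for compounds an organism's genome is predicted to produce, the adduct table covers only three common ions, and the function and variable names are hypothetical.

```python
# Mass shifts (in Da) for a few common singly charged adducts.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M-H]-": -1.007276,
}


def match_mz(observed_mz, candidates, adducts, ppm_tol=10.0):
    """Return (compound, adduct, ppm error) for candidates within the mass tolerance."""
    hits = []
    for name, mono_mass in candidates.items():
        for adduct in adducts:
            expected = mono_mass + ADDUCT_SHIFTS[adduct]
            ppm_error = (observed_mz - expected) / expected * 1e6
            if abs(ppm_error) <= ppm_tol:
                hits.append((name, adduct, round(ppm_error, 2)))
    return hits


# Hypothetical genome-restricted candidate list (monoisotopic masses in Da).
genome_restricted = {
    "D-glucose": 180.06339,
    "pyruvic acid": 88.01604,
    "L-alanine": 89.04768,
}
positive_mode = ["[M+H]+", "[M+Na]+"]
print(match_mz(181.0707, genome_restricted, positive_mode))
```

Restricting the candidate dictionary to genome-encoded metabolites is what shortens the hit list; the tolerance and adduct set would be chosen to match the instrument and detection mode.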
Identification of novel peptides for horse meat speciation in highly processed foodstuffs.

PubMed

Claydon, Amy J; Grundy, Helen H; Charlton, Adrian J; Romero, M Rosario

2015-01-01

There is a need for robust analytical methods to support enforcement of food labelling legislation. Proteomics is emerging as a complementary methodology to existing tools such as DNA and antibody-based techniques. Here we describe the development of a proteomics strategy for the determination of meat species in highly processed foods. A database of specific peptides for nine relevant animal species was used to enable semi-targeted species determination. This principle was tested for horse meat speciation, and a range of horse-specific peptides were identified as heat stable marker peptides for the detection of low levels of horse meat in mixtures with other species.
Isolation and Identification of miRNAs in Jatropha curcas

PubMed Central

Wang, Chun Ming; Liu, Peng; Sun, Fei; Li, Lei; Liu, Peng; Ye, Jian; Yue, Gen Hua

2012-01-01

MicroRNAs (miRNAs) are small noncoding RNAs that play crucial regulatory roles by targeting mRNAs for silencing. To identify miRNAs in Jatropha curcas L., a bioenergy crop, cDNA clones from two small RNA libraries of leaves and seeds were sequenced and analyzed using bioinformatic tools. Fifty-two putative miRNAs were found in the two libraries, of which six were identical to known miRNAs and 46 were novel. Differential expression patterns of 15 miRNAs in root, stem, leaf, fruit and seed were detected using quantitative real-time PCR. Ten miRNAs were highly expressed in fruit or seed, implying that they may be involved in seed development or fatty acid synthesis in seed. Moreover, 28 targets of the isolated miRNAs were predicted from a jatropha cDNA library database. The miRNA target genes were predicted to encode a broad range of proteins. Sixteen targets had clear BLASTX hits to the Uniprot database and were associated with genes belonging to the three major gene ontology categories of biological process, cellular component, and molecular function. Four targets were identified for JcumiR004. When the JcumiR004 primary miRNA was silenced, expression of the four target genes was up-regulated and oil composition was modulated significantly, indicating diverse functions of JcumiR004. PMID:22419887
Developmental Validation of the Huaxia Platinum System and application in 3 main ethnic groups of China

PubMed Central

Wang, Zheng; Zhou, Di; Jia, Zhenjun; Li, Luyao; Wu, Wei; Li, Chengtao; Hou, Yiping

2016-01-01

STRs, which are scattered throughout the genome and have a relatively high mutation rate, are attractive for genetic applications such as forensic, anthropological and population genetics studies. STR profiling has now been applied in various aspects of human identification in forensic investigations. This work describes the developmental validation of a novel and universal assay, the Huaxia Platinum System, which amplifies all markers in the expanded CODIS core loci and the Chinese National Database in a single PCR system. Developmental validation demonstrated that this novel assay is accurate, sensitive, reproducible and robust. No discordant calls were observed between the Huaxia Platinum System and other STR systems. Full genotypes could be achieved even with 250 pg of human DNA. Additionally, 402 unrelated individuals from 3 main ethnic groups of China (Han, Uygur and Tibetan) were genotyped to investigate the effectiveness of this novel assay. The CMP were 2.3094 × 10⁻²⁷, 4.3791 × 10⁻²⁸ and 6.9118 × 10⁻²⁷, respectively, and the CPE were 0.99999999939059, 0.99999999989653 and 0.99999999976386, respectively. These results suggest that the Huaxia Platinum System is polymorphic and informative, providing an efficient tool for national DNA databases and facilitating international data sharing. PMID:27498550
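For readers unfamiliar with the two summary statistics, the sketch below shows how they are conventionally combined across loci, assuming CMP and CPE denote the combined matching probability and combined power of exclusion (the abstract does not expand the abbreviations) and assuming independent loci. The per-locus numbers are hypothetical and far smaller in scale than a real multiplex.

```python
import math


def matching_probability(genotype_freqs):
    """Per-locus matching probability: sum of squared genotype frequencies."""
    return sum(f * f for f in genotype_freqs)


def combined_matching_probability(per_locus_pm):
    """Product of per-locus matching probabilities (independent loci assumed)."""
    return math.prod(per_locus_pm)


def combined_power_of_exclusion(per_locus_pe):
    """1 minus the product of per-locus non-exclusion probabilities."""
    return 1.0 - math.prod(1.0 - pe for pe in per_locus_pe)


# Hypothetical values for a two-locus panel.
pm_values = [0.1, 0.2]
pe_values = [0.5, 0.5]
print(combined_matching_probability(pm_values))
print(combined_power_of_exclusion(pe_values))
```

Because each additional locus multiplies the matching probability by a value below one, panels with many polymorphic loci reach the very small CMP values reported above.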
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

PubMed

Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

2011-01-01

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
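To give a flavor of sequence-based motif prediction, the sketch below scans for one of the listed classes, quadruplex-forming motifs. It assumes a widely used heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); the database's actual search criteria are not stated in the abstract, and the function name is hypothetical. Only the given strand is scanned and overlapping matches are not reported.

```python
import re

# Four G-runs (>= 3 G each) separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4_motifs(sequence):
    """Return (start, end, motif) tuples for non-overlapping matches on one strand."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Human telomeric repeat units form a classic quadruplex-prone sequence.
telomere_like = "ttagggttagggttagggttagggtta"
for start, end, motif in find_g4_motifs(telomere_like):
    print(start, end, motif)
```

Scanning the reverse complement with the same function would cover C-rich strands, and genome-scale use would need chunked or streaming input.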
Examples of kinship analysis where Profiler Plus™ was not discriminatory enough for the identification of victims using DNA identification.

PubMed

Hartman, D; Benton, L; Morenos, L; Beyer, J; Spiden, M; Stock, A

2011-02-25

The identification of the victims of the 2009 Victorian bushfires disaster, as in other mass disasters, relied on a number of scientific disciplines - including DNA analysis. As part of the DVI response, DNA analysis was performed to assist in the identification of victims through kinship (familial matching to relatives) or direct (self source of sample) matching of DNA profiles. The majority of the DNA identifications made (82%) were achieved through kinship matching of familial reference samples to post mortem (PM) samples obtained from the victims. Although each location affected by the bushfires could be treated as a mini-disaster (having a small closed-set of victims), with many such sites spread over vast areas, DNA analysis requires that the short tandem repeat (STR) system used be able to afford enough discrimination between all the DVI cases to assign a match. This publication highlights that although a 9-loci multiplex was sufficient for a DVI of this nature, there were instances that brought to light the shortcomings of using a 9-loci multiplex for kinship matching--particularly where multiple family members are victims. Moreover, it serves to reinforce the recommendation that a minimum of 12 autosomal STR markers (plus Amelogenin) be used for DNA identification of victims when identification relies heavily on kinship matching. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
9 CFR 79.2 - Identification of sheep and goats in interstate commerce.

Code of Federal Regulations, 2014 CFR

2014-01-01

... prefix that has been linked in the National Scrapie Database with the assigned premises identification... official identification method or device approved by the Administrator. (3) The owner of the flock of... premises identification if they are linked to the premises in the National Scrapie Database) will be...
9 CFR 79.2 - Identification of sheep and goats in interstate commerce.

Code of Federal Regulations, 2013 CFR

2013-01-01

... prefix that has been linked in the National Scrapie Database with the assigned premises identification... official identification method or device approved by the Administrator. (3) The owner of the flock of... premises identification if they are linked to the premises in the National Scrapie Database) will be...
9 CFR 79.2 - Identification of sheep and goats in interstate commerce.

Code of Federal Regulations, 2012 CFR

2012-01-01

... prefix that has been linked in the National Scrapie Database with the assigned premises identification... official identification method or device approved by the Administrator. (3) The owner of the flock of... premises identification if they are linked to the premises in the National Scrapie Database) will be...
DNA barcoding: an efficient tool to overcome authentication challenges in the herbal market.

PubMed

Mishra, Priyanka; Kumar, Amit; Nagireddy, Akshitha; Mani, Daya N; Shukla, Ashutosh K; Tiwari, Rakesh; Sundaresan, Velusamy

2016-01-01

The past couple of decades have witnessed global resurgence of herbal-based health care. As a result, the trade of raw drugs has surged globally. Accurate and fast scientific identification of the plant(s) is the key to success for the herbal drug industry. The conventional approach is to engage an expert taxonomist, who uses a mix of traditional and modern techniques for precise plant identification. However, for bulk identification at industrial scale, the process is protracted and time-consuming. DNA barcoding, on the other hand, offers an alternative and feasible taxonomic tool box for rapid and robust species identification. For the success of DNA barcode, the barcode loci must have sufficient information to differentiate unambiguously between closely related plant species and discover new cryptic species. For herbal plant identification, matK, rbcL, trnH-psbA, ITS, trnL-F, 5S-rRNA and 18S-rRNA have been used as successful DNA barcodes. Emerging advances in DNA barcoding coupled with next-generation sequencing and high-resolution melting curve analysis have paved the way for successful species-level resolution recovered from finished herbal products. Further, development of multilocus strategy and its application has provided new vistas to the DNA barcode-based plant identification for herbal drug industry. For successful and acceptable identification of herbal ingredients and a holistic quality control of the drug, DNA barcoding needs to work harmoniously with other components of the systems biology approach. We suggest that for effectively resolving authentication challenges associated with the herbal market, DNA barcoding must be used in conjunction with metabolomics along with need-based transcriptomics and proteomics. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

PubMed

Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

2017-01-01

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives

PubMed Central

Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

2017-01-01

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
Integration of deep transcriptome and proteome analyses reveals the components of alkaloid metabolism in opium poppy cell cultures

PubMed Central

2010-01-01

Background: Papaver somniferum (opium poppy) is the source of several pharmaceutical benzylisoquinoline alkaloids including morphine, codeine and sanguinarine. In response to treatment with a fungal elicitor, the biosynthesis and accumulation of sanguinarine is induced along with other plant defense responses in opium poppy cell cultures. The transcriptional induction of alkaloid metabolism in cultured cells provides an opportunity to identify components of this process via the integration of deep transcriptome and proteome databases generated using next-generation technologies. Results: A cDNA library was prepared for opium poppy cell cultures treated with a fungal elicitor for 10 h. Using 454 GS-FLX Titanium pyrosequencing, 427,369 expressed sequence tags (ESTs) with an average length of 462 bp were generated. Assembly of these sequences yielded 93,723 unigenes, of which 23,753 were assigned Gene Ontology annotations. Transcripts encoding all known sanguinarine biosynthetic enzymes were identified in the EST database, 5 of which were represented among the 50 most abundant transcripts. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) of total protein extracts from cell cultures treated with a fungal elicitor for 50 h facilitated the identification of 1,004 proteins. Proteins were fractionated by one-dimensional SDS-PAGE and digested with trypsin prior to LC-MS/MS analysis. Query of an opium poppy-specific EST database substantially enhanced peptide identification. Eight out of 10 known sanguinarine biosynthetic enzymes and many relevant primary metabolic enzymes were represented in the peptide database. Conclusions: The integration of deep transcriptome and proteome analyses provides an effective platform to catalogue the components of secondary metabolism, and to identify genes encoding uncharacterized enzymes. 
The establishment of corresponding transcript and protein databases generated by next-generation technologies in a system with a well-defined metabolite profile facilitates an improved linkage between genes, enzymes, and pathway components. The proteome database represents the most relevant alkaloid-producing enzymes, compared with the much deeper and more complete transcriptome library. The transcript database contained full-length mRNAs encoding most alkaloid biosynthetic enzymes, which is a key requirement for the functional characterization of novel gene candidates. PMID:21083930
Compressing DNA sequence databases with coil.

PubMed

White, W Timothy J; Hendy, Michael D

2008-05-20

Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
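The flat-file-plus-gzip baseline that this abstract compares against can be measured with the standard library alone. The sketch below does not reproduce coil or edit-tree coding; it builds a synthetic FASTA-like file of random sequences (an assumption, since real EST data contain far more redundancy) and defines the compression ratio as compressed size divided by original size, which may differ from the paper's convention.

```python
import gzip
import random


def gzip_ratio(data, level=9):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data, compresslevel=level)) / len(data)


# Synthetic flat file: 500 FASTA-style records of 400 random bases each.
rng = random.Random(0)
records = [
    ">est%d\n%s\n" % (i, "".join(rng.choice("ACGT") for _ in range(400)))
    for i in range(500)
]
flat_file = "".join(records).encode("ascii")

ratio = gzip_ratio(flat_file)
print(f"original: {len(flat_file)} bytes, gzip ratio: {ratio:.3f}")
```

Running the same measurement on a real sequence file gives the reference point against which a specialized compressor's improvement would be judged.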
Compressing DNA sequence databases with coil

PubMed Central

White, W Timothy J; Hendy, Michael D

2008-01-01

Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794
DNA barcoding and morphological identification of neotropical ichthyoplankton from the Upper Paraná and São Francisco.

PubMed

Becker, R A; Sales, N G; Santos, G M; Santos, G B; Carvalho, D C

2015-07-01

The identification of fish larvae from two neotropical hydrographic basins using traditional morphological taxonomy and DNA barcoding revealed no conflicting results between the morphological and barcode identification of larvae. A lower rate (25%) of correct morphological identification of eggs as belonging to migratory or non-migratory species was achieved. Accurate identification of ichthyoplankton by DNA barcoding is an important tool for fish reproductive behaviour studies, correct estimation of biodiversity by detecting eggs from rare species, as well as defining environmental and management strategies for fish conservation in the neotropics. © 2015 The Fisheries Society of the British Isles.
SwePep, a database designed for endogenous peptides and mass spectrometry.

PubMed

Fälth, Maria; Sköld, Karl; Norrman, Mathias; Svensson, Marcus; Fenyö, David; Andren, Per E

2006-06-01

A new database, SwePep, specifically designed for endogenous peptides, has been constructed to significantly speed up the identification process from complex tissue samples utilizing mass spectrometry. In the identification process the experimental peptide masses are compared with the peptide masses stored in the database both with and without possible post-translational modifications. This intermediate identification step is fast and singles out peptides that are potential endogenous peptides and can later be confirmed with tandem mass spectrometry data. Successful applications of this methodology are presented. The SwePep database is a relational database developed using MySql and Java. The database contains 4180 annotated endogenous peptides from different tissues originating from 394 different species as well as 50 novel peptides from brain tissue identified in our laboratory. Information about the peptides, including mass, isoelectric point, sequence, and precursor protein, is also stored in the database. This new approach holds great potential for removing the bottleneck that occurs during the identification process in the field of peptidomics. The SwePep database is available to the public.
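The intermediate mass-matching step described above can be sketched as follows. This is an illustrative sketch, not SwePep's implementation: the in-memory dictionary `TOY_DB`, the three-entry modification table `PTM_SHIFTS`, the function names and the 10 ppm tolerance are all assumptions introduced here. The residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565

# Illustrative subset of post-translational modification mass shifts (Da).
PTM_SHIFTS = {
    "none": 0.0,
    "phospho": 79.966331,
    "acetyl": 42.010565,
    "amidation": -0.984016,
}

# Hypothetical stand-in for the peptide table of a database.
TOY_DB = {"leu-enkephalin": "YGGFL", "met-enkephalin": "YGGFM"}


def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def match_mass(observed: float, db: dict, tol_ppm: float = 10.0) -> list:
    """Return (peptide name, modification) pairs within tol_ppm of observed."""
    hits = []
    for name, seq in db.items():
        base = peptide_mass(seq)
        for ptm, shift in PTM_SHIFTS.items():
            theoretical = base + shift
            if abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm:
                hits.append((name, ptm))
    return hits


print(match_mass(peptide_mass("YGGFL"), TOY_DB))
```

Candidates returned by such a mass lookup are only potential identifications; as the abstract notes, they would still need confirmation with tandem mass spectrometry data.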
Human Chromosome Y and Haplogroups; introducing YDHS Database.

PubMed

Tiirikka, Timo; Moilanen, Jukka S

2015-12-01

As the high throughput sequencing efforts generate more biological information, scientists from different disciplines are interpreting the polymorphisms that make us unique. In addition, there is an increasing trend among the general public to research their own genealogy, find distant relatives and learn more about their biological background. Commercial vendors are providing analyses of mitochondrial and Y-chromosomal markers for such purposes. Clearly, an easy-to-use free interface to the existing data on the identified variants would be in the interest of the general public and professionals less familiar with the field. Here we introduce a novel metadatabase YDHS that aims to provide such an interface for Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants. The database uses the ISOGG Y-DNA tree as the source of mutations and haplogroups and, by using genomic positions of the mutations, links them to genes and other biological entities. YDHS contains analysis tools for deeper Y-SNP analysis. YDHS addresses the shortage of Y-DNA related databases. We have tested our database using a set of different cases from literature ranging from infertility to autism. The database is at http://www.semanticgen.net/ydhs. Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants have not been in the scientific limelight, excluding certain specialized fields like forensics, mainly because there is not much freely available information or it is scattered in different sources. However, as we have demonstrated, Y-SNPs do play a role in various cases on the haplogroup level and it is possible to create a free Y-DNA dedicated bioinformatics resource.
Ribosomal DNA stability is supported by many 'buffer genes'-introduction to the Yeast rDNA Stability Database.

PubMed

Kobayashi, Takehiko; Sasaki, Mariko

2017-01-01

The ribosomal RNA gene (rDNA) is the most abundant gene in yeast and other eukaryotic organisms. Due to its heavy transcription, repetitive structure and programmed replication fork pauses, the rDNA is one of the most unstable regions in the genome. Thus, the rDNA is the best region to study the mechanisms responsible for maintaining genome integrity. Recently, we screened a library of ∼4800 budding yeast gene knockout strains to identify mutants defective in the maintenance of rDNA stability. The results of this screen are summarized in the Yeast rDNA Stability (YRS) Database, in which the stability and copy number of rDNA in each mutant are presented. From this screen, we identified ∼700 genes that may contribute to the maintenance of rDNA stability. In addition, ∼50 mutants had abnormally high or low rDNA copy numbers. Moreover, some mutants with unstable rDNA displayed abnormalities in another chromosome. In this review, we introduce the YRS Database and discuss the roles of newly identified genes that contribute to rDNA maintenance and genome integrity. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
High-Throughput Block Optical DNA Sequence Identification.

PubMed

Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

2018-01-01

Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
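The idea of describing a sequence by the nucleotide content of its blocks, rather than by the order of single letters, can be illustrated computationally. The sketch below assumes consecutive, non-overlapping blocks of length k and a hypothetical function name `block_content`; the abstract does not specify how blocks are laid out, and nothing here models the optical measurement itself.

```python
from collections import Counter


def block_content(seq: str, k: int) -> list:
    """Split seq into consecutive blocks of length k and return the
    (A, T, G, C) counts of each complete block."""
    blocks = []
    for start in range(0, len(seq) - k + 1, k):
        counts = Counter(seq[start:start + k])
        blocks.append((counts["A"], counts["T"], counts["G"], counts["C"]))
    return blocks


# Two complete 4-mers: "AATG" and "CCGT".
print(block_content("AATGCCGT", 4))

# The representation is lossy: different k-mers can share one content vector.
print(block_content("AATG", 4) == block_content("GATA", 4))
```

Because letter order inside a block is discarded, the content vectors act as a compressed signature of the sequence, which is the sense in which the abstract speaks of lossy genomic data compression.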

MALDI-TOF mass spectrometry as a potential tool for Trichomonas vaginalis identification.

PubMed

Calderaro, Adriana; Piergianni, Maddalena; Montecchini, Sara; Buttrini, Mirko; Piccolo, Giovanna; Rossi, Sabina; Arcangeletti, Maria Cristina; Medici, Maria Cristina; Chezzi, Carlo; De Conto, Flora

2016-06-10

Trichomonas vaginalis is a flagellated protozoan causing trichomoniasis, a sexually transmitted human infection, with around 276.4 million new cases estimated by the World Health Organization. Culture is the gold standard method for the diagnosis of T. vaginalis infection. Recently, immunochromatographic assays as well as PCR assays for the detection of T. vaginalis antigen or DNA, respectively, have also become available. Although the well-known genome sequence of T. vaginalis has made possible the application of proteomic studies, few data are available about the overall proteomic expression profiling of T. vaginalis. The aim of this study was to investigate the potential application of MALDI-TOF MS as a new tool for the identification of T. vaginalis. Twenty-one isolates were analysed by MALDI-TOF MS after the creation of a Main Spectrum Profile (MSP) from a T. vaginalis reference strain (G3) and its subsequent supplementation in the Bruker Daltonics database, not including any profile of protozoa. This was achieved after the development of a new identification method created by modifying the range setting (6-10 kDa) for the MALDI-TOF MS analysis in order to exclude the overlapping of peaks derived from the culture media used in this study. Two MSP reference spectra were created in two different ranges: 3-15 kDa (standard range setting) and 6-10 kDa (new range setting). Both MSP spectra were deposited in the MALDI BioTyper database for further identification of additional T. vaginalis strains. All the 21 strains analysed in this study were correctly identified by using the new identification method. In this study it was demonstrated that changes in the MALDI-TOF MS standard parameters usually used to identify bacteria and fungi allowed the identification of the protozoan T. vaginalis. This study shows the usefulness of MALDI-TOF MS in the reliable identification of microorganisms grown on complex liquid media such as the protozoan T. vaginalis, on the basis of the protein profile and not on the basis of single markers, by using a "new range setting" different from that developed for bacteria and fungi.
20 years since the introduction of DNA barcoding: from theory to application.

PubMed

Fišer Pečnikar, Živa; Buzan, Elena V

2014-02-01

Traditionally, taxonomic identification has relied upon morphological characters. In the last two decades, molecular tools based on DNA sequences of short standardised gene fragments, termed DNA barcodes, have been developed for species discrimination. The most common DNA barcode used in animals is a fragment of the cytochrome c oxidase (COI) mitochondrial gene, while for plants, two chloroplast gene fragments from the RuBisCo large subunit (rbcL) and maturase K (matK) genes are widely used. Information gathered from DNA barcodes can be used beyond taxonomic studies and will have far-reaching implications across many fields of biology, including ecology (rapid biodiversity assessment and food chain analysis), conservation biology (monitoring of protected species), biosecurity (early identification of invasive pest species), medicine (identification of medically important pathogens and their vectors) and pharmacology (identification of active compounds). However, it is important that the limitations of DNA barcoding are understood and techniques continually adapted and improved as this young science matures.
Identification of body fluid-specific DNA methylation markers for use in forensic science.

PubMed

Park, Jong-Lyul; Kwon, Oh-Hyung; Kim, Jong Hwan; Yoo, Hyang-Sook; Lee, Han-Chul; Woo, Kwang-Man; Kim, Seon-Young; Lee, Seung-Hwan; Kim, Yong Sung

2014-11-01

DNA methylation, which occurs at the 5'-position of the cytosine in CpG dinucleotides, has great potential for forensic identification of body fluids, because tissue-specific patterns of DNA methylation have been demonstrated, and DNA is less prone to degradation than proteins or RNA. Previous studies have reported several body fluid-specific DNA methylation markers, but DNA methylation differences are sometimes low in saliva and vaginal secretions. Moreover, specific DNA methylation markers in four types of body fluids (blood, saliva, semen, and vaginal secretions) have not been investigated with genome-wide profiling. Here, we investigated novel DNA methylation markers for identification of body fluids for use in forensic science using the Illumina HumanMethylation 450K bead array, which contains over 450,000 CpG sites. Using methylome data from 16 samples of blood, saliva, semen, and vaginal secretions, we first selected 2986 hypermethylated or hypomethylated regions that were specific for each type of body fluid. We then selected eight CpG sites as novel, forensically relevant DNA methylation markers: cg06379435 and cg08792630 for blood, cg26107890 and cg20691722 for saliva, cg23521140 and cg17610929 for semen, and cg01774894 and cg14991487 for vaginal secretions. These eight selected markers were evaluated in 80 body fluid samples using pyrosequencing, and all showed high sensitivity and specificity for identification of the target body fluid. We suggest that these eight DNA methylation markers may be good candidates for developing an effective molecular assay for identification of body fluids in forensic science. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
ESTuber db: an online database for Tuber borchii EST sequences.

PubMed

Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo

2007-03-08

The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
DR-GAS: a database of functional genetic variants and their phosphorylation states in human DNA repair systems.

PubMed

Sehgal, Manika; Singh, Tiratha Raj

2014-04-01

We present DR-GAS, a unique, consolidated and comprehensive DNA repair genetic association studies database of the human DNA repair system. It presents information on repair genes, assorted mechanisms of DNA repair, linkage disequilibrium, haplotype blocks, nsSNPs, phosphorylation sites, associated diseases, and pathways involved in repair systems. DNA repair is an intricate process which plays an essential role in maintaining the integrity of the genome by eradicating the damaging effect of internal and external changes in the genome. Hence, it is crucial to extensively understand the intact process of DNA repair, the genes involved, non-synonymous SNPs which may affect function, phosphorylated residues and other related genetic parameters. All the corresponding entries for DNA repair genes, such as proteins, OMIM IDs, literature references and pathways, are cross-referenced to their respective primary databases. DNA repair genes and their associated parameters are represented either in tabular form or in graphical form through images produced by computational and statistical analyses. It is believed that the database will help molecular biologists, biotechnologists, therapeutic developers and the wider scientific community to find biologically meaningful information, including genetic-level information relevant to serious diseases involving human DNA repair systems. DR-GAS is freely available for academic and research purposes at: http://www.bioinfoindia.org/drgas. Copyright © 2014 Elsevier B.V. All rights reserved.
DNA analysis in perpetrator identification of terrorism-related disaster: suicide bombing of the Australian Embassy in Jakarta 2004.

PubMed

Sudoyo, Herawati; Widodo, Putut T; Suryadi, Helena; Lie, Yuliana S; Safari, Dodi; Widjajanto, Agung; Kadarmo, D Aji; Hidayat, Soegeng; Marzuki, Sangkot

2008-06-01

We report the strategy that we employed to identify the perpetrator of a suicide car bombing in front of the Australian Embassy in Jakarta, Indonesia, on 9 September 2004. The bomb was so massive that only small tissue pieces of the perpetrator could be recovered, preventing conventional approach to the identification of the bomber, necessitating the introduction of DNA analysis as the primary means for perpetrator identification. Crime scene investigation revealed the trajectory of the bomb blast, which was used to guide the collection of charred tissue fragments of the perpetrator. Mitochondrial DNA analysis was first conducted on 17 tissue fragments, recovered over large areas of the trajectory to, (a) confirm that they are of a common source, i.e. the perpetrator, and thus (b) establish the mtDNA HV1 sequence profile of the perpetrator. The mtDNA of the perpetrator matches that of a maternally related family member of one of four suspects. Standard autosomal STR analysis confirmed the identification. This case is of interest as an illustration of a successful application of DNA analysis as the primary means of disaster perpetrator identification.
Fungi in Thailand: a case study of the efficacy of an ITS barcode for automatically identifying species within the Annulohypoxylon and Hypoxylon genera.

PubMed

Suwannasai, Nuttika; Martín, María P; Phosri, Cherdchai; Sihanonth, Prakitsin; Whalley, Anthony J S; Spouge, John L

2013-01-01

Thailand, a part of the Indo-Burma biodiversity hotspot, has many endemic animals and plants. Some of its fungal species are difficult to recognize and separate, complicating assessments of biodiversity. We assessed species diversity within the fungal genera Annulohypoxylon and Hypoxylon, which produce biologically active and potentially therapeutic compounds, by applying classical taxonomic methods to 552 teleomorphs collected from across Thailand. Using probability of correct identification (PCI), we also assessed the efficacy of automated species identification with a fungal barcode marker, ITS, in the model system of Annulohypoxylon and Hypoxylon. The 552 teleomorphs yielded 137 ITS sequences; in addition, we examined 128 GenBank ITS sequences, to assess biases in evaluating a DNA barcode with GenBank data. The use of multiple sequence alignment in a barcode database like BOLD raises some concerns about non-protein barcode markers like ITS, so we also compared species identification using different alignment methods. Our results suggest the following. (1) Multiple sequence alignment of ITS sequences is competitive with pairwise alignment when identifying species, so BOLD should be able to preserve its present bioinformatics workflow for species identification for ITS, and possibly therefore with at least some other non-protein barcode markers. (2) Automated species identification is insensitive to a specific choice of evolutionary distance, contributing to resolution of a current debate in DNA barcoding. (3) Statistical methods are available to address, at least partially, the possibility of expert misidentification of species. Phylogenetic trees discovered a cryptic species and strongly supported monophyletic clades for many Annulohypoxylon and Hypoxylon species, suggesting that ITS can contribute usefully to a barcode for these fungi. The PCIs here, derived solely from ITS, suggest that a fungal barcode will require secondary markers in Annulohypoxylon and Hypoxylon, however. The URL http://tinyurl.com/spouge-barcode contains computer programs and other supplementary material relevant to this article.
Patient identification error among prostate needle core biopsy specimens--are we ready for a DNA time-out?

PubMed

Suba, Eric J; Pfeifer, John D; Raab, Stephen S

2007-10-01

Patient identification errors in surgical pathology often involve switches of prostate or breast needle core biopsy specimens among patients. We assessed strategies for decreasing the occurrence of these uncommon and yet potentially catastrophic events. Root cause analyses were performed following 3 cases of patient identification error involving prostate needle core biopsy specimens. Patient identification errors in surgical pathology result from slips and lapses of automatic human action that may occur at numerous steps during pre-laboratory, laboratory and post-laboratory work flow processes. Patient identification errors among prostate needle biopsies may be difficult to entirely prevent through the optimization of work flow processes. A DNA time-out, whereby DNA polymorphic microsatellite analysis is used to confirm patient identification before radiation therapy or radical surgery, may eliminate patient identification errors among needle biopsies.
DNA Barcoding of Neotropical Sand Flies (Diptera, Psychodidae, Phlebotominae): Species Identification and Discovery within Brazil

PubMed Central

Pinto, Israel de Souza; Chagas, Bruna Dias das; Rodrigues, Andressa Alencastre Fuzari; Ferreira, Adelson Luiz; Rezende, Helder Ricas; Bruno, Rafaela Vieira; Falqueto, Aloisio; Andrade-Filho, José Dilermando; Galati, Eunice Aparecida Bianchi; Shimabukuro, Paloma Helena Fernandes; Brazil, Reginaldo Peçanha

2015-01-01

DNA barcoding has been an effective tool for species identification in several animal groups. Here, we used DNA barcoding to discriminate between 47 morphologically distinct species of Brazilian sand flies. DNA barcodes correctly identified approximately 90% of the sampled taxa (42 morphologically distinct species) using clustering based on neighbor-joining distance, of which four species showed comparatively higher maximum values of divergence (range 4.23–19.04%), indicating cryptic diversity. The DNA barcodes also corroborated the resurrection of two species within the shannoni complex and provided an efficient tool to differentiate between morphologically indistinguishable females of closely related species. Taken together, our results validate the effectiveness of DNA barcoding for species identification and the discovery of cryptic diversity in sand flies from Brazil. PMID:26506007
DNA Barcoding of Neotropical Sand Flies (Diptera, Psychodidae, Phlebotominae): Species Identification and Discovery within Brazil.

PubMed

Pinto, Israel de Souza; Chagas, Bruna Dias das; Rodrigues, Andressa Alencastre Fuzari; Ferreira, Adelson Luiz; Rezende, Helder Ricas; Bruno, Rafaela Vieira; Falqueto, Aloisio; Andrade-Filho, José Dilermando; Galati, Eunice Aparecida Bianchi; Shimabukuro, Paloma Helena Fernandes; Brazil, Reginaldo Peçanha; Peixoto, Alexandre Afranio

2015-01-01

DNA barcoding has been an effective tool for species identification in several animal groups. Here, we used DNA barcoding to discriminate between 47 morphologically distinct species of Brazilian sand flies. DNA barcodes correctly identified approximately 90% of the sampled taxa (42 morphologically distinct species) using clustering based on neighbor-joining distance, of which four species showed comparatively higher maximum values of divergence (range 4.23-19.04%), indicating cryptic diversity. The DNA barcodes also corroborated the resurrection of two species within the shannoni complex and provided an efficient tool to differentiate between morphologically indistinguishable females of closely related species. Taken together, our results validate the effectiveness of DNA barcoding for species identification and the discovery of cryptic diversity in sand flies from Brazil.
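The maximum within-species divergence mentioned above can be illustrated with a small calculation on aligned barcode sequences. This sketch uses the uncorrected p-distance (proportion of differing sites), which is an assumption made here for simplicity; the abstract does not state which distance model was used. The species labels and sequences are invented toy data, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    skipping positions where either sequence has a gap or ambiguity code."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def max_intraspecific_divergence(seqs_by_species: dict) -> dict:
    """Largest pairwise p-distance within each species (0.0 for singletons)."""
    result = {}
    for species, seqs in seqs_by_species.items():
        dists = [p_distance(a, b) for a, b in combinations(seqs, 2)]
        result[species] = max(dists) if dists else 0.0
    return result


# Invented toy alignment, 10 sites per sequence.
toy = {
    "sp_A": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"],
    "sp_B": ["ACGTTCGTAC", "GCGTACCTAG"],
}
print(max_intraspecific_divergence(toy))
```

A morphologically defined species whose maximum divergence stands well above that of its relatives, like `sp_B` in the toy data, is the kind of case the abstract flags as possible cryptic diversity.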
Genetic variants of the DNA repair genes from Exome Aggregation Consortium (EXAC) database: significance in cancer.

PubMed

Das, Raima; Ghosh, Sankar Kumar

2017-04-01

The DNA repair pathway is a primary defense system that eliminates a wide variety of DNA damage. Any deficiencies in it are likely to cause the chromosomal instability that leads to cell malfunctioning and tumorigenesis. Genetic polymorphisms in DNA repair genes have demonstrated a significant association with cancer risk. Our study attempts to give a glimpse of the overall scenario of the germline polymorphisms in the DNA repair genes by taking into account the Exome Aggregation Consortium (ExAC) database as well as the Human Gene Mutation Database (HGMD) for evaluating the disease link, particularly in cancer. It has been found that the ExAC DNA repair dataset (which consists of 228 DNA repair genes) comprises 30.4% missense, 12.5% dbSNP reported and 3.2% ClinVar significant variants. 27% of all the missense variants have the deleterious SIFT score of 0.00 and 6% of variants carry the most damaging Polyphen-2 score of 1.00, thus affecting the protein structure and function. However, as per HGMD, only a fraction (1.2%) of ExAC DNA repair variants was found to be cancer-related, indicating that the remaining variants reported in both databases need to be further analyzed. This, in turn, may provide an increased spectrum of the reported cancer-linked variants in the DNA repair genes present in the ExAC database. Moreover, further in silico functional assay of the identified vital cancer-associated variants, which is essential to establish their actual biological significance, may shed some light on the field of targeted drug development in the near future. Copyright © 2017. Published by Elsevier B.V.
Efficacy of the core DNA barcodes in identifying processed and poorly conserved plant materials commonly used in South African traditional medicine

PubMed Central

Mankga, Ledile T.; Yessoufou, Kowiyou; Moteetee, Annah M.; Daru, Barnabas H.; van der Bank, Michelle

2013-01-01

Medicinal plants cover a broad range of taxa, which may be phylogenetically less related but morphologically very similar. Such morphological similarity between species may lead to misidentification and inappropriate use. Also the substitution of a medicinal plant by a cheaper alternative (e.g. other non-medicinal plant species), either due to misidentification, or deliberately to cheat consumers, is an issue of growing concern. In this study, we used DNA barcoding to identify commonly used medicinal plants in South Africa. Using the core plant barcodes, matK and rbcLa, obtained from processed and poorly conserved materials sold at the muthi traditional medicine market, we tested efficacy of the barcodes in species discrimination. Based on genetic divergence, PCR amplification efficiency and BLAST algorithm, we revealed varied discriminatory potentials for the DNA barcodes. In general, the barcodes exhibited high discriminatory power, indicating their effectiveness in verifying the identity of the most common plant species traded in South African medicinal markets. BLAST algorithm successfully matched 61% of the queries against a reference database, suggesting that most of the information supplied by sellers at traditional medicinal markets in South Africa is correct. Our findings reinforce the utility of DNA barcoding technique in limiting false identification that can harm public health. PMID:24453559
Applying pollen DNA metabarcoding to the study of plant–pollinator interactions

PubMed Central

Bell, Karen L.; Fowler, Julie; Burgess, Kevin S.; Dobbs, Emily K.; Gruenewald, David; Lawley, Brice; Morozumi, Connor; Brosi, Berry J.

2017-01-01

Premise of the study: To study pollination networks in a changing environment, we need accurate, high-throughput methods. Previous studies have shown that more highly resolved networks can be constructed by studying pollen loads taken from bees, relative to field observations. DNA metabarcoding potentially allows for faster and finer-scale taxonomic resolution of pollen compared to traditional approaches (e.g., light microscopy), but has not been applied to pollination networks. Methods: We sampled pollen from 38 bee species collected in Florida from sites differing in forest management. We isolated DNA from pollen mixtures and sequenced rbcL and ITS2 gene regions from all mixtures in a single run on the Illumina MiSeq platform. We identified species from sequence data using comprehensive rbcL and ITS2 databases. Results: We successfully built a proof-of-concept quantitative pollination network using pollen metabarcoding. Discussion: Our work underscores that pollen metabarcoding is not quantitative but that quantitative networks can be constructed based on the number of interacting individuals. Due to the frequency of contamination and false positive reads, isolation and PCR negative controls should be used in every reaction. DNA metabarcoding has advantages in efficiency and resolution over microscopic identification of pollen, and we expect that it will have broad utility for future studies of plant–pollinator interactions. PMID:28690929
iDBPs: a web server for the identification of DNA binding proteins

PubMed Central

Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir

2010-01-01

Summary: The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. Availability: http://idbps.tau.ac.il/ Contact: NirB@tauex.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20089514
A multilocus database for the identification of Aspergillus and Penicillium species

USDA-ARS's Scientific Manuscript database

Identification of Aspergillus and Penicillium isolates using phenotypic methods is increasingly complex and difficult but genetic tools allow recognition and description of species formerly unrecognized or cryptic. We constructed a web-based taxonomic database using BIGSdb for the identification of ...
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping

Treesearch

K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale

1998-01-01

DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
Partial characterization of normal and Haemophilus influenzae-infected mucosal complementary DNA libraries in chinchilla middle ear mucosa.

PubMed

Kerschner, Joseph E; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J Christopher; Ehrlich, Garth D

2010-04-01

We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription-polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis.
Partial Characterization of Normal and Haemophilus influenzae–Infected Mucosal Complementary DNA Libraries in Chinchilla Middle Ear Mucosa

PubMed Central

Kerschner, Joseph E.; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J. Christopher; Ehrlich, Garth D.

2010-01-01

Objectives We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Methods Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription–polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Results Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Conclusions Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis. PMID:20433028
Typing DNA profiles from previously enhanced fingerprints using direct PCR.

PubMed

Templeton, Jennifer E L; Taylor, Duncan; Handt, Oliva; Linacre, Adrian

2017-07-01

Fingermarks are a source of human identification both through the ridge patterns and DNA profiling. Typing nuclear STR DNA markers from previously enhanced fingermarks provides an alternative method of utilising the limited fingermark deposit that can be left behind during a criminal act. Dusting with fingerprint powders is a standard method used in classical fingermark enhancement and can affect DNA data. The ability to generate informative DNA profiles from powdered fingerprints using direct PCR swabs was investigated. Direct PCR was used because the opportunity to generate usable DNA profiles after performing any of the standard DNA extraction processes is minimal. Omitting the extraction step will, for many samples, be the key to success if there is limited sample DNA. DNA profiles were generated by direct PCR from 160 fingermarks after treatment with one of the following dactyloscopic fingerprint powders: white hadonite; silver aluminium; HiFi Volcano silk black; or black magnetic fingerprint powder. This was achieved by a combination of an optimised double-swabbing technique and swab media, omission of the extraction step to minimise loss of critical low-template DNA, and additional AmpliTaq Gold® DNA polymerase to boost the PCR. Ninety-eight out of 160 samples (61%) were considered 'up-loadable' to the Australian National Criminal Investigation DNA Database (NCIDD). The method described required a minimum of working steps, equipment and reagents, and was completed within 4 h. Direct PCR allows the generation of DNA profiles from enhanced prints without the need to increase PCR cycle numbers beyond manufacturer's recommendations. Particular emphasis was placed on preventing contamination by applying strict protocols and avoiding the use of previously used fingerprint brushes. Based on this extensive survey, the data provided indicate minimal effects of any of these four powders on the chance of obtaining DNA profiles from enhanced fingermarks. Copyright © 2017 Elsevier B.V. All rights reserved.
Molecular identification and phylogenetic analysis of human Trichostrongylus species from an endemic area of Iran.

PubMed

Sharifdini, Meysam; Derakhshani, Sedigheh; Alizadeh, Safar Ali; Ghanbarzadeh, Laleh; Mirjalali, Hamed; Mobedi, Iraj; Saraei, Mehrzad

2017-12-01

Human infections with Trichostrongylus species have been reported in most parts of Iran. The aim of this study was the identification, molecular characterization and phylogenetic analysis of human Trichostrongylus species based on the ITS2 region of ribosomal DNA from Guilan Province, northern Iran. Stool samples were collected from rural inhabitants and examined by formalin-ether concentration and agar plate culture techniques. After anthelmintic treatment, male adult worms were collected from five infected cases. Genomic DNA was extracted from one male worm of each species in every treated individual and one filariform larva isolated from each case. PCR amplification of the ITS2-rDNA region was performed and the products were sequenced. Among 1508 individuals, 46 (3.05%) were found infected with Trichostrongylus species using parasitological methods. Male worms of T. colubriformis, T. vitrinus and T. longispicularis were expelled from five patients after treatment. Out of 41 filariform larvae, 40 were T. colubriformis, and the other one was T. axei. Phylogenetic analysis showed that each species was placed together with reference sequences submitted to the GenBank database. Intra-species similarity for all species obtained in the current study was 100%. T. colubriformis was found to be probably the most common species in this region of Iran. For the first time, the authors of the present study report the occurrence of natural human infection by T. longispicularis in the world. Therefore, the number of Trichostrongylus species infecting humans in Iran has now increased to ten. Copyright © 2017. Published by Elsevier B.V.

Barcoding Sponges: An Overview Based on Comprehensive Sampling

PubMed Central

Vargas, Sergio; Schuster, Astrid; Sacher, Katharina; Büttner, Gabrielle; Schätzle, Simone; Läuchli, Benjamin; Hall, Kathryn; Hooper, John N. A.; Erpenbeck, Dirk; Wörheide, Gert

2012-01-01

Background Phylum Porifera includes ∼8,500 valid species distributed world-wide in aquatic ecosystems ranging from ephemeral fresh-water bodies to coastal environments and the deep-sea. The taxonomy and systematics of sponges are complicated, and morphological identification can be both time consuming and erroneous due to phenotypic convergence and secondary losses, etc. DNA barcoding can provide sponge biologists with a simple and rapid method for the identification of samples of unknown taxonomic membership. The Sponge Barcoding Project (www.spongebarcoding.org), the first initiative to barcode a non-bilaterian metazoan phylum, aims to provide a comprehensive DNA barcode database for Phylum Porifera. Methodology/Principal Findings ∼7,400 sponge specimens have been extracted, and amplification of the standard COI barcoding fragment has been attempted for approximately 3,300 museum samples with ∼25% mean amplification success. Based on this comprehensive sampling, we present the first report on the workflow and progress of the sponge barcoding project, and discuss some common pitfalls inherent to the barcoding of sponges. Conclusion A DNA-barcoding workflow capable of processing potentially large sponge collections has been developed and is routinely used for the Sponge Barcoding Project with success. Sponge specific problems such as the frequent co-amplification of non-target organisms have been detected and potential solutions are currently under development. The initial success of this innovative project has already demonstrated considerable refinement of sponge systematics, evaluating morphometric character importance, geographic phenotypic variability, and the utility of the standard barcoding fragment for Porifera (despite its conserved evolution within this basal metazoan phylum). PMID:22802937
DNA algorithms of implementing biomolecular databases on a biological computer.

PubMed

Chang, Weng-Long; Vasilakos, Athanasios V

2015-01-01

In this paper, DNA algorithms are proposed to perform eight operations of relational algebra (calculus), which include Cartesian product, union, set difference, selection, projection, intersection, join, and division, on biomolecular relational databases.
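For orientation, the eight relational operations named above can be sketched in conventional software, with a relation modelled as a set of equal-length tuples. This is an in silico illustration of what each operation computes, not the DNA-based procedures proposed in the paper. The function names, column-index conventions, and the restriction of division to a two-column dividend are assumptions made for brevity.

```python
from itertools import product


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


def intersection(r, s):
    return r & s


def cartesian(r, s):
    # Every row of r concatenated with every row of s
    return {a + b for a, b in product(r, s)}


def select(r, predicate):
    # Keep only rows for which predicate(row) is true
    return {row for row in r if predicate(row)}


def project(r, columns):
    # Keep only the listed column indices; duplicate rows collapse
    return {tuple(row[i] for i in columns) for row in r}


def join(r, s, i, j):
    # Equi-join: pair rows where column i of r equals column j of s
    return {a + b for a in r for b in s if a[i] == b[j]}


def divide(r, s):
    # r has columns (x, y); s has one column (y,).
    # Result: every x that is paired in r with all y values in s.
    xs = {row[0] for row in r}
    return {(x,) for x in xs if all((x, y[0]) in r for y in s)}
```

For instance, dividing {(1, 'a'), (1, 'b'), (2, 'a')} by {('a',), ('b',)} yields {(1,)}, because only the value 1 is paired with both 'a' and 'b'.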
Validating DNA barcodes: A non-destructive extraction protocol enables simultaneous vouchering of DNA and morphological vouchers

USDA-ARS's Scientific Manuscript database

Morphology-based keys support accurate identification of many taxa. However, identification can be difficult for taxa that are not well studied, very small, members of cryptic species complexes, or represented by immature stages. For such cases, DNA barcodes may provide diagnostic characters. Ecolog...
NPIDB: Nucleic acid-Protein Interaction DataBase.

PubMed

Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V

2013-01-01

The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contains DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
Low template STR typing: effect of replicate number and consensus method on genotyping reliability and DNA database search results.

PubMed

Benschop, Corina C G; van der Beek, Cornelis P; Meiland, Hugo C; van Gorp, Ankie G M; Westen, Antoinette A; Sijen, Titia

2011-08-01

To analyze DNA samples with very low DNA concentrations, various methods have been developed that sensitize short tandem repeat (STR) typing. Sensitized DNA typing is accompanied by stochastic amplification effects, such as allele drop-outs and drop-ins. Therefore low template (LT) DNA profiles are interpreted with care. One can either try to infer the genotype by a consensus method that uses alleles confirmed in replicate analyses, or one can use a statistical model to evaluate the strength of the evidence in a direct comparison with a known DNA profile. In this study we focused on the first strategy and we show that the procedure by which the consensus profile is assembled will affect genotyping reliability. In order to gain insight in the roles of replicate number and requested level of reproducibility, we generated six independent amplifications of samples of known donors. The LT methods included both increased cycling and enhanced capillary electrophoresis (CE) injection [1]. Consensus profiles were assembled from two to six of the replications using four methods: composite (include all alleles), n-1 (include alleles detected in all but one replicate), n/2 (include alleles detected in at least half of the replicates) and 2× (include alleles detected twice). We compared the consensus DNA profiles with the DNA profile of the known donor, studied the stochastic amplification effects and examined the effect of the consensus procedure on DNA database search results. From all these analyses we conclude that the accuracy of LT DNA typing and the efficiency of database searching improve when the number of replicates is increased and the consensus method is n/2. The most functional number of replicates within this n/2 method is four (although a replicate number of three suffices for samples showing >25% of the alleles in standard STR typing). 
This approach was also the optimal strategy for the analysis of 2-person mixtures, although modified search strategies may be needed to retrieve the minor component in database searches. From the database searches follows the recommendation to specifically mark LT DNA profiles when entering them into the DNA database. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
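The four consensus rules described above differ only in how many replicates must show an allele before it enters the consensus profile. The sketch below expresses each rule as a minimum replicate count. It assumes that "all but one" and "at least half" are read as lower bounds (n-1 or more, and n/2 rounded up or more), which is one interpretation of the abstract. The function name, the ASCII key "2x" for the 2× method, and the allele representation are hypothetical.

```python
import math
from collections import Counter


def consensus(replicates, method):
    """Build a consensus profile from replicate low-template profiles.

    replicates: list of sets of (locus, allele) tuples, one per replicate.
    method: "composite", "n-1", "n/2" or "2x".
    """
    n = len(replicates)
    # In how many replicates was each allele observed?
    counts = Counter(allele for rep in replicates for allele in set(rep))
    thresholds = {
        "composite": 1,              # include all observed alleles
        "n-1": max(n - 1, 1),        # seen in all but one replicate
        "n/2": math.ceil(n / 2),     # seen in at least half
        "2x": 2,                     # seen at least twice
    }
    minimum = thresholds[method]
    return {allele for allele, c in counts.items() if c >= minimum}
```

With four replicates, an allele seen once (a likely drop-in) is kept only by the composite rule, while an allele seen twice (partial drop-out) is kept by n/2 and 2x but rejected by n-1.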
Bacterial identification and subtyping using DNA microarray and DNA sequencing.

PubMed

Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

2012-01-01

The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.
DNA identification of human remains in Disaster Victim Identification (DVI): An efficient sampling method for muscle, bone, bone marrow and teeth.

PubMed

de Boer, Hans H; Maat, George J R; Kadarmo, D Aji; Widodo, Putut T; Kloosterman, Ate D; Kal, Arnoud J

2018-06-04

In disaster victim identification (DVI), DNA profiling is considered to be one of the most reliable and efficient means to identify bodies or separated body parts. This requires a post mortem DNA sample, and an ante mortem DNA sample of the presumed victim or their biological relative(s). Usually the collection of an adequate ante mortem sample is technically simple, but the acquisition of a good quality post mortem sample under unfavourable DVI circumstances is complicated due to the variable degree of preservation of the human remains and the high risk of DNA (cross) contamination. This paper provides the community with an efficient method to collect post mortem DNA samples from muscle, bone, bone marrow and teeth, with a minimal risk of contamination. Our method has been applied in a recent, challenging DVI operation (i.e. the identification of the 298 victims of the MH17 airplane crash in 2014). 98.2% of the collected PM samples provided the DVI team with highly informative DNA genotyping results without the risk of contamination and consequent mistyping of the victim's DNA. Moreover, the method is easy, cheap and quick. This paper provides the DVI community with step-wise instructions and recommendations for the type of tissue to be sampled and the site of excision (preferably the upper leg). Although initially designed for DVI purposes, the method is also suited for the identification of individual victims. Copyright © 2018 Elsevier B.V. All rights reserved.
Diversity of Bacteria at Healthy Human Conjunctiva

PubMed Central

Dong, Qunfeng; Brulc, Jennifer M.; Iovieno, Alfonso; Bates, Brandon; Garoutte, Aaron; Miller, Darlene; Revanna, Kashi V.; Gao, Xiang; Antonopoulos, Dionysios A.; Slepak, Vladlen Z.

2011-01-01

Purpose. Ocular surface (OS) microbiota contributes to infectious and autoimmune diseases of the eye. Comprehensive analysis of microbial diversity at the OS has been impossible because of the limitations of conventional cultivation techniques. This pilot study aimed to explore the true diversity of human OS microbiota using DNA sequencing-based detection and identification of bacteria. Methods. Composition of the bacterial community was characterized using deep sequencing of the 16S rRNA gene amplicon libraries generated from total conjunctival swab DNA. The DNA sequences were classified and the diversity parameters measured using bioinformatics software ESPRIT and MOTHUR and tools available through the Ribosomal Database Project-II (RDP-II). Results. Deep sequencing of conjunctival rDNA from four subjects yielded a total of 115,003 quality DNA reads, corresponding to 221 species-level phylotypes per subject. The combined bacterial community was classified into 5 phyla and 59 distinct genera. However, 31% of all DNA reads belonged to unclassified or novel bacteria. The intersubject variability of individual OS microbiomes was very significant. Regardless, 12 genera (Pseudomonas, Propionibacterium, Bradyrhizobium, Corynebacterium, Acinetobacter, Brevundimonas, Staphylococci, Aquabacterium, Sphingomonas, Streptococcus, Streptophyta, and Methylobacterium) were ubiquitous among the analyzed cohort and represented the putative “core” of conjunctival microbiota. The other 47 genera accounted for <4% of the classified portion of this microbiome. Unexpectedly, healthy conjunctiva contained many genera that are commonly identified as ocular surface pathogens. Conclusions. The first DNA sequencing-based survey of the bacterial population at the conjunctiva has revealed an unexpectedly diverse microbial community. All analyzed samples contained ubiquitous (core) genera that included commensal, environmental, and opportunistic pathogenic bacteria. PMID:21571682
Genomic survey and expression analysis of DNA repair genes in the genus Leptospira.

PubMed

Martins-Pinheiro, Marinalva; Schons-Fonseca, Luciane; da Silva, Josefa B; Domingos, Renan H; Momo, Leonardo Hiroyuki Santos; Simões, Ana Carolina Quirino; Ho, Paulo Lee; da Costa, Renata M A

2016-04-01

Leptospirosis is an emerging zoonosis with important economic and public health consequences and is caused by pathogenic leptospires. The genus Leptospira belongs to the order Spirochaetales and comprises saprophytic (L. biflexa), pathogenic (L. interrogans) and host-dependent (L. borgpetersenii) members. Here, we present an in silico search for DNA repair pathways in Leptospira spp. The relevance of such DNA repair pathways was assessed through the identification of mRNA levels of some genes during infection in an animal model and after exposure to spleen cells. The search was performed by comparison of available Leptospira spp. genomes in public databases with known DNA repair-related genes. Leptospires exhibit some distinct and unexpected characteristics, for instance the existence of a redundant mechanism for repairing a chemically diverse spectrum of alkylated nucleobases, a new mutS-like gene and a new shorter version of uvrD. Leptospira spp. share some characteristics with Gram-positive bacteria, such as the presence of PcrA, two RecQ paralogs and two SSB proteins; the latter is considered a feature shared by naturally competent bacteria. We did not find a significant reduction in the number of DNA repair-related genes in either pathogenic or host-dependent species. Pathogenic leptospires were enriched for genes dedicated to base excision repair and non-homologous end joining. Their evolutionary history reveals a remarkable importance of lateral gene transfer events for the evolution of the genus. Up-regulation of specific DNA repair genes, including components of the SOS regulon, during infection in an animal model validates the critical role of DNA repair mechanisms in the complex host-pathogen interplay.
The influence of diet on faecal DNA amplification and sex identification in brown bears (Ursus arctos)

USGS Publications Warehouse

Murphy, M.A.; Waits, L.P.; Kendall, K.C.

2003-01-01

To evaluate the influence of diet on faecal DNA amplification, 11 captive brown bears (Ursus arctos) were placed on six restricted diets: grass (Trifolium spp., Haplopappus hirtus and Poa pratensis), alfalfa (Lupinus spp.), carrots (Daucus spp.), white-tailed deer (Odocoileus virginianus), blueberries (Vaccinium spp.) and salmon (Salmo spp.). DNA was extracted from 50 faecal samples of each restricted diet, and amplification of brown bear DNA was attempted for a mitochondrial DNA (mtDNA) locus and nuclear DNA (nDNA) locus. For mtDNA, no significant differences were observed in amplification success rates across diets. For nDNA, amplification success rates for salmon diet extracts were significantly lower than all other diet extracts (P < 0.001). To evaluate the accuracy of faecal DNA sex identification when female carnivores consume male mammalian prey, female bears were fed male white-tailed deer. Four of 10 extracts amplified, and all extracts were incorrectly scored as male due to amplification of X and Y-chromosome fragments. The potential biases highlighted in this study have broad implications for researchers using faecal DNA for individual and sex identification, and should be evaluated in other species.
Single-Stranded DNA Aptamers against Pathogens and Toxins: Identification and Biosensing Applications

PubMed Central

Hong, Ka Lok

2015-01-01

Molecular recognition elements (MREs) can be short sequences of single-stranded DNA, RNA, small peptides, or antibody fragments. They can bind to user-defined targets with high affinity and specificity. There has been an increasing interest in the identification and application of nucleic acid molecular recognition elements, commonly known as aptamers, since they were first described in 1990 by the Gold and Szostak laboratories. A large number of target specific nucleic acids MREs and their applications are currently in the literature. This review first describes the general methodologies used in identifying single-stranded DNA (ssDNA) aptamers. It then summarizes advancements in the identification and biosensing application of ssDNA aptamers specific for bacteria, viruses, their associated molecules, and selected chemical toxins. Lastly, an overview of the basic principles of ssDNA aptamer-based biosensors is discussed. PMID:26199940
Dental DNA fingerprinting in identification of human remains

PubMed Central

Girish, KL; Rahman, Farzan S; Tippu, Shoaib R

2010-01-01

The recent advances in molecular biology have revolutionized all aspects of dentistry. DNA, the language of life, yields information beyond our imagination, in both health and disease. DNA fingerprinting is a tool used to unravel all the mysteries associated with the oral cavity and its manifestations during diseased conditions. It is being increasingly used in analyzing various scenarios related to forensic science. The technical advances in molecular biology have propelled the analysis of DNA into routine usage in crime laboratories for rapid and early diagnosis. DNA is an excellent means for identification of unidentified human remains. As dental pulp is surrounded by dentin and enamel, which form a dental armor, it offers the best source of DNA for reliable genetic typing in forensic science. This paper summarizes the recent literature on use of this technique in identification of unidentified human remains. PMID:21731342
Broad spectrum microarray for fingerprint-based bacterial species identification

PubMed Central

2010-01-01

Background Microarrays are powerful tools for DNA-based molecular diagnostics and identification of pathogens. Most target a limited range of organisms and are based on only one or a very few genes for specific identification. Such microarrays are limited to organisms for which specific probes are available, and often have difficulty discriminating closely related taxa. We have developed an alternative broad-spectrum microarray that employs hybridisation fingerprints generated by high-density anonymous markers distributed over the entire genome for identification based on comparison to a reference database. Results A high-density microarray carrying 95,000 unique 13-mer probes was designed. Optimized methods were developed to deliver reproducible hybridisation patterns that enabled confident discrimination of bacteria at the species, subspecies, and strain levels. High correlation coefficients were achieved between replicates. A sub-selection of 12,071 probes, determined by ANOVA and class prediction analysis, enabled the discrimination of all samples in our panel. Mismatch probe hybridisation was observed but was found to have no effect on the discriminatory capacity of our system. Conclusions These results indicate the potential of our genome chip for reliable identification of a wide range of bacterial taxa at the subspecies level without laborious prior sequencing and probe design. With its high resolution capacity, our proof-of-principle chip demonstrates great potential as a tool for molecular diagnostics of broad taxonomic groups. PMID:20163710
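Identification by comparing a hybridisation fingerprint against a reference database can be sketched as a nearest-match search. The example below scores a query fingerprint against each reference by Pearson correlation and returns the best match. The choice of Pearson correlation as the similarity measure is an assumption made for illustration (the abstract reports correlation coefficients between replicates but does not state the matching rule), and the function names and taxon labels are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("constant vector has no defined correlation")
    return cov / (sd_x * sd_y)


def identify(query, reference):
    """Return (name, correlation) of the reference fingerprint that
    correlates best with the query fingerprint.

    reference: dict mapping taxon name -> list of probe intensities,
    in the same probe order as the query.
    """
    scored = ((name, pearson(query, fp)) for name, fp in reference.items())
    return max(scored, key=lambda pair: pair[1])
```

In practice one would also set a minimum correlation below which the query is reported as unidentified rather than assigned to the closest reference.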
Rapid Molecular Identification of Pathogenic Yeasts by Pyrosequencing Analysis of 35 Nucleotides of Internal Transcribed Spacer 2

PubMed Central

Borman, Andrew M.; Linton, Christopher J.; Oliver, Debra; Palmer, Michael D.; Szekely, Adrien; Johnson, Elizabeth M.

2010-01-01

Rapid identification of yeast species isolates from clinical samples is particularly important given their innately variable antifungal susceptibility profiles. Here, we have evaluated the utility of pyrosequencing analysis of a portion of the internal transcribed spacer 2 region (ITS2) for identification of pathogenic yeasts. A total of 477 clinical isolates encompassing 43 different fungal species were subjected to pyrosequencing analysis in a strictly blinded study. The molecular identifications produced by pyrosequencing were compared with those obtained using conventional biochemical tests (AUXACOLOR2) and following PCR amplification and sequencing of the D1-D2 portion of the nuclear 28S large rRNA gene. More than 98% (469/477) of isolates encompassing 40 of the 43 fungal species tested were correctly identified by pyrosequencing of only 35 bp of ITS2. Moreover, BLAST searches of the public synchronized databases with the ITS2 pyrosequencing signature sequences revealed that there was only minimal sequence redundancy in the ITS2 under analysis. In all cases, the pyrosequencing signature sequences were unique to the yeast species (or species complex) under investigation. Finally, when pyrosequencing was combined with the Whatman FTA paper technology for the rapid extraction of fungal genomic DNA, molecular identification could be accomplished within 6 h from the time of starting from pure cultures. PMID:20702674
Development and in silico evaluation of large-scale metabolite identification methods using functional group detection for metabolomics

PubMed Central

Mitchell, Joshua M.; Fan, Teresa W.-M.; Lane, Andrew N.; Moseley, Hunter N. B.

2014-01-01

Large-scale identification of metabolites is key to elucidating and modeling metabolism at the systems level. Advances in metabolomics technologies, particularly ultra-high resolution mass spectrometry (MS), enable comprehensive and rapid analysis of metabolites. However, a significant barrier to meaningful data interpretation is the identification of a wide range of metabolites including unknowns and the determination of their role(s) in various metabolic networks. Chemoselective (CS) probes to tag metabolite functional groups combined with high mass accuracy provide additional structural constraints for metabolite identification and quantification. We have developed a novel algorithm, Chemically Aware Substructure Search (CASS), that efficiently detects functional groups within existing metabolite databases, allowing for combined molecular formula and functional group (from CS tagging) queries to aid in metabolite identification without a priori knowledge. Analysis of the isomeric compounds in both Human Metabolome Database (HMDB) and KEGG Ligand demonstrated a high percentage of isomeric molecular formulae (43 and 28%, respectively), indicating the necessity for techniques such as CS-tagging. Furthermore, these two databases have only moderate overlap in molecular formulae. Thus, it is prudent to use multiple databases in metabolite assignment, since each major metabolite database represents different portions of metabolism within the biosphere. In silico analysis of various CS-tagging strategies under different conditions for adduct formation demonstrates that combined FT-MS derived molecular formulae and CS-tagging can uniquely identify up to 71% of KEGG and 37% of the combined KEGG/HMDB database vs. 41 and 17%, respectively, without adduct formation. This difference between database isomer disambiguation highlights the strength of CS-tagging for non-lipid metabolite identification. However, unique identification of complex lipids still needs additional information. PMID:25120557
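The idea of combining a molecular formula with a tagged functional group can be illustrated with a minimal Python sketch. This is not the CASS algorithm: the toy records, the functional-group labels, and the function names `isomeric_formula_fraction` and `query` are all assumptions made for illustration, and "isomeric formula" is read here as a formula shared by more than one compound.

```python
from collections import defaultdict

# Hypothetical toy records: (name, molecular formula, functional groups).
# Real databases such as HMDB or KEGG Ligand are not queried here.
COMPOUNDS = [
    ("glucose", "C6H12O6", {"hydroxyl", "aldehyde"}),
    ("fructose", "C6H12O6", {"hydroxyl", "ketone"}),
    ("alanine", "C3H7NO2", {"amine", "carboxyl"}),
    ("sarcosine", "C3H7NO2", {"amine", "carboxyl"}),
    ("pyruvate", "C3H4O3", {"ketone", "carboxyl"}),
]


def isomeric_formula_fraction(compounds):
    """Fraction of distinct formulae that are shared by more than one compound."""
    by_formula = defaultdict(list)
    for name, formula, _groups in compounds:
        by_formula[formula].append(name)
    shared = sum(1 for names in by_formula.values() if len(names) > 1)
    return shared / len(by_formula)


def query(compounds, formula, tagged_group):
    """Names of compounds matching both the formula and the tagged functional group."""
    return [
        name
        for name, compound_formula, groups in compounds
        if compound_formula == formula and tagged_group in groups
    ]


if __name__ == "__main__":
    print(isomeric_formula_fraction(COMPOUNDS))
    print(query(COMPOUNDS, "C6H12O6", "ketone"))
    print(query(COMPOUNDS, "C3H7NO2", "amine"))
```

In this toy set the ketone tag separates the two C6H12O6 isomers, while the amine tag leaves both C3H7NO2 isomers as candidates, which mirrors the point that functional-group information narrows but does not always resolve an assignment.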
Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)

PubMed Central

Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing

1998-01-01

The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330
Standardized molecular diagnostic tool for the identification of cryptic species within the Bemisia tabaci complex.

PubMed

Elfekih, Samia; Tay, Wee Tek; Gordon, Karl; Court, Leon N; De Barro, Paul J

2018-01-01

The whitefly Bemisia tabaci complex harbours over 40 cryptic species that have been placed in 11 phylogenetically distinct clades based on the molecular characterization of partial mitochondrial DNA COI (mtCOI) gene region. Four cryptic species are currently within the invasive clade, i.e. MED, MEAM1, MEAM2 and IO. Correct identification of these species is a critical step towards implementing reliable measures for plant biosecurity and border protection; however, no standardized B. tabaci-specific primers are currently available, which has caused inconsistencies in the species identification processes. We report three sets of polymerase chain reaction (PCR) primers developed to amplify the mtCOI region which can be used for genotyping MED, MEAM1 and IO species, and tested these primers on 91 MED, 35 MEAM1 and five IO individuals. PCR and sequencing of amplicons identified a total of 21, six and one haplotypes in MED, MEAM1 and IO respectively, of which six haplotypes were new to the B. tabaci database. These primer pairs enabled standardization and robust molecular species identification via mtCOI screening of the targeted invasive cryptic species and will improve quarantine decisions. Use of this diagnostic tool could be extended to other species within the complex. © 2017 Society of Chemical Industry.
A distributed computational search strategy for the identification of diagnostics targets: application to finding aptamer targets for methicillin-resistant staphylococci.

PubMed

Flanagan, Keith; Cockell, Simon; Harwood, Colin; Hallinan, Jennifer; Nakjang, Sirintra; Lawry, Beth; Wipat, Anil

2014-06-30

The rapid and cost-effective identification of bacterial species is crucial, especially for clinical diagnosis and treatment. Peptide aptamers have been shown to be valuable for use as a component of novel, direct detection methods. These small peptides have a number of advantages over antibodies, including greater specificity and longer shelf life. These properties facilitate their use as the detector components of biosensor devices. However, the identification of suitable aptamer targets for particular groups of organisms is challenging. We present a semi-automated processing pipeline for the identification of candidate aptamer targets from whole bacterial genome sequences. The pipeline can be configured to search for protein sequence fragments that uniquely identify a set of strains of interest. The system is also capable of identifying additional organisms that may be of interest due to their possession of protein fragments in common with the initial set. Through the use of Cloud computing technology and distributed databases, our system is capable of scaling with the rapidly growing genome repositories, and consequently of keeping the resulting data sets up-to-date. The system described is also more generically applicable to the discovery of specific targets for other diagnostic approaches such as DNA probes, PCR primers and antibodies.
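The core search step, finding protein sequence fragments that occur in every strain of interest and in no other organism, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fragments are taken to be fixed-length k-mers, the strain names and sequences are invented, and the functions `kmers` and `unique_fragments` are hypothetical rather than part of the published pipeline.

```python
def kmers(seq, k):
    """All length-k substrings of a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def proteome_kmers(proteins, k):
    """Union of k-mers over all proteins of one organism."""
    found = set()
    for protein in proteins:
        found |= kmers(protein, k)
    return found


def unique_fragments(targets, others, k):
    """k-mers present in every target proteome and absent from all other proteomes.

    `targets` and `others` map an organism name to a list of protein sequences;
    `targets` must contain at least one organism.
    """
    shared = set.intersection(*(proteome_kmers(p, k) for p in targets.values()))
    for proteins in others.values():
        shared -= proteome_kmers(proteins, k)
    return shared


# Invented toy proteomes, for illustration only.
TARGETS = {"strainA": ["MKTAYIAK", "GGLVPR"], "strainB": ["MKTAYIAQ"]}
OTHERS = {"strainC": ["AMKTAYH"]}

if __name__ == "__main__":
    print(sorted(unique_fragments(TARGETS, OTHERS, k=5)))
```

A set-based formulation like this is easy to distribute, since each proteome's k-mer set can be computed independently before the intersection and subtraction steps.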
Specialized microbial databases for inductive exploration of microbial genome sequences

PubMed Central

Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine

2005-01-01

Background: The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition, they provide a weekly update of searches against the world-wide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion: This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList for Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList for Staphylococcus epidermidis), associated with related organisms for comparison. PMID:15698474

HAEdb: a novel interactive, locus-specific mutation database for the C1 inhibitor gene.

PubMed

Kalmár, Lajos; Hegedüs, Tamás; Farkas, Henriette; Nagy, Melinda; Tordai, Attila

2005-01-01

Hereditary angioneurotic edema (HAE) is an autosomal dominant disorder characterized by episodic local subcutaneous and submucosal edema and is caused by the deficiency of the activated C1 esterase inhibitor protein (C1-INH or C1INH; approved gene symbol SERPING1). Published C1-INH mutations are represented in large universal databases (e.g., OMIM, HGMD), but these databases update their data rather infrequently, they are not interactive, and they do not allow searches according to different criteria. The HAEdb, a C1-INH gene mutation database (http://hae.biomembrane.hu) was created to contribute to the following expectations: 1) help the comprehensive collection of information on genetic alterations of the C1-INH gene; 2) create a database in which data can be searched and compared according to several flexible criteria; and 3) provide additional help in new mutation identification. The website uses MySQL, an open-source, multithreaded, relational database management system. The user-friendly graphical interface was written in the PHP web programming language. The website consists of two main parts, the freely browsable search function, and the password-protected data deposition function. Mutations of the C1-INH gene are divided in two parts: gross mutations involving DNA fragments >1 kb, and micro mutations encompassing all non-gross mutations. Several attributes (e.g., affected exon, molecular consequence, family history) are collected for each mutation in a standardized form. This database may facilitate future comprehensive analyses of C1-INH mutations and also provide regular help for molecular diagnostic testing of HAE patients in different centers.
Identification of apple cultivars on the basis of simple sequence repeat markers.

PubMed

Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y

2014-09-12

DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers have not been used effectively for the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers for separating individual plants. The CID was designed so that each PCR generates a polymorphic marker that directly separates the cultivar samples at each step. Therefore, it can be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon), were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributes to studies on germplasm resources and the seedling industry in fruit trees.
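The stepwise separation behind a cultivar identification diagram can be sketched as repeated splitting of cultivar groups by banding pattern. The sketch below is an assumption-laden illustration, not the authors' procedure: the primer names (`SSR1` to `SSR3`), the single-letter banding patterns, the choice of cultivars, and the function `build_cid` are all hypothetical.

```python
def build_cid(cultivars, patterns, primers):
    """Split cultivars step by step using one primer's banding pattern per step.

    patterns[primer][cultivar] is that cultivar's banding pattern for the primer.
    Returns the informative primers used and the final groups.
    """
    groups = [list(cultivars)]
    used = []
    for primer in primers:
        if all(len(group) == 1 for group in groups):
            break  # every cultivar is already separated
        new_groups = []
        for group in groups:
            split = {}
            for cultivar in group:
                split.setdefault(patterns[primer][cultivar], []).append(cultivar)
            new_groups.extend(split.values())
        if len(new_groups) > len(groups):
            used.append(primer)  # keep only primers that separated something
        groups = new_groups
    return used, groups


# Invented banding patterns, for illustration only.
CULTIVARS = ["Fuji", "Gala", "Telamon", "Jonagold"]
PATTERNS = {
    "SSR1": {"Fuji": "A", "Gala": "A", "Telamon": "B", "Jonagold": "B"},
    "SSR2": {"Fuji": "A", "Gala": "A", "Telamon": "A", "Jonagold": "A"},
    "SSR3": {"Fuji": "A", "Gala": "B", "Telamon": "A", "Jonagold": "B"},
}

if __name__ == "__main__":
    print(build_cid(CULTIVARS, PATTERNS, ["SSR1", "SSR2", "SSR3"]))
```

Here the monomorphic primer `SSR2` separates nothing and is skipped, which reflects why only reproducible, polymorphic primers are useful for building the diagram.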
Mass spectrometry-based protein identification by integrating de novo sequencing with database searching.

PubMed

Wang, Penghao; Wilson, Susan R

2013-01-01

Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach first infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then submits these sequences to a database search to see if a close peptide match can be found. However, the current implementation of this integrative approach has several limitations. First, simplistic de novo sequencing is applied and only very short sequence tags are used. Second, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Third, in these methods the integrated de novo sequencing makes a limited contribution to the scoring model, which is still largely based on database searching. We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.
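The limitation of exact tag matching can be illustrated with a small Python sketch that searches a sequence tag against a protein database while tolerating a bounded number of mismatches. This is not the authors' method: the function `search_tag`, the mismatch-count scoring, and the two toy sequences are assumptions chosen only to show why an error-tolerant search recovers hits that an exact search misses.

```python
def search_tag(tag, database, max_mismatches=1):
    """Find database windows that match a de novo tag with few mismatches.

    Returns (protein_id, start_index, mismatch_count) tuples, best matches first.
    """
    hits = []
    for protein_id, seq in database.items():
        for start in range(len(seq) - len(tag) + 1):
            window = seq[start:start + len(tag)]
            mismatches = sum(a != b for a, b in zip(tag, window))
            if mismatches <= max_mismatches:
                hits.append((protein_id, start, mismatches))
    return sorted(hits, key=lambda hit: hit[2])


# Invented toy database, for illustration only.
DATABASE = {
    "P1": "MKWVTFISLLLLFSSAYSR",
    "P2": "MKTAYIAKQR",
}

if __name__ == "__main__":
    # The tag reads L where the database has I; the two residues have the same
    # mass, so this is a typical de novo sequencing ambiguity.
    print(search_tag("TFLSL", DATABASE, max_mismatches=0))
    print(search_tag("TFLSL", DATABASE, max_mismatches=1))
```

A practical scoring model would weight mismatches by mass difference rather than counting them equally; the plain count is used here to keep the sketch short.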
Characterisation of Asian Snakehead Murrel Channa striata (Channidae) in Malaysia: An Insight into Molecular Data and Morphological Approach

PubMed Central

Song, Li Min; Munian, Kaviarasu; Abd Rashid, Zulkafli; Bhassu, Subha

2013-01-01

Conservation is imperative for the Asian snakeheads Channa striata, as the species has been overfished due to its high market demand. Using maternal markers (mitochondrial cytochrome c oxidase subunit 1 gene (COI)), we discovered that evolutionary forces that drove population divergence did not show any match between the genetic and morphological divergence pattern. However, there is evidence of incomplete divergence patterns between the Borneo population and the populations from Peninsular Malaysia. This supports the claim of historical coalescence of C. striata during Pleistocene glaciations. Ecological heterogeneity caused high phenotypic variance and was not correlated with genetic variance among the populations. Spatial conservation assessments are required to manage different stock units. Results on DNA barcoding show no evidence of cryptic species in C. striata in Malaysia. The newly obtained sequences add to the database of freshwater fish DNA barcodes and in future will provide information relevant to identification of species. PMID:24396312
Measuring the Electronic Properties of DNA-Specific Schottky Diodes Towards Detecting and Identifying Basidiomycetes DNA

PubMed Central

Periasamy, Vengadesh; Rizan, Nastaran; Al-Ta’ii, Hassan Maktuff Jaber; Tan, Yee Shin; Tajuddin, Hairul Annuar; Iwamoto, Mitsumasa

2016-01-01

The discovery of semiconducting behavior of deoxyribonucleic acid (DNA) has resulted in a large body of literature on DNA electronics. Sequence-specific electronic response provides a platform towards understanding charge transfer mechanism and therefore the electronic properties of DNA. It is possible to utilize these characteristic properties to identify/detect DNA. In this current work, we demonstrate a novel method of DNA-based identification of basidiomycetes using current-voltage (I-V) profiles obtained from DNA-specific Schottky barrier diodes. Electronic properties such as ideality factor, barrier height, shunt resistance, series resistance, turn-on voltage, knee-voltage, breakdown voltage and breakdown current were calculated and used to quantify the identification process as compared to morphological and molecular characterization techniques. The use of these techniques is necessary in order to study biodiversity, but sometimes it can be misleading and unreliable and is not sufficiently useful for the identification of fungal genera. Many of these methods have failed when it comes to identification of closely related species of certain genera such as Pleurotus. Our electronic profiles, in both the negative and positive bias regions, were however found to be highly characteristic according to the base-pair sequences. We believe that this simple, low-cost and practical method could be useful towards identifying and detecting DNA in biotechnology and pathology. PMID:27435636
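The abstract does not state how the diode parameters were extracted. A common textbook route, assumed here, is the thermionic emission model, in which the forward current well above kT/q follows I ≈ I0·exp(qV/(n·k·T)), so the ideality factor n comes from the slope of ln(I) against V, and the barrier height follows from I0 = A·A*·T²·exp(−q·Φ_B/(k·T)). The Python sketch below applies that model to synthetic data; the function names, the contact area, and the Richardson constant value are illustrative assumptions, not values from the study.

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over elementary charge, in V/K


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def diode_parameters(voltages, currents, temperature=300.0):
    """Ideality factor and saturation current from forward-bias I-V points."""
    thermal_voltage = K_B_OVER_Q * temperature
    slope, intercept = fit_line(voltages, [math.log(i) for i in currents])
    return 1.0 / (slope * thermal_voltage), math.exp(intercept)


def barrier_height(i0, area_cm2, richardson, temperature=300.0):
    """Barrier height in volts; area and Richardson constant are assumed inputs."""
    thermal_voltage = K_B_OVER_Q * temperature
    return thermal_voltage * math.log(area_cm2 * richardson * temperature ** 2 / i0)


def synthetic_curve(n, i0, voltages, temperature=300.0):
    """Synthetic forward currents that follow the assumed exponential model."""
    thermal_voltage = K_B_OVER_Q * temperature
    return [i0 * math.exp(v / (n * thermal_voltage)) for v in voltages]


if __name__ == "__main__":
    volts = [0.20, 0.25, 0.30, 0.35, 0.40]
    amps = synthetic_curve(1.8, 1e-9, volts)
    n_fit, i0_fit = diode_parameters(volts, amps)
    print(n_fit, i0_fit)
    # Hypothetical area (cm^2) and Richardson constant (A cm^-2 K^-2).
    print(barrier_height(i0_fit, area_cm2=0.01, richardson=32.0))
```

With real measurements the fit should be restricted to the linear part of the ln(I) against V curve, since series resistance bends the curve at higher bias.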
Measuring the Electronic Properties of DNA-Specific Schottky Diodes Towards Detecting and Identifying Basidiomycetes DNA

NASA Astrophysics Data System (ADS)

Periasamy, Vengadesh; Rizan, Nastaran; Al-Ta'Ii, Hassan Maktuff Jaber; Tan, Yee Shin; Tajuddin, Hairul Annuar; Iwamoto, Mitsumasa

2016-07-01

The discovery of semiconducting behavior of deoxyribonucleic acid (DNA) has resulted in a large body of literature on DNA electronics. Sequence-specific electronic response provides a platform towards understanding charge transfer mechanism and therefore the electronic properties of DNA. It is possible to utilize these characteristic properties to identify/detect DNA. In this current work, we demonstrate a novel method of DNA-based identification of basidiomycetes using current-voltage (I-V) profiles obtained from DNA-specific Schottky barrier diodes. Electronic properties such as ideality factor, barrier height, shunt resistance, series resistance, turn-on voltage, knee-voltage, breakdown voltage and breakdown current were calculated and used to quantify the identification process as compared to morphological and molecular characterization techniques. The use of these techniques is necessary in order to study biodiversity, but sometimes it can be misleading and unreliable and is not sufficiently useful for the identification of fungal genera. Many of these methods have failed when it comes to identification of closely related species of certain genera such as Pleurotus. Our electronic profiles, in both the negative and positive bias regions, were however found to be highly characteristic according to the base-pair sequences. We believe that this simple, low-cost and practical method could be useful towards identifying and detecting DNA in biotechnology and pathology.
A grass molecular identification system for forensic botany: a critical evaluation of the strengths and limitations.

PubMed

Ward, Jodie; Gilmore, Simon R; Robertson, James; Peakall, Rod

2009-11-01

Plant material is frequently encountered in criminal investigations but often overlooked as potential evidence. We designed a DNA-based molecular identification system for 100 Australian grasses that consisted of a series of polymerase chain reaction assays that enabled the progressive identification of grasses to different taxonomic levels. The identification system was based on DNA sequence variation at four chloroplast and two mitochondrial loci. Seventeen informative indels and 68 single-nucleotide polymorphisms were utilized as molecular markers for subfamily to species-level identification. Identifying an unknown sample required a minimum of four markers at the subfamily level and nine markers at the species level. The accuracy of the system was confirmed by blind tests. We have demonstrated "proof of concept" of a molecular identification system for trace botanical samples. Our evaluation suggests that the adoption of a system that combines this approach with DNA sequencing could assist the morphological identification of grasses found as forensic evidence.
Progress and challenges in bioinformatics approaches for enhancer identification

PubMed Central

Kleftogiannis, Dimitrios; Kalnis, Panos

2016-01-01

Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration. PMID:26634919
Vector soup: high-throughput identification of Neotropical phlebotomine sand flies using metabarcoding.

PubMed

Kocher, Arthur; Gantier, Jean-Charles; Gaborit, Pascal; Zinger, Lucie; Holota, Helene; Valiere, Sophie; Dusfour, Isabelle; Girod, Romain; Bañuls, Anne-Laure; Murienne, Jerome

2017-03-01

Phlebotomine sand flies are haematophagous dipterans of primary medical importance. They represent the only proven vectors of leishmaniasis worldwide and are involved in the transmission of various other pathogens. Studying the ecology of sand flies is crucial to understand the epidemiology of leishmaniasis and further control this disease. A major limitation in this regard is that traditional morphology-based methods for sand fly species identification are time-consuming and require taxonomic expertise. DNA metabarcoding holds great promise in overcoming this issue by allowing the identification of multiple species from a single bulk sample. Here, we assessed the reliability of a short insect metabarcode located in the mitochondrial 16S rRNA for the identification of Neotropical sand flies, and constructed a reference database for 40 species found in French Guiana. Then, we conducted a metabarcoding experiment on sand fly mixtures of known content and showed that the method allows an accurate identification of specimens in pools. Finally, we applied metabarcoding to field samples caught in a 1-ha forest plot in French Guiana. Besides providing reliable molecular data for species-level assignments of phlebotomine sand flies, our study proves the efficiency of metabarcoding based on the mitochondrial 16S rRNA for studying sand fly diversity from bulk samples. The application of this high-throughput identification procedure to field samples can provide great opportunities for vector monitoring and eco-epidemiological studies. © 2016 John Wiley & Sons Ltd.
MitoBreak: the mitochondrial DNA breakpoints database.

PubMed

Damas, Joana; Carneiro, João; Amorim, António; Pereira, Filipe

2014-01-01

Mitochondrial DNA (mtDNA) rearrangements are key events in the development of many diseases. Investigations of mtDNA regions affected by rearrangements (i.e. breakpoints) can lead to important discoveries about rearrangement mechanisms and can offer important clues about the causes of mitochondrial diseases. Here, we present the mitochondrial DNA breakpoints database (MitoBreak; http://mitobreak.portugene.com), a free, web-accessible comprehensive list of breakpoints from three classes of somatic mtDNA rearrangements: circular deleted (deletions), circular partially duplicated (duplications) and linear mtDNAs. Currently, MitoBreak contains >1400 mtDNA rearrangements from seven species (Homo sapiens, Mus musculus, Rattus norvegicus, Macaca mulatta, Drosophila melanogaster, Caenorhabditis elegans and Podospora anserina) and their associated phenotypic information collected from nearly 400 publications. The database allows researchers to perform multiple types of data analyses through user-friendly interfaces with full or partial datasets. It also permits the download of curated data and the submission of new mtDNA rearrangements. For each reported case, MitoBreak also documents the precise breakpoint positions, junction sequences, disease or associated symptoms and links to the related publications, providing a useful resource to study the causes and consequences of mtDNA structural alterations.
Pharmacogenomics and its potential impact on drug and formulation development.

PubMed

Regnstrom, Karin; Burgess, Diane J

2005-01-01

Recent advances in genomic research have provided the basis for new insights into the importance of genetic and genomic markers during the different stages of drug development. A new field of research, pharmacogenomics, which studies the relationship between drug effects and the genome, has emerged. Structural pharmacogenomics maps the complete DNA sequences of whole genomes (genotypes) including individual variations, and functional pharmacogenomics assesses the expression levels of thousands of genes in one single experiment. Together, these two areas of pharmacogenomics have generated massive databases, which have become a challenge for the research field of informatics and have fostered a new branch of research, bioinformatics. If skillfully used, the databases generated by pharmacogenomics together with data mining on the Web promise to improve the drug development process in a variety of areas: identification of drug targets, evaluation of toxicity, classification of diseases, evaluation of formulations, assessment of drug response and treatment, post-marketing applications, and development of personalized medicines.
The Real maccoyii: Identifying Tuna Sushi with DNA Barcodes – Contrasting Characteristic Attributes and Genetic Distances

PubMed Central

Lowenstein, Jacob H.; Amato, George; Kolokotronis, Sergios-Orestis

2009-01-01

Background: The use of DNA barcodes for the identification of described species is one of the least controversial and most promising applications of barcoding. There is no consensus, however, as to what constitutes an appropriate identification standard, and most barcoding efforts simply attempt to pair a query sequence with reference sequences and deem identification successful if it falls within the bounds of some pre-established cutoffs using genetic distance. Since the Renaissance, however, most biological classification schemes have relied on the use of diagnostic characters to identify and place species. Methodology/Principal Findings: Here we developed a cytochrome c oxidase subunit I character-based key for the identification of all tuna species of the genus Thunnus, and compared its performance with distance-based measures for identification of 68 samples of tuna sushi purchased from 31 restaurants in Manhattan (New York City) and Denver, Colorado. Both the character-based key and GenBank BLAST successfully identified 100% of the tuna samples, while the Barcode of Life Database (BOLD), genetic distance thresholds, and neighbor-joining phylogenetic tree building performed poorly in terms of species identification. A piece of tuna sushi has the potential to be an endangered species, a fraud, or a health hazard. All three of these cases were uncovered in this study. Nineteen restaurant establishments were unable to clarify or misrepresented what species they sold. Five out of nine samples sold as a variant of “white tuna” were not albacore (T. alalunga), but escolar (Lepidocybium flavobrunneum), a gempylid species banned for sale in Italy and Japan due to health concerns. Nineteen samples were northern bluefin tuna (T. thynnus) or the critically endangered southern bluefin tuna (T. maccoyii), though nine restaurants that sold these species did not state these species on their menus. Conclusions/Significance: The Convention on International Trade in Endangered Species (CITES) requires that listed species must be identifiable in trade. This research fulfills this requirement for tuna, and supports the nomination of northern bluefin tuna for CITES listing in 2010. PMID:19924239
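The contrast between distance-based and character-based identification can be made concrete with a short Python sketch. Everything in it is an illustrative assumption: the ten-base sequences, the placeholder species names, the single diagnostic position, the 5% cutoff, and the functions `identify_by_distance` and `identify_by_characters` are invented and do not reproduce the published key for Thunnus.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify_by_distance(query, references, cutoff):
    """Nearest reference species, or None if even the best match exceeds the cutoff."""
    best = min(references, key=lambda name: p_distance(query, references[name]))
    return best if p_distance(query, references[best]) <= cutoff else None


def identify_by_characters(query, key):
    """Species whose diagnostic characters all match the query, if exactly one does.

    key maps a species to {alignment position: diagnostic base}.
    """
    matches = [
        species
        for species, characters in key.items()
        if all(query[pos] == base for pos, base in characters.items())
    ]
    return matches[0] if len(matches) == 1 else None


# Invented toy data, for illustration only.
REFERENCES = {"species_A": "ACGTACGTAC", "species_B": "ACGTACGTAT"}
KEY = {"species_A": {9: "C"}, "species_B": {9: "T"}}
QUERY = "ACGAACGTAT"  # carries one non-diagnostic variant at position 3

if __name__ == "__main__":
    print(identify_by_distance(QUERY, REFERENCES, cutoff=0.05))
    print(identify_by_characters(QUERY, KEY))
```

In this toy case a single non-diagnostic variant pushes the query past the distance cutoff, so the distance method returns no identification, while the diagnostic character at position 9 still assigns it to `species_B`.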
Wolbachia and DNA barcoding insects: patterns, potential, and problems.

PubMed

Smith, M Alex; Bertrand, Claudia; Crosby, Kate; Eveleigh, Eldon S; Fernandez-Triana, Jose; Fisher, Brian L; Gibbs, Jason; Hajibabaei, Mehrdad; Hallwachs, Winnie; Hind, Katharine; Hrcek, Jan; Huang, Da-Wei; Janda, Milan; Janzen, Daniel H; Li, Yanwei; Miller, Scott E; Packer, Laurence; Quicke, Donald; Ratnasingham, Sujeevan; Rodriguez, Josephine; Rougerie, Rodolphe; Shaw, Mark R; Sheffield, Cory; Stahlhut, Julie K; Steinke, Dirk; Whitfield, James; Wood, Monty; Zhou, Xin

2012-01-01

Wolbachia is a genus of bacterial endosymbionts that impacts the breeding systems of their hosts. Wolbachia can confuse the patterns of mitochondrial variation, including DNA barcodes, because it influences the pathways through which mitochondria are inherited. We examined the extent to which these endosymbionts are detected in routine DNA barcoding, assessed their impact upon the insect sequence divergence and identification accuracy, and considered the variation present in Wolbachia COI. Using both standard PCR assays (Wolbachia surface coding protein, wsp) and bacterial COI fragments, we found evidence of Wolbachia in insect total genomic extracts created for DNA barcoding library construction. When >2 million insect COI trace files were examined on the Barcode of Life Datasystem (BOLD), Wolbachia COI was present in 0.16% of the cases. It is possible to generate Wolbachia COI using standard insect primers; however, that amplicon was never confused with the COI of the host. Wolbachia alleles recovered were predominantly Supergroup A and were broadly distributed geographically and phylogenetically. We conclude that the presence of the Wolbachia DNA in total genomic extracts made from insects is unlikely to compromise the accuracy of the DNA barcode library; in fact, the ability to query this DNA library (the database and the extracts) for endosymbionts is one of the ancillary benefits of such a large scale endeavor, for which we provide several examples. It is our conclusion that regular assays for Wolbachia presence and type can, and should, be adopted by large scale insect barcoding initiatives. While COI is one of the five multi-locus sequence typing (MLST) genes used for categorizing Wolbachia, there is limited overlap with the eukaryotic DNA barcode region.
Launching the Greek forensic DNA database. The legal framework and arising ethical issues.

PubMed

Voultsos, Polychronis; Njau, Samuel; Tairis, Nikolaos; Psaroulis, Dimitrios; Kovatsi, Leda

2011-11-01

Since the creation of the first national DNA database in Europe in 1995, many European countries have enacted laws for initiating and regulating their own databases. The Greek government passed a law in 2008, by which the National DNA Database of Greece was founded and regulated. According to this law, only DNA profiles from convicted criminals were recorded. Nevertheless, a year later, in 2009, the law was amended to permit the creation of an expanded database including innocent people and children. Unfortunately, the new law is very vague in many aspects and does not respect the principle of proportionality. Therefore, in our opinion, it will soon need to be re-amended. Furthermore, prior to the passing of the new law, there was no debate with the community itself in order to clarify what system would best suit Greece and what the citizens would be willing to accept. We present the current legal framework in Greece, we highlight issues that need to be clarified and we discuss possible ethical issues that may arise. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Evaluation of suitable DNA regions for molecular identification of high value medicinal plants in genus Kaempferia.

PubMed

Osathanunkul, Maslin; Dheeranupattana, Srisulak; Rotarayanont, Siriphron; Sookkhee, Siriwoot; Osathanunkul, Khukrit; Madesis, Panagiotis

2017-12-02

DNA barcoding coupled with high resolution melting (Bar-HRM) is an emerging method for species discrimination based on DNA dissociation kinetics. The aim of this work was to evaluate the suitability of different primer sets, derived from selected DNA regions, for Bar-HRM analysis of species in Kaempferia (Zingiberaceae). Four primer pairs were evaluated (rbcL, rpoC, trnL and ITS1). It was observed that the ITS1 barcode was the most useful DNA barcoding region overall for species discrimination out of all of the regions and primers assessed. Thus, the primer pair derived from the ITS1 region was the single most effective for the identification of the tested species, whereas the rbcL primer pair gave the lowest resolution. The Bar-HRM method developed here would be useful not only for identification of Kaempferia plant specimens lacking essential parts for morphological identification but also for authenticating products in powdered form, in particular those of the high value medicinal species Kaempferia parviflora.
DNA barcode reference data for the Korean herpetofauna and their applications.

PubMed

Jeong, Tae Jin; Jun, Jumin; Han, Sanghoon; Kim, Hyun Tae; Oh, Kyunghee; Kwak, Myounghai

2013-11-01

Recently, amphibians and reptiles have drawn attention because of declines in species and populations caused mainly by habitat loss, overexploitation and climate change. This study constructed a DNA barcode database for the Korean herpetofauna, including all the recorded amphibians and 68% of the recorded reptiles, to provide a useful, standardized tool for species identification in monitoring and management. A total of 103 individuals from 18 amphibian and 17 reptile species were used to generate barcode sequences using partial sequences of the mitochondrial cytochrome c oxidase subunit I (COI) gene and to compare it with other suggested barcode loci. Comparing 16S rRNA, cytochrome b (Cytb) and COI for amphibians and 12S rRNA, Cytb and COI for reptiles, our results revealed that COI is better than the other markers in terms of a high level of sequence variation without length variation and moderate amplification success. Although the COI marker had no clear barcoding gap because of the high level of intraspecific variation, all of the analysed individuals from the same species clustered together in a neighbour-joining tree. High intraspecific variation suggests the possibility of cryptic species. Finally, using this database, confiscated snakes were identified as Elaphe schrenckii, designated as endangered in Korea and a food contaminant was identified as the lizard Takydromus amurensis. © 2013 John Wiley & Sons Ltd.
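The "barcoding gap" mentioned in this abstract can be illustrated with a minimal sketch: a gap exists when the smallest distance between species exceeds the largest distance within a species. The species labels and sequences below are invented toy data, and the use of uncorrected p-distance is an assumption made for illustration; the distance model used in the study is not stated in the abstract.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(records):
    """records: list of (species, sequence) pairs.

    Returns (max_intraspecific, min_interspecific) pairwise distances.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter, default=float("inf"))


# Hypothetical toy alignment, not data from the study.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACGTTT"),
    ("species_B", "ACGAACGTTC"),
]

max_intra, min_inter = barcoding_gap(records)
has_gap = min_inter > max_intra
print(f"max intraspecific = {max_intra:.2f}, min interspecific = {min_inter:.2f}, gap = {has_gap}")
```

When intraspecific variation is high, as reported here for COI, `max_intra` can approach or exceed `min_inter` and the gap disappears, even though individuals may still cluster by species in a tree.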
DNAtraffic--a new database for systems biology of DNA dynamics during the cell life.

PubMed

Kuchta, Krzysztof; Barszcz, Daniela; Grzesiuk, Elzbieta; Pomorski, Pawel; Krwawicz, Joanna

2012-01-01

DNAtraffic (http://dnatraffic.ibb.waw.pl/) is dedicated to be a unique comprehensive and richly annotated database of genome dynamics during the cell life. It contains extensive data on the nomenclature, ontology, structure and function of proteins related to the DNA integrity mechanisms such as chromatin remodeling, histone modifications, DNA repair and damage response from eight organisms: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Escherichia coli and Arabidopsis thaliana. DNAtraffic contains comprehensive information on the diseases related to the assembled human proteins. DNAtraffic is richly annotated in the systemic information on the nomenclature, chemistry and structure of DNA damage and their sources, including environmental agents or commonly used drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of the DNAtraffic database is to create the first platform of the combinatorial complexity of DNA network analysis. The database includes illustrations of pathways, damage, proteins and drugs. Since DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. Our database represents the result of the manual annotation work aimed at making DNAtraffic much more useful for a wide range of systems biology applications.
DNAtraffic—a new database for systems biology of DNA dynamics during the cell life

PubMed Central

Kuchta, Krzysztof; Barszcz, Daniela; Grzesiuk, Elzbieta; Pomorski, Pawel; Krwawicz, Joanna

2012-01-01

DNAtraffic (http://dnatraffic.ibb.waw.pl/) is dedicated to be a unique comprehensive and richly annotated database of genome dynamics during the cell life. It contains extensive data on the nomenclature, ontology, structure and function of proteins related to the DNA integrity mechanisms such as chromatin remodeling, histone modifications, DNA repair and damage response from eight organisms: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Escherichia coli and Arabidopsis thaliana. DNAtraffic contains comprehensive information on the diseases related to the assembled human proteins. DNAtraffic is richly annotated in the systemic information on the nomenclature, chemistry and structure of DNA damage and their sources, including environmental agents or commonly used drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of the DNAtraffic database is to create the first platform of the combinatorial complexity of DNA network analysis. The database includes illustrations of pathways, damage, proteins and drugs. Since DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. Our database represents the result of the manual annotation work aimed at making DNAtraffic much more useful for a wide range of systems biology applications. PMID:22110027
The effect on cadaver blood DNA identification by the use of targeted and whole body post-mortem computed tomography angiography.

PubMed

Rutty, Guy N; Barber, Jade; Amoroso, Jasmin; Morgan, Bruno; Graham, Eleanor A M

2013-12-01

Post-mortem computed tomography angiography (PMCTA) involves the injection of contrast agents. This could have both a dilution effect on biological fluid samples and could affect subsequent post-contrast analytical laboratory processes. We undertook a small sample study of 10 targeted and 10 whole body PMCTA cases to consider whether or not these two methods of PMCTA could affect post-PMCTA cadaver blood based DNA identification. We used standard methodology to examine DNA from blood samples obtained before and after the PMCTA procedure. We illustrate that neither of these PMCTA methods had an effect on the alleles called following short tandem repeat based DNA profiling, and therefore the ability to undertake post-PMCTA blood based DNA identification.
Identification and characterization of the zebrafish glutathione S-transferase Pi-1.

PubMed

Abunnaja, Maryam S; Kurogi, Katsuhisa; Mohammed, Yasir I; Sakakibara, Yoichi; Suiko, Masahito; Hassoun, Ezdihar A; Liu, Ming-Cheh

2017-10-01

Zebrafish has in recent years emerged as a popular vertebrate model for use in pharmacological and toxicological studies. While there have been sporadic studies on the zebrafish glutathione S-transferases (GSTs), the zebrafish GST gene superfamily has yet to be fully elucidated. We report here the identification of 15 zebrafish cytosolic GST genes in the NCBI GenBank database and the expression, purification, and enzymatic characterization of the zebrafish cytosolic GST Pi-1 (GSTP1). The cDNA encoding the zebrafish GSTP1 was cloned from a 3-month-old female zebrafish, expressed in Escherichia coli host cells, and purified. Purified GSTP1 displayed glutathione-conjugating activity toward 1-chloro-2,4-dinitrobenzene as a representative substrate. The enzymatic characteristics of the zebrafish GSTP1, including pH-dependency, effects of metal cations, and kinetic parameters, were studied. Moreover, the expression of zebrafish GSTP1 at different developmental stages during embryogenesis, throughout larval development, and on to maturity was examined. © 2017 Wiley Periodicals, Inc.

IDENTIFICATION OF STEREOCHEMICAL CONFIGURATION OF CYCLOPENTA[CD]PYRENE-DNA ADDUCTS IN STRAIN A/J MOUSE LUNG AND C3H10T1/2CL8

EPA Science Inventory

The definitive identification of stereochemical configurations of DNA adducts detected by 32P-postlabeling requires co-chromatography of adducts with synthetic chromatographic standards. Four major and several minor DNA adducts are formed by cyclopenta[cd]pyrene (CPP) in strain A...
Serogroup-level resolution of the “Super-7” Shiga toxin-producing Escherichia coli using nanopore single-molecule DNA sequencing

USDA-ARS's Scientific Manuscript database

DNA sequencing and other DNA-based methods, such as PCR, are now broadly used for detection and identification of bacterial foodborne pathogens. For the identification of foodborne bacterial pathogens, it is important to make taxonomic assignments to the species, or even subspecies level. Long-read ...
OnTheFly: a database of Drosophila melanogaster transcription factors and their binding sites.

PubMed

Shazman, Shula; Lee, Hunjoong; Socol, Yakov; Mann, Richard S; Honig, Barry

2014-01-01

We present OnTheFly (http://bhapp.c2b2.columbia.edu/OnTheFly/index.php), a database comprising a systematic collection of transcription factors (TFs) of Drosophila melanogaster and their DNA-binding sites. TFs predicted in the Drosophila melanogaster genome are annotated and classified and their structures, obtained via experiment or homology models, are provided. All known preferred TF DNA-binding sites obtained from the B1H, DNase I and SELEX methodologies are presented. DNA shape parameters predicted for these sites are obtained from a high throughput server or from crystal structures of protein-DNA complexes where available. An important feature of the database is that all DNA-binding domains and their binding sites are fully annotated in a eukaryote using structural criteria and evolutionary homology. OnTheFly thus provides a comprehensive view of TFs and their binding sites that will be a valuable resource for deciphering non-coding regulatory DNA.
STRBase: a short tandem repeat DNA database for the human identity testing community

PubMed Central

Ruitberg, Christian M.; Reeder, Dennis J.; Butler, John M.

2001-01-01

The National Institute of Standards and Technology (NIST) has compiled and maintained a Short Tandem Repeat DNA Internet Database (http://www.cstl.nist.gov/biotech/strbase/) since 1997 commonly referred to as STRBase. This database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. STRBase consolidates and organizes the abundant literature on this subject to facilitate on-going efforts in DNA typing. Observed alleles and annotated sequence for each STR locus are described along with a review of STR analysis technologies. Additionally, commercially available STR multiplex kits are described, published polymerase chain reaction (PCR) primer sequences are reported, and validation studies conducted by a number of forensic laboratories are listed. To supplement the technical information, addresses for scientists and hyperlinks to organizations working in this area are available, along with the comprehensive reference list of over 1300 publications on STRs used for DNA typing purposes. PMID:11125125
The repetitive landscape of the chicken genome.

PubMed

Wicker, Thomas; Robertson, Jon S; Schulze, Stefan R; Feltus, F Alex; Magrini, Vincent; Morrison, Jason A; Mardis, Elaine R; Wilson, Richard K; Peterson, Daniel G; Paterson, Andrew H; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available.
The repetitive landscape of the chicken genome

PubMed Central

Wicker, Thomas; Robertson, Jon S.; Schulze, Stefan R.; Feltus, F. Alex; Magrini, Vincent; Morrison, Jason A.; Mardis, Elaine R.; Wilson, Richard K.; Peterson, Daniel G.; Paterson, Andrew H.; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available. PMID:15256510
oriTfinder: a web-based tool for the identification of origin of transfers in DNA sequences of bacterial mobile genetic elements.

PubMed

Li, Xiaobin; Xie, Yingzhou; Liu, Meng; Tai, Cui; Sun, Jingyong; Deng, Zixin; Ou, Hong-Yu

2018-05-04

oriTfinder is a web server that facilitates the rapid identification of the origin of transfer site (oriT) of a conjugative plasmid or chromosome-borne integrative and conjugative element. The utilized back-end database oriTDB was built upon more than one thousand known oriT regions of bacterial mobile genetic elements (MGEs) as well as the known MGE-encoding relaxases and type IV coupling proteins (T4CP). With a combination of similarity searches for the oriTDB-archived oriT nucleotide sequences and the co-localization of the flanking relaxase homologous genes, the oriTfinder can predict the oriT region with high accuracy in the DNA sequence of a bacterial plasmid or chromosome in minutes. The server also detects the other transfer-related modules, including the potential relaxase gene, T4CP gene and the type IV secretion system gene cluster, and the putative genes coding for virulence factors and acquired antibiotic resistance determinants. oriTfinder may contribute to meeting the increasing demands of re-annotations for bacterial conjugative, mobilizable or non-transferable elements and aid in the rapid risk accession of disease-relevant trait dissemination in pathogenic bacteria of interest. oriTfinder is freely available to all users without any login requirement at http://bioinfo-mml.sjtu.edu.cn/oriTfinder.
Identification of desiccation tolerance transcripts potentially involved in rape (Brassica napus L.) seeds development and germination.

PubMed

Lang, Sirui; Liu, Xiaoxia; Ma, Gang; Lan, QinYing; Wang, Xiaofeng

2014-10-01

To investigate regulatory processes and protective mechanisms leading to desiccation tolerance (DT) in seeds, cDNA amplified fragment length polymorphism (cDNA-AFLP) in conjunction with 128 primer combinations was used to detect differential gene expression in rape seeds in response to DT during seed development and germination. We obtained approximately 8000 transcript-derived fragments (TDFs), of which 394 TDFs with differential expression patterns ("sustained expression", "up-regulated", "couple with seed DT", and "down-regulated") were excised from gels and re-amplified by polymerase chain reaction (PCR). After sequencing and comparison with the National Center for Biotechnology Information database, 176 TDFs presented significant similarity with known genes that could be classified into the following categories: metabolism and energy, stress resistance and defense, storage, signal transduction, and other functional categories. Using semiquantitative reverse-transcription PCR and real-time PCR approaches, the significance of the differences was further confirmed in fresh seeds and dehydrated seeds. The genes that encode superoxide dismutase, peroxiredoxin, caleosin, oleosin S3, steroleosin, late embryogenesis abundant protein, glutathione reductase, β-glucosidase, S23 transcriptional repressor, and some heat-shock proteins could be associated with DT. The results of this study will aid in the identification of candidate genes for future experiments that seek to understand seed DT. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
In search for the grave of 100 Poles executed on March 20, 1942 in Zgierz, Poland - research by SIGO (Network for Genetic Identification of Victims).

PubMed

Jacewicz, Renata; Ossowski, Andrzej; Ławrynowicz, Olgierd; Jędrzejczyk, Maciej; Prośniak, Adam; Bąbol-Pokora, Katarzyna; Diepenbroek, Marta; Szargut, Maria; Zielińska, Grażyna; Berent, Jarosław

2017-01-01

It can be reasonably assumed that remains exhumed in 2012 and 2013 during archaeological explorations conducted in the Lućmierz Forest, an important area on the map of the German Nazi terror in the region of Lodz (Poland), are in fact the remains of a hundred Poles murdered by the Nazis in Zgierz on March 20, 1942. By virtue of a decision of the Polish Institute of National Remembrance's Commission for the Prosecution of Crimes Against the Polish Nation, the verification of this research hypothesis was entrusted to SIGO (Network for Genetic Identification of Victims) Consortium appointed by virtue of an agreement of December 11, 2015. The Consortium is an extension of the PBGOT (Polish Genetic Database of Totalitarianisms Victims). So far, the researchers have retrieved 14 DNA profiles from among the examined remains, including 12 male and 2 female profiles. Furthermore, 12 DNA profiles of the victims' family members have been collected. Due to the fact that next-of-kin relatives of the victims of the Zgierz massacre are of advanced age, it is of key importance to collect genetic material as soon as possible from the other surviving family members, identified on the basis of a list of victims that has been nearly completely compiled by the Polish Institute of National Remembrance (IPN) and is presented in this paper.
FORENSIC DNA BANKING LEGISLATION IN DEVELOPING COUNTRIES: PRIVACY AND CONFIDENTIALITY CONCERNS REGARDING A DRAFT FROM TURKISH LEGISLATION.

PubMed

Ilgili, Önder; Arda, Berna

This paper presents and analyses, in terms of privacy and confidentiality, the Turkish Draft Law on National DNA Database prepared in 2004, and concerning the use of DNA analysis for forensic objectives and identity verification in Turkey. After a short introduction including related concepts, we evaluate the draft law and provide articles about confidentiality. The evaluation reminded us of some important topics at international level for the developing countries. As a result, the need for sophisticated legislation about DNA databases, the need for solutions to issues related to the education of employees, and the technological dependency on other countries emerged as the main challenges in terms of confidentiality for the developing countries. As seen in the Turkish Draft Law on National DNA Database, the protection of the fundamental rights and freedoms requires more care during the legislative efforts.
[Evaluation of mass spectrometry for the identification of clinically interesting yeasts].

PubMed

Galán, Fátima; García-Agudo, Lidia; Guerrero, Inmaculada; Marín, Pilar; García-Tapia, Ana; García-Martos, Pedro; Rodríguez-Iglesias, Manuel

2015-01-01

Identification of yeasts is based on morphological, biochemical and nutritional characteristics, and on molecular methods. Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, a new method for the identification of microorganisms, has proven to be very useful. The aim of this study is to evaluate this new method in the identification of yeasts. A total of 600 strains of yeasts isolated from clinical specimens belonging to 9 genera and 43 species were tested. Identification was made by sequencing of the ITS regions of ribosomal DNA, assimilation of carbon compounds (ID 32C), and mass spectrometry on a Microflex spectrometer (Bruker Daltonics GmbH, Germany). A total of 569 strains (94.8%) were identified to species level by ID 32C, and 580 (96.7%) by MALDI-TOF. Concordance between both methods was observed for 553 strains (92.2%), with 100% in clinically relevant species: C. albicans, C. glabrata, C. parapsilosis, C. tropicalis, and almost 100% in C. krusei. MALDI-TOF identified species requiring molecular methods: Candida dubliniensis, C. nivariensis, C. metapsilosis and C. orthopsilosis. Some irregularities were observed in the identification of arthroconidial yeasts and basidiomycetes. MALDI-TOF is a rapid, effective and economical method, which enables the identification of most clinically important yeasts and the differentiation of closely related species. It would be desirable to include more species in its database to expand its performance. Copyright © 2014 Elsevier España, S.L.U. y Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
Integrating a DNA barcoding project with an ecological survey: a case study on temperate intertidal polychaete communities in Qingdao, China

NASA Astrophysics Data System (ADS)

Zhou, Hong; Zhang, Zhinan; Chen, Haiyan; Sun, Renhua; Wang, Hui; Guo, Lei; Pan, Haijian

2010-07-01

In this study, we integrated a DNA barcoding project with an ecological survey on intertidal polychaete communities and investigated the utility of the CO1 gene sequence as a DNA barcode for the classification of the intertidal polychaetes. Using 16S rDNA as a complementary marker and combining morphological and ecological characterization, some of the dominant and common polychaete species from Chinese coasts were assessed for their taxonomic status. We obtained 22 haplotype gene sequences of 13 taxa, including 10 CO1 sequences and 12 16S rDNA sequences. Based on intra- and inter-specific distances, we built phylogenetic trees using the neighbor-joining method. Our study suggested that the mitochondrial CO1 gene was a valid DNA barcoding marker for species identification in polychaetes, but other genes, such as 16S rDNA, could be used as a complementary genetic marker. For more accurate species identification and effective testing of species hypotheses, DNA barcoding should be combined with morphological, ecological, biogeographical, and phylogenetic information. The application of DNA barcoding and molecular identification in the ecological survey on the intertidal polychaete communities demonstrated the feasibility of integrating DNA taxonomy and ecology.
DNA Barcoding of Recently Diverged Species: Relative Performance of Matching Methods

PubMed Central

van Velzen, Robin; Weitschek, Emanuel; Felici, Giovanni; Bakker, Freek T.

2012-01-01

Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a ‘barcode gap’ and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification. PMID:22272356
DNA barcoding of recently diverged species: relative performance of matching methods.

PubMed

van Velzen, Robin; Weitschek, Emanuel; Felici, Giovanni; Bakker, Freek T

2012-01-01

Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a 'barcode gap' and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification.
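Of the six approaches compared in this abstract, the similarity-based nearest-neighbor method is the simplest to sketch: a query barcode is assigned the species of the reference barcode it differs from least. The reference library and queries below are invented, sequences are assumed to be pre-aligned to equal length, and the tie rule (reporting the query as ambiguous when the closest references belong to different species) is an assumption for illustration, not necessarily the rule used in the study.

```python
def mismatches(a: str, b: str) -> int:
    """Number of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_id(query, references):
    """references: list of (species, sequence) pairs.

    Returns the species of the closest reference, or None when the closest
    references belong to more than one species (ambiguous identification).
    """
    scored = [(mismatches(query, seq), species) for species, seq in references]
    best = min(d for d, _ in scored)
    candidates = {species for d, species in scored if d == best}
    return candidates.pop() if len(candidates) == 1 else None


# Hypothetical reference library, not data from the study.
references = [
    ("sp_A", "ACGTACGT"),
    ("sp_B", "ACGTTCGA"),
    ("sp_B", "ACGTTCGT"),
]

print(nearest_neighbor_id("ACGTACGT", references))  # exact match to sp_A
print(nearest_neighbor_id("ACGTACGA", references))  # equally close to sp_A and sp_B
```

The second query shows why recently diverged species are hard for this kind of method: with few fixed differences between species, a single substitution can leave a query equidistant from two species.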
Forensic Analysis of Human DNA from Samples Contaminated with Bioweapons Agents

DTIC Science & Technology

2011-10-01

Forensic analysis of human DNA from samples contaminated with bioweapons agents. Jason Timbers; Kathryn Wright. Royal Canadian Mounted...Police Forensic Science and Identification Service. Prepared By: Royal Canadian Mounted Police RCMP Forensic Science Identification Services...
Database extraction strategies for low-template evidence.

PubMed

Bleka, Øyvind; Dørum, Guro; Haned, Hinda; Gill, Peter

2014-03-01

Often in forensic cases, the profile of at least one of the contributors to a DNA evidence sample is unknown and a database search is needed to discover possible perpetrators. In this article we consider two types of search strategies to extract suspects from a database using methods based on probability arguments. The performance of the proposed match scores is demonstrated by carrying out a study of each match score relative to the level of allele drop-out in the crime sample, simulating low-template DNA. The efficiency was measured by random man simulation and we compared the performance using the SGM Plus kit and the ESX 17 kit for the Norwegian population, demonstrating that the latter has greatly enhanced power to discover perpetrators of crime in large national DNA databases. The code for the database extraction strategies will be prepared for release in the R-package forensim. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Public participation in genetic databases: crossing the boundaries between biobanks and forensic DNA databases through the principle of solidarity

PubMed Central

Machado, Helena; Silva, Susana

2015-01-01

The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of ‘solidarity’, traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. PMID:26139851
SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

PubMed

Couvin, David; Zozio, Thierry; Rastogi, Nalin

2017-07-01

Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has yielded information on 103,856 MTBC isolates (corresponding to 98,049 clustered strains plus 5,807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.
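A minimal Python sketch of the kind of comparison described above, assuming each spoligotype is stored as a 43-character string of '1' (spacer present) and '0' (spacer absent). The function names, the Hamming-distance criterion and the `max_diff` cutoff are illustrative assumptions; the abstract does not specify the algorithm the web tool actually uses.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the spacer positions at which two 43-spacer patterns differ."""
    if len(a) != 43 or len(b) != 43:
        raise ValueError("spoligotype patterns must describe 43 spacers")
    return sum(x != y for x, y in zip(a, b))


def similar_patterns(query: str, database: dict, max_diff: int = 2) -> list:
    """Return (name, distance) pairs within max_diff spacers of the query, closest first."""
    hits = [(name, hamming_distance(query, pattern)) for name, pattern in database.items()]
    return sorted((h for h in hits if h[1] <= max_diff), key=lambda h: h[1])


def longest_deleted_block(pattern: str) -> int:
    """Length of the longest run of consecutive absent spacers ('0')."""
    return max(len(run) for run in pattern.split("1"))


if __name__ == "__main__":
    # Hypothetical patterns, invented for illustration only.
    db = {"A": "1" * 43, "B": "1" * 40 + "000", "C": "0" + "1" * 42}
    print(similar_patterns("1" * 43, db, max_diff=1))
    print(longest_deleted_block(db["B"]))
```

A distance cutoff is only one possible notion of similarity; a production tool would likely also weigh contiguous deletions differently from scattered single-spacer changes.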
BLAST and FASTA similarity searching for multiple sequence alignment.

PubMed

Pearson, William R

2014-01-01

BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
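The point about expectation values and database size can be illustrated with the standard relation between a bit score S' and the expected number of chance hits, E = m * n * 2^(-S'), where m is the query length and n the total database length. The sketch below is a simplified illustration that ignores the effective-length corrections real programs apply; the function name and the example lengths are assumptions made for illustration.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value for a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    score = 40.0  # the same alignment bit score in both searches
    large_db = expect_value(score, query_len=300, db_len=10**10)
    small_db = expect_value(score, query_len=300, db_len=10**7)
    # The smaller database yields a 1000-fold smaller E-value for the same alignment.
    print(large_db, small_db)
```

Because E scales linearly with database length, the same alignment is more significant in a smaller database, which is consistent with the advice above to search a smaller comprehensive protein set.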
Development of a conceptual integrated traffic safety problem identification database

DOT National Transportation Integrated Search

1999-12-01

The project conceptualized a traffic safety risk management information system and statistical database for improved problem-driver identification, countermeasure development, and resource allocation. The California Department of Motor Vehicles Drive...

9 CFR 55.25 - Animal identification.

Code of Federal Regulations, 2011 CFR

2011-01-01

... Database. The second animal identification must be unique for the individual animal within the herd and also must be linked to that animal and herd in the CWD National Database. (Approved by the Office of...
9 CFR 55.25 - Animal identification.

Code of Federal Regulations, 2012 CFR

2012-01-01

... Database. The second animal identification must be unique for the individual animal within the herd and also must be linked to that animal and herd in the CWD National Database. (Approved by the Office of...
9 CFR 55.25 - Animal identification.

Code of Federal Regulations, 2010 CFR

2010-01-01

... Database. The second animal identification must be unique for the individual animal within the herd and also must be linked to that animal and herd in the CWD National Database. (Approved by the Office of...
The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data.

PubMed

Lammers, Youri; Peelen, Tamara; Vos, Rutger A; Gravendeel, Barbara

2014-02-06

Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild. 
The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker.
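The filtering logic described in this abstract (blacklist, name reconciliation, then matching against the CITES appendices) might be sketched as follows. This is not the pipeline's actual code: the record layout, function name, identity cutoff and all example values are hypothetical, chosen only to show the order of the checks.

```python
def flag_cites_reads(hits, cites_names, blacklist, synonyms, min_identity=97.0):
    """Map each read to the CITES-listed names supported by its accepted hits.

    hits        -- iterable of dicts with keys 'read', 'accession', 'taxon', 'identity'
    cites_names -- set of names taken from the CITES appendices
    blacklist   -- set of reference accessions known to be misannotated
    synonyms    -- dict mapping reference-database names to the names used by CITES
    """
    flagged = {}
    for hit in hits:
        if hit["accession"] in blacklist:
            continue  # guards against false positives from mislabeled references
        if hit["identity"] < min_identity:
            continue  # hypothetical cutoff; weak matches are ignored
        name = synonyms.get(hit["taxon"], hit["taxon"])  # reconcile taxonomies
        if name in cites_names:
            flagged.setdefault(hit["read"], set()).add(name)
    return flagged


if __name__ == "__main__":
    # Placeholder names and accessions, invented for illustration.
    hits = [
        {"read": "r1", "accession": "X1", "taxon": "Taxon oldname", "identity": 99.0},
        {"read": "r2", "accession": "BAD1", "taxon": "Taxon listed", "identity": 100.0},
        {"read": "r3", "accession": "X2", "taxon": "Taxon listed", "identity": 90.0},
    ]
    print(flag_cites_reads(hits, {"Taxon listed"}, {"BAD1"}, {"Taxon oldname": "Taxon listed"}))
```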
The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data

PubMed Central

2014-01-01

Background: Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. Results: The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. Conclusions: The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild.
The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker. PMID:24502833
The changing epitome of species identification – DNA barcoding

PubMed Central

Ajmal Ali, M.; Gyulai, Gábor; Hidvégi, Norbert; Kerti, Balázs; Al Hemaid, Fahad M.A.; Pandey, Arun K.; Lee, Joongku

2014-01-01

Taxonomy (the science of naming and classifying organisms, the original bioinformatics and a basis for all biology) is fundamentally important in ensuring the quality of life of future human generations on Earth. Yet over the past few decades, teaching and research funding in taxonomy have declined because its classical practice often reduced the discipline to a matter of opinion; this gave rise to several problems and challenges, and the taxonomist became an endangered race in the era of genomics. Taxonomy has now become fashionable again thanks to a revolutionary approach called DNA barcoding (a novel technology that provides rapid, accurate, and automated species identification using short orthologous DNA sequences). In DNA barcoding, a complete data set can be obtained from a single specimen irrespective of morphological or life-stage characters. The core idea of DNA barcoding is that highly conserved stretches of DNA, in either coding or non-coding regions, vary only to a very minor degree within a species during evolution. Sequences suggested to be useful in DNA barcoding include cytoplasmic mitochondrial DNA (e.g. cox1) and chloroplast DNA (e.g. rbcL, trnL-F, matK, ndhF, and atpB-rbcL), and nuclear DNA (ITS and housekeeping genes, e.g. gapdh). Plant DNA barcoding is now transforming species identification and thus ultimately helping in the molecularization of taxonomy, a need of the hour. ‘DNA barcodes’ show promise in providing a practical, standardized, species-level identification tool that can be used for biodiversity assessment, life history and ecological studies, forensic analysis, and many more. PMID:24955007
Recovery Based Nanowire Field-Effect Transistor Detection of Pathogenic Avian Influenza DNA

NASA Astrophysics Data System (ADS)

Lin, Chih-Heng; Chu, Chia-Jung; Teng, Kang-Ning; Su, Yi-Jr; Chen, Chii-Dong; Tsai, Li-Chu; Yang, Yuh-Shyong

2012-02-01

Fast and accurate diagnosis is critical in infectious disease surveillance and management. We proposed a DNA recovery system that can easily be adapted to a DNA chip or DNA biosensor for fast identification and confirmation of target DNA. This method was based on the re-hybridization of the DNA target with a recovery DNA to free the DNA probe. A functionalized silicon nanowire field-effect transistor (SiNW FET) was demonstrated to monitor such specific DNA-DNA interaction using hemagglutinin 1 (H1) DNA from a highly pathogenic strain of avian influenza (AI) virus as the target. Specific electric changes were observed in real time for AI virus DNA sensing and device recovery when the nanowire surface of the SiNW FET was modified with a complementary capture DNA probe. The recovery-based SiNW FET biosensor can be further developed for fast identification and further confirmation of a variety of influenza virus strains and other infectious diseases.
Ongoing revolution in bacteriology: routine identification of bacteria by matrix-assisted laser desorption ionization time-of-flight mass spectrometry.

PubMed

Seng, Piseth; Drancourt, Michel; Gouriet, Frédérique; La Scola, Bernard; Fournier, Pierre-Edouard; Rolain, Jean Marc; Raoult, Didier

2009-08-15

Matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry accurately identifies both selected bacteria and bacteria in select clinical situations. It has not been evaluated for routine use in the clinic. We prospectively analyzed routine MALDI-TOF mass spectrometry identification in parallel with conventional phenotypic identification of bacteria regardless of phylum or source of isolation. Discrepancies were resolved by 16S ribosomal RNA and rpoB gene sequence-based molecular identification. Colonies (4 spots per isolate directly deposited on the MALDI-TOF plate) were analyzed using an Autoflex II Bruker Daltonik mass spectrometer. Peptidic spectra were compared with the Bruker BioTyper database, version 2.0, and the identification score was noted. Delays and costs of identification were measured. Of 1660 bacterial isolates analyzed, 95.4% were correctly identified by MALDI-TOF mass spectrometry; 84.1% were identified at the species level, and 11.3% were identified at the genus level. In most cases, absence of identification (2.8% of isolates) and erroneous identification (1.7% of isolates) were due to improper database entries. Accurate MALDI-TOF mass spectrometry identification was significantly correlated with having 10 reference spectra in the database (P=.01). The mean time required for MALDI-TOF mass spectrometry identification of 1 isolate was 6 minutes for an estimated 22%-32% cost of current methods of identification. MALDI-TOF mass spectrometry is a cost-effective, accurate method for routine identification of bacterial isolates in <1 h using a database comprising ≥10 reference spectra per bacterial species and a 1.9 identification score (Bruker system). It may replace Gram staining and biochemical identification in the near future.
Alignment of high-throughput sequencing data inside in-memory databases.

PubMed

Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias

2014-01-01

In times of high-throughput DNA sequencing techniques, high-performance analysis of DNA sequences is of great importance. Computer-supported DNA analysis is still a time-consuming task. In this paper we explore the potential of a new in-memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To ensure that the results are comparable, MySQL was also run in memory, utilizing its integrated memory engine for database table creation. We implemented stored procedures for exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, the performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which means that there is high potential within the new in-memory concepts, leading to further developments of DNA analysis procedures in the future.
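For orientation, the exact-search step used by Burrows-Wheeler-based aligners can be sketched in plain Python as a backward search over an FM-index. This is an illustrative toy, not the stored procedures from the paper and not BWA's implementation: the naive suffix sort and the full occurrence table are assumptions that only suit very short references.

```python
def build_fm_index(text: str):
    """Build a toy FM-index: suffix array, first-column offsets, occurrence table."""
    text += "$"  # sentinel, lexicographically smaller than A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each sorted suffix
    alphabet = sorted(set(text))
    first, total = {}, 0
    for c in alphabet:  # first[c] = number of characters in text smaller than c
        first[c] = total
        total += text.count(c)
    occ = {c: [0] for c in alphabet}  # occ[c][i] = occurrences of c in bwt[:i]
    for ch in bwt:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return sa, first, occ


def exact_search(read: str, index) -> list:
    """Return sorted start positions of exact matches of read in the reference."""
    sa, first, occ = index
    lo, hi = 0, len(sa)  # half-open interval of matching suffix-array rows
    for c in reversed(read):  # extend the match one character to the left
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    index = build_fm_index("GATTACATTA")
    print(exact_search("TTA", index))
```

Each read character costs two table lookups regardless of reference length, which is why this search maps naturally onto indexed table access in a database.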
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.).

PubMed

Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A

2015-10-26

Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
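Scanning assembled sequences for SSRs, as described above, is commonly done with a pattern search for short tandemly repeated motifs. The sketch below is one simple way to do it with a regular expression; the motif lengths and minimum repeat count are hypothetical defaults, since the abstract does not state the filtering criteria that were used.

```python
import re


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 6, min_repeats: int = 4) -> list:
    """Return (start, motif, repeat_count) for each tandem repeat found in seq."""
    # A lazily matched motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        results.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return results


if __name__ == "__main__":
    est = "TTGC" + "AG" * 6 + "CCTA"  # invented sequence with one (AG)6 repeat
    print(find_ssrs(est))
```

Primer design would then use the sequence flanking each reported start and end coordinate.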
Barcoding of fresh water fishes from Pakistan.

PubMed

Karim, Asma; Iqbal, Asad; Akhtar, Rehan; Rizwan, Muhammad; Amar, Ali; Qamar, Usman; Jahan, Shah

2016-07-01

DNA barcoding is a taxonomic method that uses short genetic markers in an organism's mitochondrial DNA (mtDNA) to identify particular species. It uses sequence diversity in a 658-base pair fragment near the 5' end of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene as a tool for species identification. DNA barcoding is a more accurate and reliable method than morphological identification, and it is equally useful in juvenile and adult stages of fishes. The present study was conducted to genetically identify three farmed fish species of Pakistan (Cyprinus carpio, Cirrhinus mrigala, and Ctenopharyngodon idella), all of which belong to the family Cyprinidae. The CO1 gene was amplified, and the PCR products were sequenced and analyzed with bioinformatics software. Conspecific, congeneric, and confamilial K2P nucleotide divergence was estimated. From these findings, it was concluded that the CO1 gene sequence may serve as a milestone for the identification of related species at the molecular level.
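The K2P (Kimura two-parameter) divergence mentioned above is computed from the proportion of transitions P and transversions Q between two aligned sequences as d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)). A minimal sketch follows; the function name and the choice to skip gaps and ambiguous bases are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))  # one transition in ten sites
```

Averaging this distance over pairs within species, within genera and within families gives the conspecific, congeneric and confamilial divergences reported in barcoding studies.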
Molecular-based rapid inventories of sympatric diversity: a comparison of DNA barcode clustering methods applied to geography-based vs clade-based sampling of amphibians.

PubMed

Paz, Andrea; Crawford, Andrew J

2012-11-01

Molecular markers offer a universal source of data for quantifying biodiversity. DNA barcoding uses a standardized genetic marker and a curated reference database to identify known species and to reveal cryptic diversity within well-sampled clades. Rapid biological inventories, e.g. rapid assessment programs (RAPs), unlike most barcoding campaigns, are focused on particular geographic localities rather than on clades. Because of the potentially sparse phylogenetic sampling, the addition of DNA barcoding to RAPs may present a greater challenge for the identification of named species or for revealing cryptic diversity. In this article we evaluate the use of DNA barcoding for quantifying lineage diversity within a single sampling site as compared to clade-based sampling, and present examples from amphibians. We compared algorithms for identifying DNA barcode clusters (e.g. species, cryptic species or Evolutionary Significant Units) using previously published DNA barcode data obtained from geography-based sampling at a site in Central Panama, and from clade-based sampling in Madagascar. We found that clustering algorithms based on genetic distance performed similarly on sympatric as well as clade-based barcode data, while a promising coalescent-based method performed poorly on sympatric data. The various clustering algorithms were also compared in terms of speed and software implementation. Although each method has its shortcomings in certain contexts, we recommend the use of the ABGD method, which not only performs fairly well under either sampling method, but does so in a few seconds and with a user-friendly Web interface.
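To make the idea of distance-based barcode clustering concrete, here is a deliberately simple sketch: single-linkage clustering of aligned barcodes under a fixed p-distance threshold. It is not ABGD, which infers the barcode gap from the data rather than taking a fixed cutoff; the 3% threshold and all names here are illustrative assumptions.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_barcodes(seqs: dict, threshold: float = 0.03) -> list:
    """Group sequence names so that any pair within threshold shares a cluster."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= threshold:
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    # Invented toy alignment of 20 sites.
    toy = {"s1": "A" * 20, "s2": "A" * 19 + "C", "s3": "C" * 20}
    print(cluster_barcodes(toy, threshold=0.05))
```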
Pay Attention to the Overlooked Cryptic Diversity in Existing Barcoding Data: the Case of Mollusca with Character-Based DNA Barcoding.

PubMed

Zou, Shanmei; Li, Qi

2016-06-01

With the global biodiversity crisis, DNA barcoding aims at fast species identification and the revelation of cryptic species diversity. For more than 10 years, large amounts of DNA barcode data have been accumulating in publicly available databases, most of which were analyzed by distance or tree-building methods whose suitability has often been disputed, especially for revealing cryptic species. In this context, overlooked cryptic diversity may exist in the available barcoding data. Character-based DNA barcoding, however, has a good chance of detecting this overlooked cryptic diversity. In this study, marine mollusks served as an ideal case for detecting overlooked potential cryptic species in existing cytochrome c oxidase I (COI) sequences with character-based DNA barcoding. A total of 1081 COI sequences of mollusks, belonging to 176 species of 25 families of Gastropoda, Cephalopoda, and Lamellibranchia, were subjected to character analysis. As a whole, the character-based barcoding results were consistent with previous distance and tree-building analyses for species discrimination. More importantly, quite a number of the species analyzed were divided into distinct clades with unique diagnostic characters. Under the character-based barcoding concept of cryptic species, these species divided into separate taxonomic groups might be potential cryptic species. The detection of this overlooked potential cryptic diversity shows that character-based barcoding has advantages for revealing cryptic biodiversity. With the development of DNA barcoding, making the best use of barcoding data is worthy of our attention for species conservation.
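The notion of a diagnostic character can be sketched as follows: an alignment position is diagnostic for a group if every sequence in the group carries the same base there and no sequence outside the group does. The code below is an illustrative simplification (pure, single-site characters only); it is not the software used in the study, and the group names and sequences are invented.

```python
def diagnostic_characters(groups: dict) -> dict:
    """Return {group: [(position, base), ...]} for fixed, group-exclusive sites.

    groups maps a group name to a list of aligned sequences of equal length.
    """
    length = len(next(iter(groups.values()))[0])
    result = {}
    for name, seqs in groups.items():
        others = [s for other, ss in groups.items() if other != name for s in ss]
        diagnostic = []
        for pos in range(length):
            states = {s[pos] for s in seqs}
            if len(states) != 1:
                continue  # variable within the group, so not diagnostic
            base = next(iter(states))
            if all(s[pos] != base for s in others):
                diagnostic.append((pos, base))
        result[name] = diagnostic
    return result


if __name__ == "__main__":
    toy = {"sp1": ["ACGT", "ACGA"], "sp2": ["TCGT", "TCGC"]}
    print(diagnostic_characters(toy))
```

A named species whose members split into subgroups that each have their own diagnostic characters would be a candidate for hidden diversity in the sense described above.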
Sugarcane giant borer transcriptome analysis and identification of genes related to digestion.

PubMed

Fonseca, Fernando Campos de Assis; Firmino, Alexandre Augusto Pereira; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; de Sousa Júnior, José Dijair Antonino; Silva-Junior, Orzenil Bonfim; Togawa, Roberto Coiti; Pappas, Georgios Joannis; de Góis, Luiz Avelar Brandão; da Silva, Maria Cristina Mattar; Grossi-de-Sá, Maria Fátima

2015-01-01

Sugarcane is a widely cultivated plant that serves primarily as a source of sugar and ethanol. Its annual yield can be significantly reduced by the action of several insect pests including the sugarcane giant borer (Telchin licus licus), a lepidopteran with a long life cycle that pesticide-based control efforts have failed to manage efficiently. Despite its economic relevance, only a few DNA sequences are available for this species in GenBank. Pyrosequencing technology was used to investigate the transcriptome of several developmental stages of the insect. To maximize transcript diversity, a pool of total RNA was extracted from whole body insects and used to construct a normalized cDNA database. Sequencing produced over 650,000 reads, which were de novo assembled to generate a reference library of 23,824 contigs. After quality score and annotation, 43% of the contigs had at least one BLAST hit against the NCBI non-redundant database, and 40% showed similarities with the lepidopteran Bombyx mori. In a further analysis, we conducted a comparison with Manduca sexta midgut sequences to identify transcripts of genes involved in digestion. Of these transcripts, many presented an expansion or depletion in gene number, compared to the B. mori genome. From the sugarcane giant borer (SGB) transcriptome, a number of aminopeptidase N (APN) cDNAs were characterized based on homology to those reported as Cry toxin receptors. This is the first report that provides a large-scale EST database for the species. Transcriptome analysis will certainly be useful to identify novel developmental genes, to better understand the insect's biology and to guide the development of new strategies for insect-pest control.
A catalog for the transcripts from the venomous structures of the caterpillar Lonomia obliqua: identification of the proteins potentially involved in the coagulation disorder and hemorrhagic syndrome

PubMed Central

Veiga, Ana B. G.; Ribeiro, José M. C.; Guimarães, Jorge A.; Francischetti, Ivo M.B.

2010-01-01

Accidents with the caterpillar Lonomia obliqua are often associated with a coagulation disorder and hemorrhagic syndrome in humans. In the present study, we have constructed cDNA libraries from two venomous structures of the caterpillar, namely the tegument and the bristle. High-throughput sequencing and bioinformatics analyses were performed in parallel. Over one thousand cDNAs were obtained and clustered to produce a database of 538 contigs and singletons (clusters) for the tegument library and 368 for the bristle library. We have thus identified dozens of full-length cDNAs coding for proteins with sequence homology to snake venom prothrombin activator, trypsin-like enzymes, blood coagulation factors and prophenoloxidase cascade activators. We also report cDNA coding for cysteine proteases, Group III phospholipase A2, C-type lectins, lipocalins, in addition to protease inhibitors including serpins, Kazal-type inhibitors, cystatins and trypsin inhibitor-like molecules. Antibacterial proteins and housekeeping genes are also described. A significant number of sequences were devoid of database matches, suggesting that their biologic function remains to be defined. We also report the N-terminus of the most abundant proteins present in the bristle, tegument, hemolymph, and "cryosecretion". Thus, we have created a catalog that contains the predicted molecular weight, isoelectric point, accession number, and putative function for each selected molecule from the venomous structures of L. obliqua. The role of these molecules in the coagulation disorder and hemorrhagic syndrome caused by envenomation with this caterpillar is discussed. All sequence information and the Supplemental Data, including Figures and Tables with hyperlinks to FASTA-formatted files for each contig and the best match to the Databases, are available at http://www.ncbi.nih.gov/projects/omes. PMID:16023793
Sugarcane Giant Borer Transcriptome Analysis and Identification of Genes Related to Digestion

PubMed Central

de Assis Fonseca, Fernando Campos; Firmino, Alexandre Augusto Pereira; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; de Sousa Júnior, José Dijair Antonino; Silva-Junior, Orzenil Bonfim; Togawa, Roberto Coiti; Pappas, Georgios Joannis; de Góis, Luiz Avelar Brandão; da Silva, Maria Cristina Mattar; Grossi-de-Sá, Maria Fátima

2015-01-01

Sugarcane is a widely cultivated plant that serves primarily as a source of sugar and ethanol. Its annual yield can be significantly reduced by the action of several insect pests including the sugarcane giant borer (Telchin licus licus), a lepidopteran with a long life cycle that pesticide-based control efforts have failed to manage efficiently. Despite its economic relevance, only a few DNA sequences are available for this species in GenBank. Pyrosequencing technology was used to investigate the transcriptome of several developmental stages of the insect. To maximize transcript diversity, a pool of total RNA was extracted from whole body insects and used to construct a normalized cDNA database. Sequencing produced over 650,000 reads, which were de novo assembled to generate a reference library of 23,824 contigs. After quality score and annotation, 43% of the contigs had at least one BLAST hit against the NCBI non-redundant database, and 40% showed similarities with the lepidopteran Bombyx mori. In a further analysis, we conducted a comparison with Manduca sexta midgut sequences to identify transcripts of genes involved in digestion. Of these transcripts, many presented an expansion or depletion in gene number, compared to the B. mori genome. From the sugarcane giant borer (SGB) transcriptome, a number of aminopeptidase N (APN) cDNAs were characterized based on homology to those reported as Cry toxin receptors. This is the first report that provides a large-scale EST database for the species. Transcriptome analysis will certainly be useful to identify novel developmental genes, to better understand the insect’s biology and to guide the development of new strategies for insect-pest control. PMID:25706301
Short tandem repeat profiling: part of an overall strategy for reducing the frequency of cell misidentification.

PubMed

Nims, Raymond W; Sykes, Greg; Cottrill, Karin; Ikonomi, Pranvera; Elmore, Eugene

2010-12-01

The role of cell authentication in biomedical science has received considerable attention, especially within the past decade. This quality control attribute is now beginning to be given the emphasis it deserves by granting agencies and by scientific journals. Short tandem repeat (STR) profiling, one of a few DNA profiling technologies now available, is being proposed for routine identification (authentication) of human cell lines, stem cells, and tissues. The advantage of this technique over methods such as isoenzyme analysis, karyotyping, human leukocyte antigen typing, etc., is that STR profiling can establish identity to the individual level, provided that the appropriate number and types of loci are evaluated. To best employ this technology, a standardized protocol and a data-driven, quality-controlled, and publicly searchable database will be necessary. This public STR database (currently under development) will enable investigators to rapidly authenticate human-based cultures to the individual from whom the cells were sourced. Use of similar approaches for non-human animal cells will require developing other suitable loci sets. While implementing STR analysis on a more routine basis should significantly reduce the frequency of cell misidentification, additional technologies may be needed as part of an overall authentication paradigm. For instance, isoenzyme analysis, PCR-based DNA amplification, and sequence-based barcoding methods enable rapid confirmation of a cell line's species of origin while screening against cross-contaminations, especially when the cells present are not recognized by the species-specific STR method. Karyotyping may also be needed as a supporting tool during establishment of an STR database. Finally, good cell culture practices must always remain a major component of any effort to reduce the frequency of cell misidentification.
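Comparing a query STR profile against a reference entry is usually reduced to an allele-sharing score. The sketch below shows two scores that are often cited in the cell-line authentication literature, commonly attributed to Tanabe (2 x shared / (query + reference alleles)) and Masters (shared / query alleles). The abstract itself does not prescribe a scoring rule, so treat the formulas, function name, restriction to shared loci and the example alleles as assumptions for illustration.

```python
def str_match_scores(query: dict, reference: dict) -> dict:
    """Allele-sharing scores between two STR profiles ({locus: set of alleles}).

    Only loci typed in both profiles are compared.
    """
    loci = set(query) & set(reference)
    if not loci:
        raise ValueError("profiles share no loci")
    shared = sum(len(query[l] & reference[l]) for l in loci)
    n_query = sum(len(query[l]) for l in loci)
    n_reference = sum(len(reference[l]) for l in loci)
    return {
        "tanabe": 2 * shared / (n_query + n_reference),
        "masters": shared / n_query,
    }


if __name__ == "__main__":
    # Locus names are real STR loci; the allele calls are invented.
    query = {"TH01": {"6", "9.3"}, "TPOX": {"8"}}
    reference = {"TH01": {"6", "7"}, "TPOX": {"8"}}
    print(str_match_scores(query, reference))
```

Any acceptance threshold applied to such a score would have to come from the standardized protocol the abstract calls for; none is assumed here.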
DNA analysis of hair and scat collected along snow tracks to document the presence of Canada Lynx.

Treesearch

Kevin S. McKelvey; Jeffrey von Kienast; Keith B. Aubry; Gary M. Koehler; Benjamin T. Maletzke; John R. Squires; Edward L. Lindquist; Steve Loch; Michael K. Schwartz

2006-01-01

Snow tracking is often used to inventory carnivore communities, but species identification using this method can produce ambiguous and misleading results. DNA can be extracted from hair and scat samples collected from tracks made in snow. Using DNA analysis could allow positive track identification across a broad range of snow conditions, thus increasing survey...
Unique identification code for medical fundus images using blood vessel pattern for tele-ophthalmology applications.

PubMed

Singh, Anushikha; Dutta, Malay Kishore; Sharma, Dilip Kumar

2016-10-01

Identification of fundus images during transmission and storage in databases is an important issue for tele-ophthalmology applications in the modern era. The proposed work presents a novel, accurate method for generating a unique identification code that identifies fundus images for tele-ophthalmology applications and storage in databases. Unlike existing steganography and watermarking methods, this method does not tamper with the medical image: nothing is embedded, and no medical information is lost. The segmented blood vessel pattern near the optic disc is strategically combined with the patient ID to generate a unique identification code for each digital fundus image. The proposed method of medical image identification is tested on the publicly available DRIVE and MESSIDOR databases of fundus images, and the results are encouraging. Experimental results indicate the uniqueness of the identification code and lossless recovery of patient identity from the code for integrity verification of fundus images. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
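The general idea of combining a vessel pattern with a patient ID, without embedding anything in the image, can be sketched as follows. This is not the authors' algorithm, which the abstract does not describe. It is a hypothetical scheme that assumes the segmented vessel pattern is already available as a byte string (`vessel_bits`) and XORs the patient ID with a SHA-256 keystream derived from that pattern. All function names are invented for illustration.

```python
import hashlib


def _keystream(vessel_bits: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the vessel pattern (assumed scheme)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(vessel_bits + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def make_code(vessel_bits: bytes, patient_id: str) -> str:
    """Combine the vessel pattern and patient ID into a hex identification code.

    The image itself is never modified; the code is stored alongside it.
    """
    pid = patient_id.encode("utf-8")
    key = _keystream(vessel_bits, len(pid))
    return bytes(a ^ b for a, b in zip(pid, key)).hex()


def recover_patient_id(vessel_bits: bytes, code: str) -> str:
    """Recover the patient ID from the code using the image's vessel pattern.

    If the vessel pattern differs from the one used to build the code,
    the recovered text will not match the original ID.
    """
    raw = bytes.fromhex(code)
    key = _keystream(vessel_bits, len(raw))
    return bytes(a ^ b for a, b in zip(raw, key)).decode("utf-8", errors="replace")


if __name__ == "__main__":
    pattern = bytes([0, 1, 1, 0, 1, 0, 0, 1])  # stand-in for a segmented vessel mask
    code = make_code(pattern, "PT-0042")       # hypothetical patient ID
    print(code, recover_patient_id(pattern, code))
```

In this sketch, integrity verification amounts to re-segmenting the vessels of the received image and checking that the recovered ID matches the expected one. A real system would also need the segmentation to be reproducible bit for bit, which the sketch simply assumes.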
DNA barcoding of Arctic Ocean holozooplankton for species identification and recognition

NASA Astrophysics Data System (ADS)

Bucklin, Ann; Hopcroft, Russell R.; Kosobokova, Ksenia N.; Nigro, Lisa M.; Ortman, Brian D.; Jennings, Robert M.; Sweetman, Christopher J.

2010-01-01

Zooplankton species diversity and distribution are important measures of environmental change in the Arctic Ocean, and may serve as 'rapid-responders' to climate-induced changes in this fragile ecosystem. The scarcity of taxonomists hampers detailed and up-to-date monitoring of these patterns for the rarer and more problematic species. DNA barcodes (short DNA sequences for species recognition and discovery) provide an alternative approach to accurate identification of known species, and can speed routine analysis of zooplankton samples. During 2004-2008, zooplankton samples were collected during cruises to the central Arctic Ocean and Chukchi Sea. A ~700 base-pair region of the mitochondrial cytochrome oxidase I (mtCOI) gene was amplified and sequenced for 82 identified specimens of 41 species, including cnidarians (six hydrozoans, one scyphozoan), arthropod crustaceans (five amphipods, 24 copepods, one decapod, and one euphausiid); two chaetognaths; and one nemertean. Phylogenetic analysis used the Neighbor-Joining algorithm with Kimura-2-Parameter (K-2-P) distances, with 1000-fold bootstrapping. K-2-P genetic distances between individuals of the same species ranged from 0.0 to 0.2; genetic distances between species ranged widely from 0.1 to 0.7. The mtCOI gene tree showed monophyly (at 100% bootstrap value) for each of the 26 species for which more than one individual was analyzed. Of seven genera for which more than one species was analyzed, four were shown to be monophyletic; three genera were not resolved. At higher taxonomic levels, only the crustacean order Copepoda was resolved, with a bootstrap value of 83%. The mtCOI barcodes accurately discriminated and identified known species of 10 taxonomic groups of Arctic Ocean holozooplankton. A comprehensive DNA barcode database for the estimated 300 described species of Arctic holozooplankton will allow rapid assessment of species diversity and distribution in this climate-vulnerable ocean ecosystem.
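For readers unfamiliar with the K-2-P distance mentioned above, a minimal Python sketch of the standard Kimura two-parameter formula is given here: d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions among compared sites. The function name and the choice to skip gaps and ambiguous bases are assumptions made for illustration, not details taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous bases (an assumption of this sketch).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines;
        # any other substitution is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term_1 = 1 - 2 * p - q
    term_2 = 1 - 2 * q
    if term_1 <= 0 or term_2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term_1 * math.sqrt(term_2))


if __name__ == "__main__":
    # One transition (A<->G) in ten compared sites.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

A matrix of such pairwise distances is the usual input to a Neighbor-Joining tree builder; the tree construction and bootstrapping steps are not shown.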
