Novel primers for complete mitochondrial cytochrome b gene sequencing in mammals
Naidu, Ashwin; Fitak, Robert R.; Munguia-Vega, Adrian; Culver, Melanie
2011-01-01
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.
ERIC Educational Resources Information Center
Harzbecker, Joseph, Jr.
1993-01-01
Describes the National Institutes of Health's GenBank DNA sequence database and how it can be accessed through the Internet. A real reference question, which was answered successfully using the database, is reproduced to illustrate and elaborate on the potential of the Internet for information retrieval. (10 references) (KRN)
Balakirev, Evgeniy S; Saveliev, Pavel A; Ayala, Francisco J
2017-01-01
The complete mitochondrial (mt) genome is sequenced in 2 individuals of the Cherskii’s sculpin Cottus czerskii. A surprisingly high level of sequence divergence (10.3%) has been detected between the 2 genomes of C czerskii studied here and the GenBank mt genome of C czerskii (KJ956027). At the same time, a surprisingly low level of divergence (1.4%) has been detected between the GenBank C czerskii (KJ956027) and the Amur sculpin Cottus szanaga (KX762049, KX762050). We argue that the observed discrepancies are due to incorrect taxonomic identification so that the GenBank accession number KJ956027 represents actually the mt genome of C szanaga erroneously identified as C czerskii. Our results are of consequence concerning the GenBank database quality, highlighting the potential negative consequences of entry errors, which once they are introduced tend to be propagated among databases and subsequent publications. We illustrate the premise with the data on recombinant mt genome of the Siberian taimen Hucho taimen (NCBI Reference Sequence Database NC_016426.1; GenBank accession number HQ897271.1), bearing 2 introgressed fragments (≈0.9 kb [kilobase]) from 2 lenok subspecies, Brachymystax lenok and Brachymystax lenok tsinlingensis, submitted to GenBank on June 12, 2011. Since the time of submission, the H taimen recombinant mt genome leading to incorrect phylogenetic inferences was propagated in multiple subsequent publications despite the fact that nonrecombinant H taimen genomes were also available (submitted to GenBank on August 2, 2014; KJ711549, KJ711550). Other examples of recombinant sequences persisting in GenBank are also considered. A GenBank Entry Error Depositary is urgently needed to monitor and avoid a progressive accumulation of wrong biological information. PMID:28890653
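The divergence figures quoted above (10.3% and 1.4%) are pairwise distances between aligned sequences. The abstract does not say which distance measure was used, so the following is only a minimal sketch of the simplest candidate, an uncorrected p-distance. The function name and the toy fragments are illustrative assumptions, not data from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Toy aligned fragments (hypothetical, not real Cottus sequences).
ref = "ATGGCCAACCTTCGAAAAAC"
qry = "ATGGCTAACCTACGAAA-AC"
print(f"{p_distance(ref, qry):.3f}")
```

A distance between a new sequence and its nominal conspecific reference that far exceeds the distance to a different species is the kind of signal the authors interpret as a misidentified database entry.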
Jose, Jency; Jalali, S K; Shivalingaswamy, T M; Kumar, N K Krishna; Bhatnagar, R; Bandyopadhyay, A
2013-06-01
A PCR-based method was developed to detect viral DNA of the nucleopolyhedroviruses (NPVs) of three lepidopterans, Spodoptera litura, Amsacta albistriga and Helicoverpa armigera, by targeting the late expression factor-8 (lef-8) gene of the three NPVs with specific primers. Amplicons of 689, 699 and 665 bp, respectively, were obtained; the nucleotide sequences were submitted to GenBank and accession numbers were obtained. The lef-8 sequences of S. litura NPV and H. armigera NPV matched their respective references in the GenBank database, thereby confirming their identity, whereas the sequence of A. albistriga NPV was the first such sequence submitted to the GenBank database. Sequence similarity analysis of the three lef-8 genes sequenced in the present study revealed no significant similarity between them, although A. albistriga NPV and S. litura NPV were found to be closely related. CLUSTAL alignment of the sequences revealed general relatedness among the NPV lef-8 genes. The study confirmed that the lef-8 gene can be used for quick and correct discriminatory identification of insect viruses.
Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S
2018-01-01
Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publicly available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank.
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection. PMID:29564396
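The redundancy-reduction step (clustering at 98% identity) can be illustrated with a greedy incremental scheme of the kind CD-HIT-EST is generally described as using: sequences are visited longest first, and each one either joins the first representative it matches at or above the threshold or becomes a new representative. The sketch below is an assumption-laden toy, not the RVDB pipeline. In particular, `identity` is a hypothetical stand-in that compares ungapped positions from the sequence start, whereas the real tool uses word filtering and alignment.

```python
def identity(a: str, b: str) -> float:
    """Crude stand-in for alignment identity (assumption): fraction of
    matching positions over the shorter sequence, with no gaps allowed."""
    shorter = min(len(a), len(b))
    matches = sum(x == y for x, y in zip(a, b))  # zip stops at the shorter one
    return matches / shorter


def greedy_cluster(seqs: dict[str, str], threshold: float = 0.98) -> dict[str, list[str]]:
    """Map each representative name to the names of its cluster members."""
    clusters: dict[str, list[str]] = {}
    # Longest sequences are considered first and become representatives.
    for name in sorted(seqs, key=lambda n: len(seqs[n]), reverse=True):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]  # no representative was close enough
    return clusters


# Toy sequences (hypothetical): 'near' differs from 'base' at one site,
# 'far' at four sites.
base = "ACGT" * 25
toy = {
    "base": base,
    "near": "T" + base[1:99],
    "far": "TTTTT" + base[5:98],
}
print(greedy_cluster(toy))
```

Only the representatives (the dictionary keys) would be kept in the reduced database; the members are the entries removed as redundant.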
Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.
2007-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage (www.ncbi.nlm.nih.gov). PMID:17202161
Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.
2008-01-01
GenBank (R) is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov. PMID:18073190
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2005-01-01
GenBank is a comprehensive database that contains publicly available DNA sequences for more than 165,000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in the UK and the DNA Data Bank of Japan helps to ensure worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Wheeler, David L
2006-01-01
GenBank (R) is a comprehensive database that contains publicly available DNA sequences for more than 205 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the Web-based BankIt or standalone Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at www.ncbi.nlm.nih.gov.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2010-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.
Sonet, Gontran; Jordaens, Kurt; Braet, Yves; Bourguignon, Luc; Dupont, Eréna; Backeljau, Thierry; De Meyer, Marc; Desmyter, Stijn
2013-01-01
Fly larvae living on corpses can be used to estimate post-mortem intervals. The identification of these flies is decisive in forensic casework and can be facilitated by using DNA barcodes provided that a representative and comprehensive reference library of DNA barcodes is available. We constructed a local (Belgium and France) reference library of 85 sequences of the COI DNA barcode fragment (mitochondrial cytochrome c oxidase subunit I gene), from 16 fly species of forensic interest (Calliphoridae, Muscidae, Fanniidae). This library was then used to evaluate the ability of two public libraries (GenBank and the Barcode of Life Data Systems – BOLD) to identify specimens from Belgian and French forensic cases. The public libraries indeed allow a correct identification of most specimens. Yet, some of the identifications remain ambiguous and some forensically important fly species are not, or insufficiently, represented in the reference libraries. Several search options offered by GenBank and BOLD can be used to further improve the identifications obtained from both libraries using DNA barcodes. PMID:24453564
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2009-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank(R) staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
Mining metadata from unidentified ITS sequences in GenBank: A case study in Inocybe (Basidiomycota)
2008-01-01
Background The lack of reference sequences from well-identified mycorrhizal fungi often poses a challenge to the inference of taxonomic affiliation of sequences from environmental samples, and many environmental sequences are thus left unidentified. Such unidentified sequences belonging to the widely distributed ectomycorrhizal fungal genus Inocybe (Basidiomycota) were retrieved from GenBank and divided into species that were identified in a phylogenetic context using a reference dataset from an ongoing study of the genus. The sequence metadata of the unidentified Inocybe sequences stored in GenBank, as well as data from the corresponding original papers, were compiled and used to explore the ecology and distribution of the genus. In addition, the relative occurrence of Inocybe was contrasted to that of other mycorrhizal genera. Results Most species of Inocybe were found to have less than 3% intraspecific variability in the ITS2 region of the nuclear ribosomal DNA. This cut-off value was used jointly with phylogenetic analysis to delimit and identify unidentified Inocybe sequences to species level. A total of 177 unidentified Inocybe ITS sequences corresponding to 98 species were recovered, 32% of which were successfully identified to species level in this study. These sequences account for an unexpectedly large proportion of the publicly available unidentified fungal ITS sequences when compared with other mycorrhizal genera. Eight Inocybe species were reported from multiple hosts and some even from hosts forming arbutoid or orchid mycorrhizae. Furthermore, Inocybe sequences have been reported from four continents and in climate zones ranging from cold temperate to equatorial climate. Out of the 19 species found in more than one study, six were found in both Europe and North America and one was found in both Europe and Japan, indicating that at least many north temperate species have a wide distribution. 
Conclusion Although DNA-based species identification and circumscription are associated with practical and conceptual difficulties, they also offer new possibilities and avenues for research. Metadata assembly holds great potential to synthesize valuable information from community studies for use in a species and taxonomy-oriented framework. PMID:18282272
Pruitt, Kim D.; Tatusova, Tatiana; Maglott, Donna R.
2005-01-01
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff. PMID:15608248
USDA-ARS's Scientific Manuscript database
32 reference transcriptome sequences described herein are filed with the National Center for Biotechnology Information (NCBI), GenBank Bioproject PRJNA236444. Transcriptome Shotgun Assembly (TSA) will also be submitted when upload instructions are received from gb-admin....
Wheeler, David
2007-01-01
GenBank(R) is a comprehensive database of publicly available DNA sequences for more than 205,000 named organisms and for more than 60,000 within the embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Daily data exchange with the European Molecular Biology Laboratory (EMBL) in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases with taxonomy, genome, mapping, protein structure, and domain information and the biomedical journal literature through PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available through FTP. GenBank usage scenarios ranging from local analyses of the data available through FTP to online analyses supported by the NCBI Web-based tools are discussed. To access GenBank and its related retrieval and analysis services, go to the NCBI Homepage at http://www.ncbi.nlm.nih.gov.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2011-01-01
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
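The GenBank abstracts above describe interactive access through Entrez and bulk access by FTP. For scripted retrieval of a single record, one option is NCBI's E-utilities interface to Entrez, which the abstracts do not mention and which is used here as an assumption about how a reader might automate access. The sketch builds an `efetch` request for a nucleotide accession; the helper names are hypothetical, and the accession is one cited elsewhere in this listing.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession: str, rettype: str = "fasta") -> str:
    """Build an efetch URL for one nucleotide record.

    rettype 'fasta' requests FASTA text; 'gb' requests the GenBank flat file.
    """
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_record(accession: str, rettype: str = "fasta") -> str:
    """Download the record as text (needs network access; not called below)."""
    with urlopen(efetch_url(accession, rettype), timeout=30) as response:
        return response.read().decode("utf-8")


print(efetch_url("KJ956027"))
```

For large batches, the FTP releases mentioned in the abstracts are likely the better fit, since per-record requests are subject to NCBI's usage limits.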
Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
2002-12-19
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
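The translation check described above (extract the open reading frame, then verify that it encodes the annotated protein) can be sketched as follows. ORFer itself is a Java program whose internals are not given in the abstract, so this Python version is only an illustration: the function names are hypothetical, and it assumes the standard genetic code, which would be wrong for mitochondrial or other non-standard entries.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA open reading frame; unknown codons become 'X'."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(orf[i:i + 3], "X") for i in range(0, len(orf), 3)
    )


def orf_matches_protein(orf: str, protein: str) -> bool:
    """True if the ORF translates to the protein, ignoring one final stop."""
    translated = translate(orf)
    if translated.endswith("*"):
        translated = translated[:-1]
    # An internal stop codon means the ORF cannot encode the full protein.
    return "*" not in translated and translated == protein.upper()


# Toy example (hypothetical sequence).
print(translate("ATGGCCAAATGA"))
print(orf_matches_protein("ATGGCCAAATGA", "MAK"))
```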
Ryberg, Martin; Kristiansson, Erik; Sjökvist, Elisabet; Nilsson, R Henrik
2009-01-01
The environmental and distributional data associated with fungal internal transcribed spacer (ITS) sequences in GenBank are investigated and a new web-based tool with which these sequences can be explored is introduced. All fungal ITS sequences in GenBank were classified as either identified to species level or insufficiently identified and compared using BLAST. The results are made available as a biweekly updated web service that can be queried to retrieve all insufficiently identified sequences (IIS) associated with any fungal genus. The most commonly available annotation items in GenBank are isolation source (55%); country of origin (50%); and specific host (38%). The molecular sampling of fungi shows a bias towards North America, Europe, China, and Japan whereas vast geographical areas remain effectively unexplored. Mycorrhizal and parasitic genera are on average associated with more IIS than are saprophytic taxa. Glomus, Alternaria, and Tomentella are the genera represented by the highest number of insufficiently identified ITS sequences in GenBank. The web service presented (http://andromeda.botany.gu.se/emerencia.html#genus_search) offers new means, particularly for mycorrhizal and plant pathogenic fungi, to examine the IIS in GenBank in a taxon-oriented framework and to explore their metadata in an easily accessible and time-efficient manner.
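The first step described above, sorting GenBank entries into those identified to species level and those insufficiently identified, could be approximated from the organism name alone. The abstract does not state the criteria actually used, so the rules below (qualifiers such as "sp.", "cf.", "uncultured", or a bare genus name) are a hypothetical heuristic for illustration only.

```python
import re

# Hypothetical markers of an incomplete identification; the study's
# actual criteria are not given in the abstract.
_VAGUE = re.compile(
    r"\b(sp|cf|aff)\.|\b(uncultured|unidentified|environmental)\b",
    re.IGNORECASE,
)


def identified_to_species(organism: str) -> bool:
    """Guess whether an organism name is a full species-level binomial."""
    name = organism.strip()
    if _VAGUE.search(name):
        return False
    parts = name.split()
    # Expect at least "Genus epithet" with a lowercase specific epithet.
    return len(parts) >= 2 and parts[0][0].isupper() and parts[1].islower()


for label in ("Inocybe rimosa", "uncultured Inocybe", "Glomus sp.", "Tomentella"):
    status = "identified" if identified_to_species(label) else "insufficiently identified"
    print(f"{label}: {status}")
```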
Buruiana, Adrian M; Mihali, Ciprian V; Popescu, Cristina
2015-12-01
Dermatophyte infection of the hair in the blepharo-ciliary area is a rare clinical entity. This infection is often misdiagnosed or underdiagnosed, being mistakenly attributed to a bacterial origin. Herein, we present a rare case of tinea blepharo-ciliaris associated with tinea barbae in an adult male. Considering the two lesions of the patient, mycological examination was performed by phenotypic methods, including environmental scanning electron microscopy. A zoophilic strain of Trichophyton interdigitale was identified as the etiological agent by direct examination of the hair, primary culture analysis of the developed colonies and PCR sequencing of the ITS1 region of the rDNA gene. Homology search showed 100% similarity with T. interdigitale (GenBank accession number: KC595993), Arthroderma vanbreuseghemii (GenBank accession number: JQ407190) and a zoophilic strain of T. interdigitale (GenBank accession number: AY062119.1). Four weeks of oral and local treatment with itraconazole (100 mg twice a day) and fluconazole 0.3% (eyedrops) induced complete remission. To our knowledge, this is the first report of tinea blepharo-ciliaris associated with tinea barbae in Romania.
Franzo, Giovanni; Cortey, Martí; Olvera, Alex; Novosel, Dinko; Castro, Alessandra Marnie Martins Gomes De; Biagini, Philippe; Segalés, Joaquim; Drigo, Michele
2015-08-28
PCV2 has emerged as one of the most devastating viral infections of swine farming, causing a relevant economic impact due to direct losses and the expense of control strategies. Epidemiological and experimental studies have shown that genetic diversity potentially affects the virulence of PCV2. The growing number of PCV2 complete genomes and partial sequences available in GenBank has called the accepted PCV2 classification into question. Nine hundred seventy-five PCV2 complete genomes and 1,270 ORF2 sequences available from GenBank were subjected to recombination, PASC and phylogenetic analyses, and the results were compared with the previous classification scheme. The outcome of these analyses favors the recognition of four genotypes on the basis of ORF2 sequences, namely PCV2a, PCV2b, PCV2c and PCV2d-mPCV2b. To deal with the difficulty of establishing an unambiguous classification, and given the impossibility of defining a p-distance cut-off, a set of reference sequences that could be used in further phylogenetic studies for PCV2 genotyping was established. Because extensive phylogenetic analyses are time-consuming and often impracticable during routine diagnostic activity, ORF2 nucleotide positions adequately conserved in the reference sequences were identified and reported to allow quick genotype differentiation. Globally, the present work provides an updated scenario of the distribution of PCV2 genotypes and, based on the limits of the previous classification criteria, proposes new rapid and effective schemes for differentiating the four defined PCV2 genotypes.
Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz
2015-01-01
Aim To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Methods Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc) were analyzed. ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Results Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Conclusion Molecular analysis identified a larger number of species than sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material. PMID:25727040
USDA-ARS's Scientific Manuscript database
2 new gene sequences were identified from workers of Solenopsis invicta, and submitted to the National Center for Biotechnology Information GenBank. GenBank accession numbers are HM130684-HM130685. This information will provide scientists with genetic tools to study the populations of this ant....
USDA-ARS's Scientific Manuscript database
15 new gene sequences were identified from workers of Brachymyrmex patagonicus, and submitted to the National Center for Biotechnology Information GenBank. GenBank accession numbers are GU582126-GU582140. This information will provide scientists with genetic tools to study the development and the p...
DNA barcode data accurately assign higher spider taxa
Coddington, Jonathan A.; Agnarsson, Ingi; Cheng, Ren-Chung; Čandek, Klemen; Driskell, Amy; Frick, Holger; Gregorič, Matjaž; Kostanjšek, Rok; Kropf, Christian; Kweskin, Matthew; Lokovšek, Tjaša; Pipan, Miha; Vidergar, Nina
2016-01-01
The use of unique DNA sequences as a method for taxonomic identification is no longer fundamentally controversial, even though debate continues on the best markers, methods, and technology to use. Although both existing databanks such as GenBank and BOLD, as well as reference taxonomies, are imperfect, in best case scenarios “barcodes” (whether single or multiple, organelle or nuclear, loci) clearly are an increasingly fast and inexpensive method of identification, especially as compared to manual identification of unknowns by increasingly rare expert taxonomists. Because most species on Earth are undescribed, a complete reference database at the species level is impractical in the near term. The question therefore arises whether unidentified species can, using DNA barcodes, be accurately assigned to more inclusive groups such as genera and families—taxonomic ranks of putatively monophyletic groups for which the global inventory is more complete and stable. We used a carefully chosen test library of CO1 sequences from 49 families, 313 genera, and 816 species of spiders to assess the accuracy of genus and family-level assignment. We used BLAST queries of each sequence against the entire library and got the top ten hits. The percent sequence identity was reported from these hits (PIdent, range 75–100%). Accurate assignment of higher taxa (PIdent above which errors totaled less than 5%) occurred for genera at PIdent values >95 and families at PIdent values ≥ 91, suggesting these as heuristic thresholds for accurate generic and familial identifications in spiders. Accuracy of identification increases with numbers of species/genus and genera/family in the library; above five genera per family and fifteen species per genus all higher taxon assignments were correct. We propose that using percent sequence identity between conventional barcode sequences may be a feasible and reasonably accurate method to identify animals to family/genus. 
However, the quality of the underlying database impacts accuracy of results; many outliers in our dataset could be attributed to taxonomic and/or sequencing errors in BOLD and GenBank. It seems that an accurate and complete reference library of families and genera of life could provide accurate higher level taxonomic identifications cheaply and accessibly, within years rather than decades. PMID:27547527
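The heuristic thresholds reported in this abstract (genera at PIdent > 95, families at PIdent ≥ 91) can be written out as a small Python sketch. The function names, the dictionary layout of a BLAST hit, and the choice to use only the single best hit are assumptions made for illustration; the abstract does not describe how the authors combined the top ten hits.

```python
# Thresholds taken from the abstract above (spider CO1 test library).
GENUS_MIN_PIDENT = 95.0    # genus assignment requires PIdent strictly above this
FAMILY_MIN_PIDENT = 91.0   # family assignment requires PIdent at or above this


def assignable_rank(pident):
    """Return the most specific rank the thresholds allow, or None."""
    if pident > GENUS_MIN_PIDENT:
        return "genus"
    if pident >= FAMILY_MIN_PIDENT:
        return "family"
    return None


def assign_higher_taxon(hits):
    """Assign a genus or family from BLAST hits.

    `hits` is a list of dicts with the (hypothetical) keys
    'pident', 'genus' and 'family'. Only the best hit is used here.
    """
    if not hits:
        return None
    best = max(hits, key=lambda h: h["pident"])
    rank = assignable_rank(best["pident"])
    if rank is None:
        return None
    return rank, best[rank]


if __name__ == "__main__":
    example_hits = [
        {"pident": 93.4, "genus": "GenusA", "family": "FamilyA"},
        {"pident": 88.0, "genus": "GenusB", "family": "FamilyB"},
    ]
    print(assign_higher_taxon(example_hits))
```

These cut-offs were derived for spiders only; other groups or markers would presumably need their own calibration.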
Fietz, Katharina; Graves, Jeff A.; Olsen, Morten Tange
2013-01-01
Genetic data can provide a powerful tool for those interested in the biology, management and conservation of wildlife, but also lead to erroneous conclusions if appropriate controls are not taken at all steps of the analytical process. This particularly applies to data deposited in public repositories such as GenBank, whose utility relies heavily on the assumption of high data quality. Here we report on an in-depth reassessment and comparison of GenBank and chromatogram mtDNA sequence data generated in a previous study of Baltic grey seals. By re-editing the original chromatogram data we found that approximately 40% of the grey seal mtDNA haplotype sequences posted in GenBank contained errors. The re-analysis of the edited chromatogram data yielded overall similar results and conclusions as the original study. However, a significantly different outcome was observed when using the uncorrected dataset based on the GenBank haplotypes. We therefore suggest disregarding the existing GenBank data and instead using the correct haplotypes reported here. Our study serves as an illustrative example reiterating the importance of quality control through every step of a research project, from data generation to interpretation and submission to an online repository. Errors conducted in any step may lead to biased results and conclusions, and could impact management decisions. PMID:23977362
Mitogenome metadata: current trends and proposed standards.
Strohm, Jeff H T; Gwiazdowski, Rodger A; Hanner, Robert
2016-09-01
Mitogenome metadata are descriptive terms about the sequence and its specimen that allow both to be digitally discoverable and interoperable. Here, we review a sampling of mitogenome metadata published in the journal Mitochondrial DNA between 2005 and 2014. Specifically, we have focused on a subset of metadata fields that are available for GenBank records, and specified by the Genomics Standards Consortium (GSC) and other biodiversity metadata standards; and we assessed their presence across three main categories: collection, biological and taxonomic information. To do this we reviewed 146 mitogenome manuscripts, and their associated GenBank records, and scored them for 13 metadata fields. We also explored the potential for mitogenome misidentification using their sequence diversity and taxonomic metadata on the Barcode of Life Datasystems (BOLD). For this, we focused on all Lepidoptera and Perciformes mitogenomes included in the review, along with additional mitogenome sequence data mined from GenBank. Overall, we found that none of the 146 mitogenome projects provided all the metadata we looked for, and only 17 projects provided at least one category of metadata across the three main categories. Comparisons using mtDNA sequences from BOLD suggest that some mitogenomes may be misidentified. Lastly, we appreciate the research potential of mitogenomes announced through this journal, and we conclude with a suggestion of 13 metadata fields, available on GenBank, that, if provided in a mitogenome's GenBank record, would increase its research value.
The complete genome sequence and proteomics of Yersinia pestis phage Yep-phi.
Zhao, Xiangna; Wu, Weili; Qi, Zhizhen; Cui, Yujun; Yan, Yanfeng; Guo, Zhaobiao; Wang, Zuyun; Wang, Hu; Deng, Haijun; Xue, Yan; Chen, Weijun; Wang, Xiaoyi; Yang, Ruifu
2011-01-01
Yep-phi, a lytic phage of Yersinia pestis, was isolated in China and is routinely used as a diagnostic phage for the identification of the plague pathogen. Yep-phi has an isometric hexagonal head containing dsDNA and a short non-contractile conical tail. In this study, we sequenced the Yep-phi genome (GenBank accession no. HQ333270) and performed proteomics analysis. The genome consists of 38,616 bp of DNA, including direct terminal repeats of 222 bp, and is predicted to contain 45 ORFs. Most structural proteins were identified by proteomics analysis. Compared with the three available genome sequences of lytic phages for Y. pestis, the phages could be divided into two subgroups. Yep-phi displays marked homology to the bacteriophages Berlin (GenBank accession no. AM183667) and Yepe2 (GenBank accession no. EU734170), and these comprise one subgroup. The other subgroup is represented by bacteriophage ΦA1122 (GenBank accession no. AY247822). Potential recombination was detected among the Yep-phi subgroup.
Numerical classification of coding sequences
NASA Technical Reports Server (NTRS)
Collins, D. W.; Liu, C. C.; Jukes, T. H.
1992-01-01
DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
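The two numerical descriptors proposed in this abstract, the base count (written like A76C158G121T74) and the codon table, are simple to compute. The following Python sketch shows one way to do so; the function names are hypothetical, and the sketch assumes an in-frame coding sequence containing only A, C, G and T.

```python
from collections import Counter
from itertools import product


def base_count_signature(seq):
    """Return the base count in the form used in the abstract, e.g. 'A76C158G121T74'."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts.get(base, 0)}" for base in "ACGT")


def codon_table(seq):
    """Return counts for all 64 codons, reading the sequence in frame from position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    all_codons = ("".join(c) for c in product("ACGT", repeat=3))
    return {codon: counts.get(codon, 0) for codon in all_codons}


if __name__ == "__main__":
    cds = "ATGAAAGCCTAA"
    print(base_count_signature(cds))
    table = codon_table(cds)
    print({codon: n for codon, n in table.items() if n})
```

Because two entries with identical base counts or codon tables are candidates for being the same sequence, such signatures could serve as cheap keys for spotting redundant database entries before any alignment is run.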
Setoh, Yin Xiang; Amarilla, Alberto A; Peng, Nias Y; Slonchak, Andrii; Periasamy, Parthiban; Figueiredo, Luiz T M; Aquino, Victor H; Khromykh, Alexander A
2018-01-01
Rocio virus (ROCV) is an arbovirus belonging to the genus Flavivirus, family Flaviviridae. We present an updated sequence of ROCV strain SPH 34675 (GenBank: AY632542.4), the only available full genome sequence prior to this study. Using next-generation sequencing of the entire genome, we reveal substantial sequence variation from the prototype sequence, with 30 nucleotide differences amounting to 14 amino acid changes, as well as significant changes to predicted 3'UTR RNA structures. Our results present an updated and corrected sequence of a potential emerging human-virulent flavivirus uniquely indigenous to Brazil (GenBank: MF461639).
Capel, K C C; Migotto, A E; Zilberberg, C; Lin, M F; Forsman, Z; Miller, D J; Kitahara, M V
2016-09-30
Members of the azooxanthellate coral genus Tubastraea are invasive species of particular concern because they have become established and are fierce competitors in the invaded areas in many parts of the world. Pacific Tubastraea species are spreading fast throughout the Atlantic Ocean, occupying over 95% of the available substrate in some areas and out-competing native endemic species. Approximately half of all known coral species are azooxanthellate but these are seriously under-represented compared to zooxanthellate corals in terms of the availability of mitochondrial (mt) genome data. In the present study, the complete mt DNA sequences of Atlantic individuals of the invasive scleractinian species Tubastraea coccinea and Tubastraea tagusensis were determined and compared to the GenBank reference sequence available for a Pacific "T. coccinea" individual. At 19,094 bp (compared to 19,070 bp for the GenBank specimen), the mt genomes assembled for the Atlantic T. coccinea and T. tagusensis were among the longest sequences determined to date for "Complex" scleractinians. Comparisons of genome data showed that the "T. coccinea" sequence deposited in GenBank was more closely related to that from Dendrophyllia arbuscula than to the Atlantic Tubastraea spp., in terms of genome length and base pair similarities. This was confirmed by phylogenetic analysis, suggesting that the former was misidentified and might actually be a member of the genus Dendrophyllia. In addition, although in general the COX1 locus has a slow evolutionary rate in Scleractinia, it was the most variable region of the Tubastraea mt genome and can be used as a marker for genus or species identification. Given the limited data available for azooxanthellate corals, the results presented here represent an important contribution to our understanding of phylogenetic relationships and the evolutionary history of the Scleractinia. Copyright © 2016 Elsevier B.V. All rights reserved.
USDA-ARS's Scientific Manuscript database
5 new gene sequences were identified from workers of Caribbean crazy ant, Nylanderia cf. pubens, and submitted to the National Center for Biotechnology Information GenBank. GenBank accession numbers are JF815100-JF815104. This information will provide scientists with genetic tools to study the popu...
Complete cDNAs from Nylanderia sp. nr. pubens (Hymenoptera: Formicidae). GenBank GU980916-GU980928.
USDA-ARS's Scientific Manuscript database
13 new gene sequences were identified from workers of Rasberry crazy ant, Nylanderia sp.nr. pubens, and submitted to the National Center for Biotechnology Information GenBank. GenBank accession numbers are GU980916-GU980928. This information will provide scientists with genetic tools to study the p...
Genome sequencing and annotation of Serratia sp. strain TEL.
Lephoto, Tiisetso E; Gray, Vincent M
2015-12-01
We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000.
Unique DNA database has helped advance scientific discoveries worldwide. Since its origin 25 years ago, the database of nucleic acid sequences known as GenBank has ...
Khamis, F M; Rwomushana, I; Ombura, L O; Cook, G; Mohamed, S A; Tanga, C M; Nderitu, P W; Borgemeister, C; Sétamou, M; Grout, T G; Ekesi, S
2017-12-05
Citrus (Citrus spp.) production continues to decline in East Africa, particularly in Kenya and Tanzania, the two major producers in the region. This decline is attributed to pests and diseases including infestation by the African citrus triozid, Trioza erytreae (Del Guercio) (Hemiptera: Triozidae). Besides direct feeding damage by adults and immature stages, T. erytreae is the main vector of 'Candidatus Liberibacter africanus', the causative agent of Greening disease in Africa, closely related to Huanglongbing. This study aimed to generate a novel barcode reference library for T. erytreae in order to use DNA barcoding as a rapid tool for accurate identification of the pest to aid phytosanitary measures. Triozid samples were collected from citrus orchards in Kenya, Tanzania, and South Africa and from alternative host plants. Sequences generated from populations in the study showed very low variability within acceptable ranges of species. All samples analyzed were linked to T. erytreae of GenBank accession number KU517195. Phylogeny of samples in this study and other Trioza reference species was inferred using the Maximum Likelihood method. The phylogenetic tree was paraphyletic with two distinct branches. The first branch had two clusters: 1) cluster of all populations analyzed with GenBank accession of T. erytreae and 2) cluster of all the other GenBank accession of Trioza species analyzed except T. incrustata Percy, 2016 (KT588307.1), T. eugeniae Froggatt (KY294637.1), and T. grallata Percy, 2016 (KT588308.1) that occupied the second branch as outgroups forming sister clade relationships. These results were further substantiated with genetic distance values and principal component analyses. © The Author(s) 2017. Published by Oxford University Press on behalf of Entomological Society of America.
Lammers, Youri; Peelen, Tamara; Vos, Rutger A; Gravendeel, Barbara
2014-02-06
Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild. 
The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker. PMID:24502833
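The matching step described in this abstract (taxonomic identifications checked against CITES names, with a blacklist and name reconciliation) can be sketched in Python as follows. This is an illustration of the idea only and not the HTS barcode checker's actual code: the function names, the tuple layout of an identification, the synonym mapping, and all species names and accession strings in the example are hypothetical placeholders.

```python
def normalize(name):
    """Lower-case a taxon name and collapse whitespace so names compare reliably."""
    return " ".join(name.strip().lower().split())


def flag_cites_hits(identifications, cites_names, synonyms=None, blacklist=frozenset()):
    """Return (read_id, reconciled_name) pairs whose taxon is on the CITES list.

    identifications: iterable of (read_id, accession, taxon_name) tuples,
        e.g. parsed from BLAST output (hypothetical layout).
    cites_names: iterable of names listed on the CITES appendices.
    synonyms: optional mapping from a reference-database name to the CITES name.
    blacklist: accessions known to be misannotated; hits to them are ignored.
    """
    synonym_map = {normalize(k): normalize(v) for k, v in (synonyms or {}).items()}
    cites = {normalize(n) for n in cites_names}
    flagged = []
    for read_id, accession, taxon in identifications:
        if accession in blacklist:
            continue  # skip hits to sequences flagged as incorrectly annotated
        name = normalize(taxon)
        name = synonym_map.get(name, name)  # reconcile differing taxonomies
        if name in cites:
            flagged.append((read_id, name))
    return flagged


if __name__ == "__main__":
    # Placeholder data, not real species or accessions.
    hits = [
        ("read1", "ACC0001", "Exampleus protectus"),
        ("read2", "ACC0002", "Exampleus  oldname"),
        ("read3", "ACC0003", "Exampleus protectus"),
        ("read4", "ACC0004", "Commonus unlistedus"),
    ]
    print(flag_cites_hits(
        hits,
        cites_names=["Exampleus protectus", "Exampleus newname"],
        synonyms={"Exampleus oldname": "Exampleus newname"},
        blacklist={"ACC0003"},
    ))
```

In this sketch the blacklist addresses false positives from misannotated reference sequences, and the synonym mapping addresses false negatives caused by taxonomic re-assignment, the two failure modes named in the abstract.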
Further insight into genetic variation and haplotype diversity of Cherry virus A from China
Candresse, Thierry; He, Zhen; Li, Shifang; Ma, Yuxin
2017-01-01
Cherry virus A (CVA) infection appears to be prevalent in cherry plantations worldwide. In this study, the diversity of CVA isolates from 31 cherry samples collected from different orchards around Bohai Bay in northeastern China was analyzed. The complete genome of one of these isolates, ChYT52, was found to be 7,434 nt in length excluding the poly (A) tail. It shares 79.9–98.7% identity with CVA genome sequences in GenBank, while its RdRp core is more divergent (79.1–90.7% nt identity), likely as a consequence of a recombination event. Phylogenetic analysis of the ChYT52 genome with CVA genomes in GenBank resulted in at least 7 major clusters plus 5 additional isolates standing alone at the ends of long branches, suggesting further phylogroup diversity in CVA. The genetic diversity of Chinese CVA isolates from 31 samples and GenBank sequences was analyzed in three genomic regions that correspond to the coat protein, the RNA-dependent RNA polymerase core region, and the movement protein genes. With few exceptions, likely reflecting further impact of recombination, the various trees are largely congruent, indicating that each region provides valuable phylogenetic information. In all cases, the majority of the Chinese CVA isolates clustered in phylogroup I, together with the X82547 reference sequence from Germany. Statistically significant negative values were obtained for Tajima’s D in the three genes for phylogroup I, suggesting that it may be undergoing a period of expansion. There was considerable haplotype diversity in the individual samples, and more than half of the samples contained genetically diverse haplotypes belonging to different phylogroups. In addition, a number of statistically significant recombination events were detected in CVA genomes or in the partial genomic sequences, indicating an important contribution of recombination to CVA evolution.
This work provides a foundation for elucidation of the epidemiological characteristics and evolutionary history of CVA populations. PMID:29020049
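For readers unfamiliar with the Tajima's D statistic mentioned in this abstract, the following Python sketch computes it from a set of aligned sequences using the standard formula (Tajima, 1989). It is a minimal illustration and not the software used in the study; the function name is hypothetical, gaps and ambiguous bases are not handled specially, and no significance test is performed.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Compute Tajima's D for equal-length aligned sequences.

    Returns None when there are no segregating sites, since D is undefined then.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("at least 4 sequences are recommended for Tajima's D")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        return None

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


if __name__ == "__main__":
    # Toy alignment in which every variable site is a singleton.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
```

An excess of rare variants, as in the toy alignment, pulls the mean pairwise difference below the expectation from the number of segregating sites and gives a negative D, which is the pattern the abstract interprets as consistent with population expansion.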
Corruption of genomic databases with anomalous sequence.
Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L
1992-06-11
We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%.
Park, Chung Hwa; Song, Eun Gyeong; Ryu, Ki Hyun
2018-01-01
Co-infection with two virus species was previously reported in some cactus plants. Here, we showed that Notocactus leninghausii f. cristatus can be co-infected with six different viruses: cactus mild mottle virus (CMMoV)-Nl, cactus virus X (CVX)-Nl, pitaya virus X (PiVX)-Nl, rattail cactus necrosis-associated virus (RCNaV)-Nl, schlumbergera virus X (SchVX)-Nl, and zygocactus virus X (ZyVX)-Nl. The coat protein sequences of these viruses were compared with those of previously reported viruses. CMMoV-Nl, CVX-Nl, PiVX-Nl, RCNaV-Nl, SchVX-Nl, and ZyVX-Nl showed the greatest nucleotide sequence homology to CMMoV-Kr (99.8% identity, GenBank accession NC_011803), CVX-Jeju (77.5% identity, GenBank accession LC12841), PiVX-P37 (98.4% identity, GenBank accession NC_024458), RCNaV (99.4% identity, GenBank accession NC_016442), SchVX-K11 (95.7% identity, GenBank accession NC_011659), and ZyVX-B1 (97.9% identity, GenBank accession NC_006059), respectively. This study is the first report of co-infection with six virus species in N. leninghausii f. cristatus in South Korea. PMID:29422789
Aung, Win Pa Pa; Htoon, Thi Thi; Tin, Htay Htay; Thinn, Kyi Kyi; Sanpool, Oranuch; Jongthawin, Jurairat; Sadaow, Lakkhana; Phosuk, Issarapong; Rodpai, Rutchanee; Intapan, Pewpan M; Maleewong, Wanchai
2017-01-01
Opisthorchis viverrini is endemic in the South East Asian region, especially in Cambodia, Lao People's Democratic Republic, Vietnam and Thailand, but there have been no previous records from Myanmar. During stool surveys of rural populations in three regions of Lower Myanmar, Opisthorchis-like eggs were found in 34 out of 364 (9.3%) participants by stool microscopy after using the modified formalin-ether concentration technique. DNA was extracted from these positive stool samples and a portion of the mitochondrial cytochrome c oxidase subunit I (cox1) gene was amplified using the polymerase chain reaction and then sequenced. DNA sequences, successfully obtained from 18 of 34 positive samples (Bago Region, n = 13; Mon State, n = 3; Yangon Region, n = 2), confirmed that the eggs were of O. viverrini. Sequences showed 99.7% identity with O. viverrini mitochondrial cox1 (GenBank accession no. JF739555) but 95%, 88.7%, 82.6% and 81.4% identities with those of Opisthorchis lobatus from Lao People's Democratic Republic (GenBank accession nos. HQ328539-HQ328541), Metorchis orientalis from China (KT239342), Clonorchis sinensis from China (JF729303) and Opisthorchis felineus from Russia (EU921260), respectively. When aligned with other Opisthorchiidae trematodes, the sequences showed 81% similarity with Metorchis bilis from the Czech Republic (GenBank accession nos. KT740966, KT740969, KT740970) and Slovakia (GenBank accession nos. KT740971-KT740973), 84.6% similarity with Metorchis xanthosomus from the Czech Republic (GenBank accession no. KT740974), 78.6% similarity with M. xanthosomus from Poland (GenBank accession no. KT740968) and 82.2% similarity with Euamphimerus pancreaticus from the Czech Republic (GenBank accession no. KT740975). This study demonstrated, for the first time, O. viverrini from rural people in Myanmar using molecular methods and is an urgent call for surveillance and control activities against opisthorchiasis in Myanmar.
Molecular phylogeny of some avian species using Cytochrome b gene sequence analysis
Awad, A; Khalil, S. R; Abd-Elhakim, Y. M
2015-01-01
Veritable identification and differentiation of avian species is a vital step in conservation, taxonomic, forensic, legal and other ornithological interventions. Therefore, this study involved the application of a molecular approach to identify some avian species, i.e. Chicken (Gallus gallus), Muscovy duck (Cairina moschata), Japanese quail (Coturnix japonica), Laughing dove (Streptopelia senegalensis), and Rock pigeon (Columba livia). Genomic DNA was extracted from blood samples and a partial sequence of the mitochondrial cytochrome b gene (358 bp) was amplified and sequenced using universal primers. Sequence alignment and phylogenetic analyses were performed using the CLC Main Workbench program. The obtained five sequences were deposited in GenBank and compared with those previously registered in GenBank. The similarity percentage was 88.60% between Gallus gallus and Coturnix japonica and 80.46% between Gallus gallus and Columba livia. The percentage of identity between the studied species and GenBank species ranged from 77.20% (Columba oenas and Anas platyrhynchos) to 100% (Gallus gallus and Gallus sonneratii, Coturnix coturnix and Coturnix japonica, Meleagris gallopavo and Columba livia). Amplification of the partial sequence of the mitochondrial cytochrome b gene proved to be practical for unambiguous identification of an avian species. PMID:27175180
cDNAs from Nylanderia sp nr pubens (Hymenoptera: Formicidae)
USDA-ARS's Scientific Manuscript database
7 new gene sequences were identified from workers of Rasberry crazy ant, Nylanderia sp.nr. pubens, and submitted to the National Center for Biotechnology Information GenBank. GenBank accession numbers are HQ636472-HQ636478. This information will provide scientists with genetic tools to study the pop...
Taxonomic evaluation of selected Ganoderma species and database sequence validation
Jargalmaa, Suldbold; Eimes, John A.; Park, Myung Soo; Park, Jae Young; Oh, Seung-Yoon
2017-01-01
Species in the genus Ganoderma include several ecologically important and pathogenic fungal species whose medicinal and economic value is substantial. Due to the highly similar morphological features within the Ganoderma, identification of species has relied heavily on DNA sequencing using BLAST searches, which are only reliable if the GenBank submissions are accurately labeled. In this study, we examined 113 specimens collected from 1969 to 2016 from various regions in Korea using morphological features and multigene analysis (internal transcribed spacer, translation elongation factor 1-α, and the second largest subunit of RNA polymerase II). These specimens were identified as four Ganoderma species: G. sichuanense, G. cf. adspersum, G. cf. applanatum, and G. cf. gibbosum. With the exception of G. sichuanense, these species were difficult to distinguish based solely on morphological features. However, phylogenetic analysis at three different loci yielded concordant phylogenetic information, and supported the four species distinctions with high bootstrap support. A survey of over 600 Ganoderma sequences available on GenBank revealed that 65% of sequences were either misidentified or ambiguously labeled. Here, we suggest corrected annotations for GenBank sequences based on our phylogenetic validation and provide updated global distribution patterns for these Ganoderma species. PMID:28761785
2013-01-01
Background Accurate and complete identification of mobile elements is a challenging task in the current era of sequencing, given their large numbers and frequent truncations. Group II intron retroelements, which consist of a ribozyme and an intron-encoded protein (IEP), are usually identified in bacterial genomes through their IEP; however, the RNA component that defines the intron boundaries is often difficult to identify because of a lack of strong sequence conservation corresponding to the RNA structure. Compounding the problem of boundary definition is the fact that a majority of group II intron copies in bacteria are truncated. Results Here we present a pipeline of 11 programs that collect and analyze group II intron sequences from GenBank. The pipeline begins with a BLAST search of GenBank using a set of representative group II IEPs as queries. Subsequent steps download the corresponding genomic sequences and flanks, filter out non-group II introns, assign introns to phylogenetic subclasses, filter out incomplete and/or non-functional introns, and assign IEP sequences and RNA boundaries to the full-length introns. In the final step, the redundancy in the data set is reduced by grouping introns into sets of ≥95% identity, with one example sequence chosen to be the representative. Conclusions These programs should be useful for comprehensive identification of group II introns in sequence databases as data continue to rapidly accumulate. PMID:24359548
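The final redundancy-reduction step described above (grouping introns into sets of ≥95% identity with one representative each) can be sketched as a greedy clustering. This is an illustration under stated assumptions, not the pipeline's actual code: the abstract does not say how identity is computed or how the representative is chosen, so the longest-first ordering and the toy `simple_identity` function (position-by-position comparison without alignment) are hypothetical stand-ins.

```python
from typing import Callable, Dict, List


def simple_identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length.

    Assumption: a real pipeline would align the sequences first.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / longest


def cluster_by_identity(
    seqs: Dict[str, str],
    threshold: float = 0.95,
    identity: Callable[[str, str], float] = simple_identity,
) -> Dict[str, List[str]]:
    """Greedy clustering: each sequence joins the first representative it
    matches at >= threshold, otherwise it becomes a new representative."""
    clusters: Dict[str, List[str]] = {}
    # Visit longer sequences first so representatives tend to be full length.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    demo = {"a": "A" * 100, "b": "A" * 96 + "C" * 4, "c": "C" * 100}
    print(cluster_by_identity(demo))
```

Greedy clustering depends on visiting order, so a different ordering rule could yield different representatives.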
Sharifdini, Meysam; Heidari, Zahra; Hesari, Zahra; Vatandoost, Sajad; Kia, Eshrat Beigom
2017-06-01
The present study was performed to analyze molecularly the phylogenetic positions of human-infecting Trichostrongylus species in Mazandaran Province, Iran, which is an endemic area for trichostrongyliasis. DNA from 7 Trichostrongylus-infected stool samples was extracted using an in-house (IH) method. PCR amplification of the ITS2-rDNA region was performed, and products were sequenced. Phylogenetic analysis of the nucleotide sequence data was performed using MEGA 5.0 software. Six out of 7 isolates had high similarity with Trichostrongylus colubriformis, while the other one showed high homology with Trichostrongylus axei registered in GenBank reference sequences. Intra-specific variations within isolates of T. colubriformis and T. axei amounted to 0-1.8% and 0-0.6%, respectively. Trichostrongylus species obtained in the present study were in a cluster with the relevant reference sequences from previous studies. BLAST analysis indicated that there was 100% homology among all 6 ITS2 sequences of T. colubriformis in the present study and most previously registered sequences of T. colubriformis from human, sheep, and goat isolates from Iran and also human isolates from Laos, Thailand, and France. The ITS2 sequence of T. axei exhibited 99.4% homology with the human isolate of T. axei from Thailand, sheep isolates from New Zealand and Iran, and a cattle isolate from the USA.
Al-Shahrani, Sarah A; Alajmi, Reem A; Ayaad, Tahany H; Al-Shahrani, Mohammed A; Shaurub, El-Sayed H
2017-10-01
The present work aimed at investigating the genetic diversity of the head louse Pediculus humanus capitis (P. humanus capitis) among infested primary school girls at Bisha governorate, Saudi Arabia, based on the sequence of the mitochondrial cytochrome b (mt cyt b) gene of 121 P. humanus capitis adults. Additionally, the prevalence of pediculosis capitis was surveyed. The sequencing results were compared with sequences of previously genotyped human head lice. Phylogenetic tree analysis showed 100% identity of 26 louse specimens with clade A (prevalent worldwide) in the GenBank database. Fifty louse individuals showed 99.8% similarity with the same clade A reference, differing by a single base pair. In addition, 22 louse individuals showed 99.8% identity with the clade B reference (prevalent in North and Central America, Europe, and Australia), with individual variation at two base pairs. Moreover, 14 louse individual sequences showed 99.4% identity, with three base pair differences. It was concluded that moderate pediculosis (~13%) prevailed among the female students of the primary schools. Prevalence depended on age and hair texture (straight or curly). The P. humanus capitis specimens genotyped belonged to clades A and B.
Comparative analysis of myostatin gene and promoter sequences of Qinchuan and Red Angus cattle.
He, Y L; Wu, Y H; Quan, F S; Liu, Y G; Zhang, Y
2013-09-04
To better understand the function of the myostatin gene and its promoter region in bovine, we amplified and sequenced the myostatin gene and promoter from the blood of Qinchuan and Red Angus cattle by using polymerase chain reaction. The sequences of Qinchuan and Red Angus cattle were compared with those of other cattle breeds available in GenBank. Exon splice sites were confirmed by mRNA sequencing. Compared to the published sequence (GenBank accession No. AF320998), 69 single nucleotide polymorphisms (SNPs) were identified in the Qinchuan myostatin gene, only one of which was an insertion mutation in Qinchuan cattle. There was a 16-bp insertion in the first 705-bp intron in 3 Qinchuan cattle. A total of 7 SNPs were identified in exon 3, in which the mutation occurred in the third base of the codon and was synonymous. On comparing the Qinchuan myostatin gene sequence to that of Red Angus cattle, a total of 50 SNPs were identified in the first and third exons. In addition, there were 18 SNPs identified in the Qinchuan cattle promoter region compared with those of other cattle breeds (GenBank accession No. AF348479), but only 14 SNPs when compared to the Red Angus cattle myostatin promoter region.
Nilsson, R Henrik; Kristiansson, Erik; Ryberg, Martin; Larsson, Karl-Henrik
2005-07-18
During the last few years, DNA sequence analysis has become one of the primary means of taxonomic identification of species, particularly so for species that are minute or otherwise lack distinct, readily obtainable morphological characters. Although the number of sequences available for comparison in public databases such as GenBank increases exponentially, only a minuscule fraction of all organisms have been sequenced, leaving taxon sampling a momentous problem for sequence-based taxonomic identification. When querying GenBank with a set of unidentified sequences, a considerable proportion typically lack fully identified matches, forming an ever-mounting pile of sequences that the researcher will have to monitor manually in the hope that new, clarifying sequences have been submitted by other researchers. To alleviate these concerns, a project to automatically monitor select unidentified sequences in GenBank for taxonomic progress through repeated local BLAST searches was initiated. Mycorrhizal fungi--a field where species identification often is prohibitively complex--and the much used ITS locus were chosen as test bed. A Perl script package called emerencia is presented. On a regular basis, it downloads select sequences from GenBank, separates the identified sequences from those insufficiently identified, and performs BLAST searches between these two datasets, storing all results in an SQL database. On the accompanying web-service http://emerencia.math.chalmers.se, users can monitor the taxonomic progress of insufficiently identified sequences over time, either through active searches or by signing up for e-mail notification upon disclosure of better matches. Other search categories, such as listing all insufficiently identified sequences (and their present best fully identified matches) publication-wise, are also available. 
The ever-increasing use of DNA sequences for identification purposes largely falls back on the assumption that public sequence databases contain a thorough sampling of taxonomically well-annotated sequences. Taxonomy, held by some to be an old-fashioned trade, has accordingly never been more important. emerencia does not automate the taxonomic process, but it does allow researchers to focus their efforts elsewhere than countless manual BLAST runs and arduous sieving of BLAST hit lists. The emerencia system is available on an open source basis for local installation with any organism and gene group as targets.
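The step in which emerencia "separates the identified sequences from those insufficiently identified" can be illustrated with a small sketch. The package itself is written in Perl and its actual criteria are not given in the abstract; the marker list, function names, and example accessions below are hypothetical and serve only to show the idea of splitting records by how complete their organism name is.

```python
import re

# Hypothetical markers of an incomplete species-level name. These are an
# assumption for illustration, not the rules used by emerencia.
_UNIDENTIFIED = re.compile(
    r"\b(uncultured|unidentified|environmental sample|sp\.|cf\.|aff\.)",
    re.IGNORECASE,
)


def is_fully_identified(organism: str) -> bool:
    """True if the name looks like a complete binomial with no hedging marker."""
    words = organism.split()
    if len(words) < 2:
        return False  # genus-only or higher-rank names count as insufficient
    return _UNIDENTIFIED.search(organism) is None


def split_records(records):
    """Split (accession, organism) pairs into identified and insufficient lists."""
    identified, insufficient = [], []
    for accession, organism in records:
        target = identified if is_fully_identified(organism) else insufficient
        target.append(accession)
    return identified, insufficient


if __name__ == "__main__":
    demo = [
        ("X1", "Amanita muscaria"),
        ("X2", "uncultured ectomycorrhizal fungus"),
        ("X3", "Russula sp. B12"),
        ("X4", "Fungi"),
    ]
    print(split_records(demo))
```

The insufficiently identified set would then serve as BLAST queries against the identified set, as the abstract describes.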
Véliz, David; Vega-Retter, Caren; Quezada-Romegialli, Claudio
2016-01-01
The complete sequence of the mitochondrial genome for the Chilean silverside Basilichthys microlepidotus is reported for the first time. The entire mitochondrial genome was 16,544 bp in length (GenBank accession no. KM245937); gene composition and arrangement conformed to that reported for most fishes, with the typical structure of 2 rRNAs, 13 protein-coding genes, 22 tRNAs and a non-coding region. The assembled mitogenome was validated against sequences of COI and Control Region previously sequenced in our lab, functional genes from RNA-Seq data for the same species and the mitogenomes of two other atherinopsid species available in GenBank.
Dessimoz, Christophe; Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro
2011-09-01
Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references. PMID:21712341
UniPrime2: a web service providing easier Universal Primer design.
Boutros, Robin; Stokes, Nicola; Bekaert, Michaël; Teeling, Emma C
2009-07-01
The UniPrime2 web server is a publicly available online resource which automatically designs large sets of universal primers when given a gene reference ID or Fasta sequence input by a user. UniPrime2 works by automatically retrieving and aligning homologous sequences from GenBank, identifying regions of conservation within the alignment, and generating suitable primers that can be used to amplify variable genomic regions. In essence, UniPrime2 is a suite of publicly available software packages (Blastn, T-Coffee, GramAlign, Primer3), which reduces the laborious process of primer design, by integrating these programs into a single software pipeline. Hence, UniPrime2 differs from previous primer design web services in that all steps are automated, linked, saved and phylogenetically delimited, only requiring a single user-defined gene reference ID or input sequence. We provide an overview of the web service and wet-laboratory validation of the primers generated. The system is freely accessible at: http://uniprime.batlab.eu. UniPrime2 is licenced under a Creative Commons Attribution Noncommercial-Share Alike 3.0 Licence.
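The step of "identifying regions of conservation within the alignment" can be sketched as a scan for runs of identical, ungapped columns long enough to hold a primer. This is a simplified illustration and not UniPrime2's algorithm: requiring 100% column conservation, the default minimum length of 18, and the function name are all assumptions made here.

```python
def conserved_regions(alignment, min_len=18):
    """Return (start, end) half-open column ranges where every sequence
    carries the same non-gap character for at least min_len columns.

    Assumption: all rows come from one multiple alignment, so they have
    equal length and use '-' for gaps.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have equal length")

    regions = []
    start = None
    for i in range(length):
        column = {row[i].upper() for row in alignment}
        conserved = len(column) == 1 and "-" not in column
        if conserved and start is None:
            start = i  # a conserved run begins here
        elif not conserved and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        regions.append((start, length))
    return regions


if __name__ == "__main__":
    demo = ["ACGTACGTTT", "ACGTACGATT", "ACGTACG-TT"]
    print(conserved_regions(demo, min_len=3))
```

A pair of such regions flanking a variable stretch would be the natural candidates to pass to a primer-design tool.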
Heinrichs, Guido; de Hoog, G. Sybren
2012-01-01
Herpotrichiellaceous black yeasts and relatives comprise severe pathogens flanked by nonpathogenic environmental siblings. Reliable identification by conventional methods is notoriously difficult. Molecular identification is hampered by the sequence variability in the internal transcribed spacer (ITS) domain caused by difficult-to-sequence homopolymeric regions and by poor taxonomic attribution of sequences deposited in GenBank. Here, we present a potential solution using short barcode identifiers (27 to 50 bp) based on ITS2 ribosomal DNA (rDNA), which allows unambiguous definition of species-specific fragments. Starting from proven sequences of ex-type and authentic strains, we were able to describe 103 identifiers. Multiple BLAST searches of these proposed barcode identifiers in GenBank revealed uniqueness for 100 taxonomic entities, whereas the three remaining identifiers each matched with two entities, but the species of these identifiers could easily be discriminated by differences in the remaining ITS regions. Using the proposed barcode identifiers, a 4.1-fold increase of 100% matches in GenBank was achieved in comparison to the classical approach using the complete ITS sequences. The proposed barcode identifiers will be made accessible for the diagnostic laboratory in a permanently updated online database, thereby providing a highly practical, reliable, and cost-effective tool for identification of clinically important black yeasts and relatives. PMID:22785187
Using populations of human and microbial genomes for organism detection in metagenomes
Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel; ...
2015-04-29
Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.
Using populations of human and microbial genomes for organism detection in metagenomes.
Ames, Sasha K; Gardner, Shea N; Marti, Jose Manuel; Slezak, Tom R; Gokhale, Maya B; Allen, Jonathan E
2015-07-01
Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected. © 2015 Ames et al.; Published by Cold Spring Harbor Laboratory Press.
Yin, Guohua; Zhang, Yuliang; Pennerman, Kayla K.; Wu, Guangxi; Hua, Sui Sheng T.; Yu, Jiujiang; Jurick, Wayne M.; Guo, Anping; Bennett, Joan W.
2017-01-01
Penicillium is a large genus of common molds with over 400 described species; however, identification of individual species is difficult, including for those species that cause postharvest rots. In this study, blue rot fungi from stored apples and pears were isolated from a variety of hosts, locations, and years. Based on morphological and cultural characteristics and partial amplification of the β-tubulin locus, the isolates were provisionally identified as several different species of Penicillium. These isolates were investigated further using a suite of molecular DNA markers and compared to sequences of the ex-type for cognate species in GenBank, and were identified as P. expansum (3 isolates), P. solitum (3 isolates), P. carneum (1 isolate), and P. paneum (1 isolate). Three of the markers we used (ITS, internal transcribed spacer rDNA sequence; benA, β-tubulin; CaM, calmodulin) were suitable for distinguishing most of our isolates from one another at the species level. In contrast, we were unable to amplify RPB2 sequences from four of the isolates. Comparison of our sequences with cognate sequences in GenBank from isolates with the same species names did not always give coherent data, reinforcing earlier studies that have shown large intraspecific variability in many Penicillium species, as well as possible errors in some sequence data deposited in GenBank. PMID:29371531
Akhtar, Nasrin; Ghauri, Muhammad A.; Iqbal, Aamira; Anwar, Munir A.; Akhtar, Kalsoom
2008-01-01
Culturable bacterial biodiversity and industrial importance of the isolates indigenous to Khewra salt mine, Pakistan was assessed. PCR amplification of 16S rDNA of isolates was carried out by using universal primers FD1 and rP1 and products were sequenced commercially. These gene sequences were compared with other gene sequences in the GenBank databases to find the closely related sequences. The alignment of these sequences with sequences available from GenBank database was carried out to construct a phylogenetic tree for these bacteria. These genes were deposited to GenBank and accession numbers were obtained. Most of the isolates belonged to different species of genus Bacillus, sharing 92-99% 16S rDNA identity with the respective type strain. Other isolates had close similarities with Escherichia coli, Staphylococcus arlettae and Staphylococcus gallinarum with 97%, 98% and 99% 16S rDNA similarity respectively. The abilities of isolates to produce industrial enzymes (amylase, carboxymethylcellulase, xylanase, cellulase and protease) were checked. All isolates were tested for starch, carboxymethylcellulose (CMC), xylan, cellulose, and casein degradation in plate assays. BPT-5, 11, 18, 19 and 25 indicated the production of copious amounts of carbohydrate- and protein-degrading enzymes. Based on this study it can be concluded that Khewra salt mine is populated with diverse bacterial groups, which are a potential source of industrial enzymes for commercial applications. PMID:24031194
GBParsy: a GenBank flatfile parser library with high speed.
Lee, Tae-Ho; Kim, Yeon-Ki; Nahm, Baek Hie
2008-07-25
GenBank flatfile (GBF) format is one of the most popular sequence file formats because of its detailed sequence features and ease of readability. To use the data in the file by a computer, a parsing process is required and is performed according to a given grammar for the sequence and the description in a GBF. Currently, several parser libraries for the GBF have been developed. However, with the accumulation of DNA sequence information from eukaryotic chromosomes, parsing a eukaryotic genome sequence with these libraries inevitably takes a long time, due to the large GBF file and its correspondingly large genomic nucleotide sequence and related feature information. Thus, there is significant need to develop a parsing program with high speed and efficient use of system memory. We developed a library, GBParsy, which was C language-based and parses GBF files. The parsing speed was maximized by using content-specified functions in place of regular expressions that are flexible but slow. In addition, we optimized an algorithm related to memory usage so that it also increased parsing performance and efficiency of memory usage. GBParsy is at least 5-100x faster than current parsers in benchmark tests. GBParsy is estimated to extract annotated information from almost 100 Mb of a GenBank flatfile for chromosomal sequence information within a second. Thus, it should be used for a variety of applications such as on-time visualization of a genome at a web site.
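The idea of parsing with content-specific string functions rather than regular expressions can be shown in a few lines. GBParsy itself is a C library; the sketch below is in Python, is not its API, and handles only three parts of a record (LOCUS name, first ACCESSION, and the ORIGIN sequence). The function name and the example record are made up for illustration.

```python
def parse_genbank_record(text: str) -> dict:
    """Extract locus name, accession, and sequence from one flatfile record
    using only startswith/split, with no regular expressions."""
    record = {"locus": None, "accession": None, "sequence": ""}
    in_origin = False
    seq_parts = []
    for line in text.splitlines():
        if line.startswith("//"):
            break  # end-of-record marker
        if in_origin:
            # Sequence lines hold a position number followed by base blocks.
            seq_parts.extend(p for p in line.split() if not p.isdigit())
        elif line.startswith("LOCUS"):
            fields = line.split()
            record["locus"] = fields[1] if len(fields) > 1 else None
        elif line.startswith("ACCESSION"):
            fields = line.split()
            record["accession"] = fields[1] if len(fields) > 1 else None
        elif line.startswith("ORIGIN"):
            in_origin = True
    record["sequence"] = "".join(seq_parts).upper()
    return record


if __name__ == "__main__":
    # Made-up record for illustration only.
    example = (
        "LOCUS       TEST0001                  12 bp    DNA     linear   UNA 01-JAN-2000\n"
        "ACCESSION   TEST0001\n"
        "ORIGIN\n"
        "        1 acgtacgtac gt\n"
        "//\n"
    )
    print(parse_genbank_record(example))
```

Feature tables, multi-line fields, and multi-record files are left out to keep the sketch short.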
Quality scores for 32,000 genomes
Land, Miriam L.; Hyatt, Doug; Jun, Se-Ran; ...
2014-12-08
More than 80% of the microbial genomes in GenBank are of ‘draft’ quality (12,553 draft vs. 2,679 finished, as of October, 2013). In this study, we have examined all the microbial DNA sequences available for complete, draft, and Sequence Read Archive genomes in GenBank as well as three other major public databases, and assigned quality scores for more than 30,000 prokaryotic genome sequences. Scores were assigned using four categories: the completeness of the assembly, the presence of full-length rRNA genes, tRNA composition and the presence of a set of 102 conserved genes in prokaryotes. Most (~88%) of the genomes had quality scores of 0.8 or better and can be safely used for standard comparative genomics analysis. We compared genomes across factors that may influence the score. We found that although sequencing depth coverage of over 100x did not ensure a better score, sequencing read length was a better indicator of sequencing quality. With few exceptions, most of the 30,000 genomes have nearly all the 102 essential genes. The score can be used to set thresholds for screening data when analyzing “all published genomes” and reference data is either not available or not applicable. The scores highlighted organisms for which commonly used tools do not perform well. This information can be used to improve tools and to serve a broad group of users as more diverse organisms are sequenced. Finally and unexpectedly, the comparison of predicted tRNAs across 15,000 high quality genomes showed that anticodons beginning with an ‘A’ (codons ending with a ‘U’) are almost non-existent, with the exception of one arginine codon (CGU); this has been noted previously in the literature for a few genomes, but not with the depth found here.
[Application of Nested PCR in the Diagnosis of Imported Plasmodium Ovale Infection].
Huang, Bing-cheng; Xu, Chao; Li, Jin; Xiao, Ting; Yin, Kun; Liu, Gong-zhen; Wang, Wei-yan; Zhao, Gui-hua; Wei, Yan-bin; Wang, Yong-bin; Zhao, Chang-lei; Wei, Qing-kuan
2015-02-01
To identify Plasmodium ovale infection by 18S rRNA gene nested PCR. Whole blood and filter paper blood samples of malaria patients in Shandong Province were collected during 2012-2013. The parasites were observed under a microscope with Giemsa staining. The genomic DNA of blood samples was extracted as PCR template. Genus- and species-specific primers were designed according to the Plasmodium 18S rRNA gene sequences. Plasmodium ovale-positive specimens were identified by nested PCR and verified by sequencing. There were 7 imported cases of P. ovale infection in the province during 2012-2013. Nested PCR results showed that the P. ovale-specific band (800 bp) was amplified in all 7 specimens. BLAST results indicated that the PCR products were consistent with the Plasmodium ovale reference sequence in GenBank. Seven imported cases of ovale malaria in Shandong Province in 2012-2013 were confirmed by nested PCR.
Budiman, Muhammad A.; Mao, Long; Wood, Todd C.; Wing, Rod A.
2000-01-01
Recently a new strategy using BAC end sequences as sequence-tagged connectors (STCs) was proposed for whole-genome sequencing projects. In this study, we present the construction and detailed characterization of a 15.0 haploid genome equivalent BAC library for the cultivated tomato, Lycopersicon esculentum cv. Heinz 1706. The library contains 129,024 clones with an average insert size of 117.5 kb and a chloroplast content of 1.11%. BAC end sequences from 1490 ends were generated and analyzed as a preliminary evaluation for using this library to develop an STC framework to sequence the tomato genome. A total of 1205 BAC end sequences (80.9%) were obtained, with an average length of 360 high-quality bases, and were searched against the GenBank database. Using a cutoff expectation value of <10^-6, and combining the results from BLASTN, BLASTX, and TBLASTX searches, 24.3% of the BAC end sequences were similar to known sequences, of which almost half (48.7%) share sequence similarities to retrotransposons and 7% to known genes. Some of the transposable element sequences were the first reported in tomato, such as sequences similar to maize transposon Activator (Ac) ORF and tobacco pararetrovirus-like sequences. Interestingly, there were no BAC end sequences similar to the highly repeated TGRI and TGRII elements. However, the majority (70.3%) of STCs did not share significant sequence similarities to any sequences in GenBank at either the DNA or predicted protein levels, indicating that a large portion of the tomato genome is still unknown. Our data demonstrate that this BAC library is suitable for developing an STC database to sequence the tomato genome. The advantages of developing an STC framework for whole-genome sequencing of tomato are discussed. [The BAC end sequences described in this paper have been deposited in the GenBank data library under accession nos. AQ367111–AQ368361.] PMID:10645957
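The library figures quoted in this abstract can be checked with a little arithmetic. The sketch below is only an illustration: it assumes that genome equivalents are computed as (clones × average insert size × nuclear fraction) / haploid genome size, which the abstract does not state, and the function names are hypothetical. Under that assumption the reported 15.0 genome equivalents would imply a haploid genome size of roughly 1,000 Mb; the source does not say which genome size the authors used.

```python
# Back-of-the-envelope check of the BAC library statistics quoted above.
# The coverage formula is an assumption; the abstract does not spell it out.

def total_insert_kb(clones: int, avg_insert_kb: float) -> float:
    """Total cloned DNA in the library, in kilobases."""
    return clones * avg_insert_kb

def implied_genome_kb(clones: int, avg_insert_kb: float,
                      organelle_fraction: float, genome_equivalents: float) -> float:
    """Haploid genome size implied by a stated coverage (hypothetical helper)."""
    nuclear_kb = total_insert_kb(clones, avg_insert_kb) * (1.0 - organelle_fraction)
    return nuclear_kb / genome_equivalents

def success_rate_percent(obtained: int, attempted: int) -> float:
    """Percentage of sequencing attempts that yielded a usable read."""
    return 100.0 * obtained / attempted

if __name__ == "__main__":
    print(total_insert_kb(129_024, 117.5))                    # total kb cloned
    print(implied_genome_kb(129_024, 117.5, 0.0111, 15.0))    # implied genome, kb
    print(round(success_rate_percent(1205, 1490), 1))         # BAC end success rate
```

The success rate reproduces the 80.9% given in the abstract, which is the only one of these values the source states directly.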
GeoBoost: accelerating research involving the geospatial metadata of virus GenBank records.
Tahsin, Tasnia; Weissenbacher, Davy; O'Connor, Karen; Magge, Arjun; Scotch, Matthew; Gonzalez-Hernandez, Graciela
2018-05-01
GeoBoost is a command-line software package developed to address sparse or incomplete metadata in GenBank sequence records that relate to the location of the infected host (LOIH) of viruses. Given a set of GenBank accession numbers corresponding to virus GenBank records, GeoBoost extracts, integrates and normalizes geographic information reflecting the LOIH of the viruses using integrated information from GenBank metadata and related full-text publications. In addition, to facilitate probabilistic geospatial modeling, GeoBoost assigns probability scores for each possible LOIH. Binaries and resources required for running GeoBoost are packed into a single zipped file and freely available for download at https://tinyurl.com/geoboost. A video tutorial is included to help users quickly and easily install and run the software. The software is implemented in Java 1.8, and supported on MS Windows and Linux platforms. gragon@upenn.edu. Supplementary data are available at Bioinformatics online.
Miyazaki, Akio; Shigaki, Toshiro; Koinuma, Hiroaki; Iwabuchi, Nozomu; Rauka, Gou Bue; Kembu, Alfred; Saul, Josephine; Watanabe, Kiyoto; Nijo, Takamichi; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou
2018-01-01
Bogia coconut syndrome (BCS) is one of the lethal yellowing (LY)-type diseases associated with phytoplasma presence that are seriously threatening coconut cultivation worldwide. It has recently emerged, and is rapidly spreading in northern parts of the island of New Guinea. BCS-associated phytoplasmas collected in different regions were compared in terms of 16S rRNA gene sequences, revealing high identity among them represented by strain BCS-Bo R . Comparative analysis of the 16S rRNA gene sequences revealed that BCS-Bo R shared less than 97.5 % similarity with other species of 'Candidatus Phytoplasma', with a maximum value of 96.08 % (with strain LY; GenBank accession no. U18747). This result indicates the necessity and propriety of a novel taxon for BCS phytoplasmas according to the recommendations of the IRPCM. Phylogenetic analysis was also conducted on 16S rRNA gene sequences, resulting in a monophyletic cluster composed of BCS-Bo R and other LY-associated phytoplasmas. Other phytoplasmas on the island of New Guinea associated with banana wilt and arecanut yellow leaf diseases showed high similarities to BCS-Bo R and were closely related to BCS phytoplasmas. Based on the uniqueness of their 16S rRNA gene sequences, a novel taxon 'Ca. Phytoplasma noviguineense' is proposed for these phytoplasmas found on the island of New Guinea, with strain BCS-Bo R (GenBank accession no. LC228755) as the reference strain. The novel taxon is described in detail, including information on the symptoms of associated diseases and additional genetic features of the secY gene and rp operon.
Zúñiga, Jose D.; Gostel, Morgan R.; Mulcahy, Daniel G.; Barker, Katharine; Hill, Asia; Sedaghatpour, Maryam; Vo, Samantha Q.; Funk, Vicki A.; Coddington, Jonathan A.
2017-01-01
Abstract The Global Genome Initiative has sequenced and released 1961 DNA barcodes for genetic samples obtained as part of the Global Genome Initiative for Gardens Program. The dataset includes barcodes for 29 plant families and 309 genera that did not have sequences flagged as barcodes in GenBank and sequences from officially recognized barcoding genetic markers meet the data standard of the Consortium for the Barcode of Life. The genetic samples were deposited in the Smithsonian Institution’s National Museum of Natural History Biorepository and their records were made public through the Global Genome Biodiversity Network’s portal. The DNA barcodes are now available on GenBank. PMID:29118648
Molecular cloning and sequencing analysis of the interferon receptor (IFNAR-1) from Columba livia.
Li, Chao; Chang, Wei Shan
2014-01-01
Partial sequence cloning of interferon receptor (IFNAR-1) of Columba livia. In order to obtain a certain length (630 bp) of gene, a pair of primers was designed according to the conserved nucleotide sequence of Gallus (EU477527.1) and Taeniopygia guttata (XM_002189232.1) IFNAR-1 gene fragment that was published by GenBank. Special primers were designed by the Race method to amplify the 3'terminal cDNA. The Columba livia IFNAR-1 displayed 88.5%, 80.5% and 73.8% nucleotide identity to Falco peregrinus, Gallus and Taeniopygia guttata, respectively. Phylogenetic analysis of the IFNAR1 gene showed that the relationship of Columba livia, Falco peregrinus and chicken had high homology. We successfully obtained a Columba livia IFNAR-1 gene partial sequence. Analysis of the genetic tree showed that the relationship of Columba livia and Falco peregrinus IFNAR-1 had high homology. This result can be used as reference for further research and practical application.
Molecular cloning and sequencing analysis of the interferon receptor (IFNAR-1) from Columba livia
Chang, Wei Shan
2014-01-01
Objective Partial sequence cloning of interferon receptor (IFNAR-1) of Columba livia. Material and methods In order to obtain a certain length (630 bp) of gene, a pair of primers was designed according to the conserved nucleotide sequence of Gallus (EU477527.1) and Taeniopygia guttata (XM_002189232.1) IFNAR-1 gene fragment that was published by GenBank. Special primers were designed by the Race method to amplify the 3'terminal cDNA. Results The Columba livia IFNAR-1 displayed 88.5%, 80.5% and 73.8% nucleotide identity to Falco peregrinus, Gallus and Taeniopygia guttata, respectively. Phylogenetic analysis of the IFNAR1 gene showed that the relationship of Columba livia, Falco peregrinus and chicken had high homology. Conclusions We successfully obtained a Columba livia IFNAR-1 gene partial sequence. Analysis of the genetic tree showed that the relationship of Columba livia and Falco peregrinus IFNAR-1 had high homology. This result can be used as reference for further research and practical application. PMID:26155117
A SSR-based genetic linkage map of cultivated peanut (Arachis hypogaea L.)
USDA-ARS's Scientific Manuscript database
The objective of this study was to construct a molecular linkage map of cultivated tetraploid peanut using simple sequence repeat (SSR) markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank. Three recombinant inbre...
Genome sequencing of the redbanded stink bug (Piezodorus guildinii)
USDA-ARS's Scientific Manuscript database
We assembled a partial genome sequence from the redbanded stink bug, Piezodorus guildinii from Illumina MiSeq sequencing runs. The sequence has been submitted and published under NCBI GenBank Accession Number JTEQ01000000. The BioProject and BioSample Accession numbers are PRJNA263369 and SAMN030997...
VecScreen_plus_taxonomy: imposing a tax(onomy) increase on vector contamination screening.
Schäffer, Alejandro A; Nawrocki, Eric P; Choi, Yoon; Kitts, Paul A; Karsch-Mizrachi, Ilene; McVeigh, Richard
2018-03-01
Nucleic acid sequences in public databases should not contain vector contamination, but many sequences in GenBank do (or did) contain vectors. The National Center for Biotechnology Information uses the program VecScreen to screen submitted sequences for contamination. Additional tools are needed to distinguish true-positive (contamination) from false-positive (not contamination) VecScreen matches. A principal reason for false-positive VecScreen matches is that the sequence and the matching vector subsequence originate from closely related or identical organisms (for example, both originate in Escherichia coli). We collected information on the taxonomy of sources of vector segments in the UniVec database used by VecScreen. We used that information in two overlapping software pipelines for retrospective analysis of contamination in GenBank and for prospective analysis of contamination in new sequence submissions. Using the retrospective pipeline, we identified and corrected over 8000 contaminated sequences in the nonredundant nucleotide database. The prospective analysis pipeline has been in production use since April 2017 to evaluate some new GenBank submissions. Data on the sources of UniVec entries were included in release 10.0 (ftp://ftp.ncbi.nih.gov/pub/UniVec/). The main software is freely available at https://github.com/aaschaffer/vecscreen_plus_taxonomy. aschaffe@helix.nih.gov. Supplementary data are available at Bioinformatics online. Published by Oxford University Press 2017. This work is written by US Government employees and are in the public domain in the US.
Lau, Joann M; Robinson, David L
2009-01-01
With rapid advances in biotechnology and molecular biology, instructors are challenged to not only provide undergraduate students with hands-on experiences in these disciplines but also to engage them in the "real-world" scientific process. Two common topics covered in biotechnology or molecular biology courses are gene-cloning and bioinformatics, but to provide students with a continuous laboratory-based research experience in these techniques is difficult. To meet these challenges, we have partnered with Bio-Rad Laboratories in the development of the "Cloning and Sequencing Explorer Series," which combines wet-lab experiences (e.g., DNA extraction, polymerase chain reaction, ligation, transformation, and restriction digestion) with bioinformatics analysis (e.g., evaluation of DNA sequence quality, sequence editing, Basic Local Alignment Search Tool searches, contig construction, intron identification, and six-frame translation) to produce a sequence publishable in the National Center for Biotechnology Information GenBank. This 6- to 8-wk project-based exercise focuses on a pivotal gene of glycolysis (glyceraldehyde-3-phosphate dehydrogenase), in which students isolate, sequence, and characterize the gene from a plant species or cultivar not yet published in GenBank. Student achievement was evaluated using pre-, mid-, and final-test assessments, as well as with a survey to assess student perceptions. Student confidence with basic laboratory techniques and knowledge of bioinformatics tools were significantly increased upon completion of this hands-on exercise.
Freeman, B.; Nico, L.G.; Osentoski, M.; Jelks, H.L.; Collins, T.M.
2007-01-01
Piranhas and their relatives have proven to be a challenging group from a systematic perspective, with difficulties in identification of species, linking of juveniles to adults, diagnosis of genera, and recognition of higher-level clades. In this study we add new molecular data consisting of three mitochondrial regions for museum vouchered and photo-documented representatives of the Serrasalmidae. These are combined with existing serrasalmid sequences in GenBank to address species and higher-level questions within the piranhas using parsimony and Bayesian methods. We found robust support for the monophyly of Serrasalmus manueli, but not for Serrasalmus gouldingi when GenBank specimens identified as S. gouldingi were included in the analysis. "Serrasalmus gouldingi" sequences in GenBank may, however, be misidentified. Linking of juveniles to adults of the same species was greatly facilitated by the addition of sequence data. Based on our sampling and identifications, our data robustly reject the monophyly of the genera Serrasalmus and Pristobrycon. We found evidence for a well-supported clade comprised of Serrasalmus, Pygocentrus, and Pristobrycon (in part). This clade was robustly supported in separate and combined analyses of gene regions, and was also supported by a unique molecular character, the loss of a tandem repeat in the control region. Analysis of specimens and a literature review suggest this clade is also characterized by the presence of a pre-anal spine and ectopterygoid teeth. A persistent polytomy at the base of this clade was dated using an independent calibration as 1.8 million years old, corresponding to the beginning of the Pleistocene Epoch, and suggesting an origin for this clade more recent than dates cited in the recent literature. The sister group to this clade is also robustly supported, and consists of Catoprion, Pygopristis, and Pristobrycon striolatus. 
If the term piranha is to refer to a monophyletic clade, it should be restricted to Serrasalmus, Pygocentrus, and Pristobrycon (in part), or expanded to include these taxa plus Pygopristis, Catoprion, and Pristobrycon striolatus. Copyright © 2007 Magnolia Press.
A statistical view of FMRFamide neuropeptide diversity.
Espinoza, E; Carrigan, M; Thomas, S G; Shaw, G; Edison, A S
2000-01-01
FMRFamide-like peptide (FLP) amino acid sequences have been collected and statistically analyzed. FLP amino acid composition as a function of position in the peptide is graphically presented for several major phyla. Results of total amino acid composition and frequencies of pairs of FLP amino acids have been computed and compared with corresponding values from the entire GenBank protein sequence database. The data for pairwise distributions of amino acids should help in future structure-function studies of FLPs. To aid in future peptide discovery, a computer program and search protocol was developed to identify FLPs from the GenBank protein database without the use of keywords.
2012-01-01
Background In the scientific biodiversity community, the need to build a bridge between molecular and traditional biodiversity studies is increasingly recognized. We believe that information technology could play a preeminent role in integrating the information generated by these studies with the large amount of molecular data found in public bioinformatics databases. This work is primarily aimed at building a bioinformatic infrastructure for the integration of public and private biodiversity data through the development of GIDL, an Intelligent Data Loader coupled with the Molecular Biodiversity Database. The system presented here organizes in an ontological way and locally stores the sequence and annotation data contained in the GenBank primary database. Methods The GIDL architecture consists of a relational database and intelligent data loader software. The relational database schema is designed to manage biodiversity information (Molecular Biodiversity Database) and is organized in four areas: MolecularData, Experiment, Collection and Taxonomy. The MolecularData area is inspired by an established standard in Generic Model Organism Databases, the Chado relational schema. The peculiarity of Chado, and also its strength, is the adoption of an ontological schema which makes use of the Sequence Ontology. The Intelligent Data Loader (IDL) component of GIDL is Extract, Transform and Load software able to parse data, to discover hidden information in the GenBank entries and to populate the Molecular Biodiversity Database. The IDL is composed of three main modules: the Parser, able to parse GenBank flat files; the Reasoner, which automatically builds CLIPS facts mapping the biological knowledge expressed by the Sequence Ontology; the DBFiller, which translates the CLIPS facts into ordered SQL statements used to populate the database. 
In GIDL Semantic Web technologies have been adopted due to their advantages in data representation, integration and processing. Results and conclusions Entries coming from Virus (814,122), Plant (1,365,360) and Invertebrate (959,065) divisions of GenBank rel.180 have been loaded in the Molecular Biodiversity Database by GIDL. Our system, combining the Sequence Ontology and the Chado schema, allows a more powerful query expressiveness compared with the most commonly used sequence retrieval systems like Entrez or SRS. PMID:22536971
PoMaMo—a comprehensive database for potato genome data
Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane
2005-01-01
A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes. PMID:15608284
Li, H C; Lu, H B; Yang, F Y; Liu, S J; Bai, C J; Zhang, Y W
2015-03-31
Sucrose phosphate synthase (SPS) is an enzyme used by higher plants for sucrose synthesis. In this study, three primer sets were designed on the basis of known SPS sequences from maize (GenBank: NM_001112224.1) and sugarcane (GenBank: JN584485.1), and five novel SPS genes were identified by RT-PCR from the genomes of Pennisetum spp (the hybrid P. americanum x P. purpureum, P. purpureum Schum., P. purpureum Schum. cv. Red, P. purpureum Schum. cv. Taiwan, and P. purpureum Schum. cv. Mott). The cloned sequences showed 99.9% identity and 80-88% similarity to the SPS sequences of other plants. The SPS gene of hybrid Pennisetum had one nucleotide and four amino acid polymorphisms compared to the other four germplasms, and cluster analysis was performed to assess genetic diversity in this species. Additional characterization of the SPS gene product can potentially allow Pennisetum to be exploited as a biofuel source.
First report on Babesia vogeli infection in dogs in the Philippines.
Ybañez, Adrian P; Ybañez, Rochelle Haidee D; Talle, MaxFrancis G; Liu, Mingming; Moumouni, Paul Franck Adjou; Xuan, Xuenan
2017-02-01
Babesia vogeli is a tick-borne protozoal pathogen that infects erythrocytes. In Southeast Asia, this pathogen has only been reported in Thailand. In this study, nine dogs presented at three different veterinary clinics in Cebu City, Philippines were found positive for B. vogeli. DNA was extracted from blood samples and tested using a PCR for genus Babesia and a PCR specific for B. vogeli (both based on the 18S rRNA gene). Blood smears (triplicate) from each sample were found negative. All positive amplicons were sequenced and were found to be 99.4% identical to registered B. vogeli sequences in GenBank. Phylogenetic analysis revealed monophyletic grouping of Philippine sequences with the registered A. platys GenBank sequences. This is the first report of B. vogeli infection in dogs in the Philippines. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Banerjee, Kakoli; Prasad, R. A.
2014-10-01
Genetic data of all kinds are growing exponentially. The human genome in its base format occupies almost thirty terabytes of data, and its size doubles every two and a half years. It is well known that computational resources are limited. The most important resource required for collecting, storing and retrieving genetic data is storage space, which is limited. Computational performance also depends on storage and execution time, and transmission capability depends directly on the size of the data. Hence data compression techniques become an issue of utmost importance when we are confronted with the task of handling gigantic databases such as GenBank. Decompression is also an issue when such huge databases are being handled. This paper is intended not only to provide genetic data compression but also to partially decompress the genetic sequences.
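The abstract does not describe the compression scheme used, so the following is not the authors' method. It is a minimal sketch of one common idea, fixed-width 2-bit packing of the four bases, chosen because it makes partial decompression simple: any sub-range can be decoded without touching the rest of the data. The names `pack` and `unpack_range` are hypothetical, and the sketch assumes the input contains only A, C, G and T.

```python
# Generic 2-bit nucleotide packing with random-access (partial) decoding.
# Illustrative only; assumes an alphabet of exactly A, C, G, T.

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"

def pack(seq: str) -> bytes:
    """Pack four bases per byte, lowest-order bit pair first."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= CODE[base] << (2 * (i % 4))
    return bytes(out)

def unpack_range(packed: bytes, start: int, end: int) -> str:
    """Decode only bases start..end-1; the caller must know the sequence length."""
    return "".join(
        BASE[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(start, end)
    )

if __name__ == "__main__":
    seq = "ACGTTGCAAC"
    packed = pack(seq)
    print(len(seq), "bases ->", len(packed), "bytes")
    print(unpack_range(packed, 4, 8))  # decodes a 4-base slice only
```

Because every base occupies a fixed position, the cost of reading a slice depends on the slice length rather than on the size of the whole sequence. Real sequence records also contain ambiguity codes such as N, which this sketch does not handle.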
Exploring Evolutionary Patterns in Genetic Sequence: A Computer Exercise
ERIC Educational Resources Information Center
Shumate, Alice M.; Windsor, Aaron J.
2010-01-01
The increase in publications presenting molecular evolutionary analyses and the availability of comparative sequence data through resources such as NCBI's GenBank underscore the necessity of providing undergraduates with hands-on sequence analysis skills in an evolutionary context. This need is particularly acute given that students have been…
Masny, Aleksander; Jagiełło, Agata; Płucienniczak, Grażyna; Golab, Elzbieta
2012-09-01
Ribo HRM, a single-tube PCR and high resolution melting (HRM) assay for detection of polymorphisms in the large subunit ribosomal DNA expansion segment V, was developed on a Trichinella model. Four Trichinella species: T. spiralis (isolates ISS3 and ISS160), T. nativa (isolates ISS10 and ISS70), T. britovi (isolates ISS2 and ISS392) and T. pseudospiralis (isolates ISS13 and ISS1348) were genotyped. Cloned allelic variants of the expansion segment V were used as standards to prepare reference HRM curves characteristic for single sequences and mixtures of several cloned sequences imitating allelic composition detected in Trichinella isolates. Using the primer pair Tsr1 and Trich1bi, it was possible to amplify a fragment of the ESV and detect PCR products obtained from the genomic DNA of pools of larvae belonging to the four investigated species: T. pseudospiralis, T. spiralis, T. britovi and T. nativa, in a single tube Real-Time PCR reaction. Differences in the shape of the HRM curves of Trichinella isolates suggested the presence of differences between examined isolates of T. nativa, T. britovi and T. pseudospiralis species. No differences were observed between T. spiralis isolates. The presence of polymorphisms within the amplified ESV sequence fragment of T. nativa, T. britovi and T. pseudospiralis was confirmed by sequencing of the cloned PCR products. Novel sequences were discovered and deposited in GenBank (GenBank IDs: JN971020-JN971027, JN120902.1, JN120903.1, JN120904.1, JN120906.1, JN120905.1). Screening the ESV region of Trichinella for polymorphism is possible using the genotyping assay Ribo HRM at the current state of its development. The Ribo HRM assay could be useful in phylogenetic studies of the Trichinella genus. Copyright © 2012 Elsevier B.V. All rights reserved.
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
Complete Genome Sequences of Two Vesicular Stomatitis Virus Isolates Collected in Mexico.
Velazquez-Salinas, Lauro; Isa, Pavel; Pauszek, Steven J; Rodriguez, Luis L
2017-09-14
We report two full-genome sequences of vesicular stomatitis New Jersey virus (VSNJV) obtained by Illumina next-generation sequencing of RNA isolated from epithelial suspensions of cattle naturally infected in Mexico. These genomes represent the first full-genome sequences of vesicular stomatitis New Jersey viruses circulating in Mexico deposited in the GenBank database.
Chopra, Ratan; Burow, Gloria; Farmer, Andrew; Mudge, Joann; Simpson, Charles E; Wilkins, Thea A; Baring, Michael R; Puppala, Naveen; Chamberlin, Kelly D; Burow, Mark D
2015-06-01
Single-nucleotide polymorphisms, which can be identified in the thousands or millions from comparisons of transcriptome or genome sequences, are ideally suited for making high-resolution genetic maps, investigating population evolutionary history, and discovering marker-trait linkages. Despite significant results from their use in human genetics, progress in identification and use in plants, and particularly polyploid plants, has lagged. As part of a long-term project to identify and use SNPs suitable for these purposes in cultivated peanut, which is tetraploid, we generated transcriptome sequences of four peanut cultivars, namely OLin, New Mexico Valencia C, Tamrun OL07 and Jupiter, which represent the four major market classes of peanut grown in the world, and which are important economically to the US southwest peanut growing region. CopyDNA libraries of each genotype were used to generate 2 × 54 paired-end reads using an Illumina GAIIx sequencer. Raw reads were mapped to a custom reference consisting of Tifrunner 454 sequences plus peanut ESTs in GenBank, comprising 43,108 contigs; 263,840 SNP and indel variants were identified among four genotypes compared to the reference. A subset of 6 variants was assayed across 24 genotypes representing four market types using KASP chemistry to assess the criteria for SNP selection. Results demonstrated that transcriptome sequencing can identify SNPs usable as selectable DNA-based markers in complex polyploid species such as peanut. Criteria for effective use of SNPs as markers are discussed in this context.
Precise genotyping and recombination detection of Enterovirus
2015-01-01
Enteroviruses (EV) with different genotypes cause diverse infectious diseases in humans and mammals. A correct EV typing result is crucial for effective medical treatment and disease control; however, the emergence of novel viral strains has impaired the performance of available diagnostic tools. Here, we present a web-based tool, named EVIDENCE (EnteroVirus In DEep conception, http://symbiont.iis.sinica.edu.tw/evidence), for EV genotyping and recombination detection. We introduce the idea of using mixed-ranking scores to evaluate the fitness of prototypes based on relatedness and on the genome regions of interest. Using phylogenetic methods, the most possible genotype is determined based on the closest neighbor among the selected references. To detect possible recombination events, EVIDENCE calculates the sequence distance and phylogenetic relationship among sequences of all sliding windows scanning over the whole genome. Detected recombination events are plotted in an interactive figure for viewing of fine details. In addition, all EV sequences available in GenBank were collected and revised using the latest classification and nomenclature of EV in EVIDENCE. These sequences are built into the database and are retrieved in an indexed catalog, or can be searched for by keywords or by sequence similarity. EVIDENCE is the first web-based tool containing pipelines for genotyping and recombination detection, with updated, built-in, and complete reference sequences to improve sensitivity and specificity. The use of EVIDENCE can accelerate genotype identification, aiding clinical diagnosis and enhancing our understanding of EV evolution. PMID:26678286
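The sliding-window scan mentioned in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the EVIDENCE pipeline (which also uses phylogenetic relationships and mixed-ranking scores): it assumes pre-aligned sequences of equal length, uses a plain p-distance, and reports the closest reference in each window. The function names and the toy sequences are hypothetical. A change in the closest reference along the genome is the kind of signal that would prompt a closer look for recombination.

```python
# Simplified sliding-window scan: which reference is closest in each window?
# Assumes all sequences are already aligned to the same length.

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping gaps and N; 1.0 if none are comparable."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-N" and y not in "-N"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)

def closest_reference_by_window(query, references, window, step):
    """Return (start, end, closest reference name, distance) for each window."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        dists = {
            name: p_distance(query[start:end], seq[start:end])
            for name, seq in references.items()
        }
        best = min(dists, key=dists.get)
        results.append((start, end, best, dists[best]))
    return results

if __name__ == "__main__":
    refs = {"typeA": "AAAAAAAAAAAA", "typeB": "CCCCCCCCCCCC"}  # toy references
    query = "AAAAAACCCCCC"                                     # toy mosaic query
    for row in closest_reference_by_window(query, refs, window=6, step=6):
        print(row)
```

In practice, window and step sizes would be hundreds of bases, and a distance model that corrects for multiple substitutions would usually replace the raw p-distance.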
Conte-Grand, Cecilia; Britz, Ralf; Dahanukar, Neelesh; Raghavan, Rajeev; Pethiyagoda, Rohan; Tan, Heok Hui; Hadiaty, Renny K.; Yaakob, Norsham S.
2017-01-01
Snakehead fishes of the family Channidae are predatory freshwater teleosts from Africa and Asia comprising 38 valid species. Snakeheads are important food fishes (aquaculture, live food trade) and have been introduced widely with several species becoming highly invasive. A channid barcode library was recently assembled by Serrao and co-workers to better detect and identify potential and established invasive snakehead species outside their native range. Comparing our own recent phylogenetic results of this taxonomically confusing group with those previously reported revealed several inconsistencies that prompted us to expand and improve on previous studies. By generating 343 novel snakehead coxI sequences and combining them with an additional 434 coxI sequences from GenBank we highlight several problems with previous efforts towards the assembly of a snakehead reference barcode library. We found that 16.3% of the channid coxI sequences deposited in GenBank are based on misidentifications. With the inclusion of our own data we were, however, able to solve these cases of perpetuated taxonomic confusion. Different species delimitation approaches we employed (BIN, GMYC, and PTP) were congruent in suggesting a potentially much higher species diversity within snakeheads than currently recognized. In total, 90 BINs were recovered and within a total of 15 currently recognized species multiple BINs were identified. This higher species diversity is mostly due to either the incorporation of undescribed, narrow range, endemics from the Eastern Himalaya biodiversity hotspot or the incorporation of several widespread species characterized by deep genetic splits between geographically well-defined lineages. In the latter case, over-lumping in the past has deflated the actual species numbers. Further integrative approaches are clearly needed for providing a better taxonomic understanding of snakehead diversity, new species descriptions and taxonomic revisions of the group. 
PMID:28931084
Complete Genome Sequences of Two Vesicular Stomatitis Virus Isolates Collected in Mexico
Isa, Pavel; Pauszek, Steven J.; Rodriguez, Luis L.
2017-01-01
We report two full-genome sequences of vesicular stomatitis New Jersey virus (VSNJV) obtained by Illumina next-generation sequencing of RNA isolated from epithelial suspensions of cattle naturally infected in Mexico. These genomes represent the first full-genome sequences of vesicular stomatitis New Jersey viruses circulating in Mexico deposited in the GenBank database. PMID:28912331
Molecular epidemiology of rabies virus in Poland.
Orłowska, Anna; Żmudziński, Jan Franciszek
2014-08-01
The paper describes a phylogenetic study of 58 Polish isolates of rabies virus collected between 1992 and 2010. Sequences of the nucleoprotein (N) and glycoprotein (G) genes approximately 600 bp long were compared with reference sequences (GenBank) of European rabies viruses from neighbouring countries. The study confirmed a very high level of homology (94.4-100 %) of the Polish rabies virus strains irrespective of the date of isolation. Two variants of rabies virus: NEE (Northeastern Europe variant) and CE (Central Europe variant), depending on the geographical place of isolation, were circulating in Poland from 1992 to 2010. The Polish rabies virus isolates showed high similarity to European RABV strains, especially those collected in Ukraine and Romania. They were clearly different from vaccine strains SAD B19 and SAD Bern, which have been used for oral vaccination of foxes against rabies in Poland since 1993.
A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records
Weissenbacher, Davy; Rivera, Robert; Beard, Rachel; Firago, Mari; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela
2016-01-01
Objective The metadata reflecting the location of the infected host (LOIH) of virus sequences in GenBank often lacks specificity. This work seeks to enhance this metadata by extracting more specific geographic information from related full-text articles and mapping them to their latitude/longitudes using knowledge derived from external geographical databases. Materials and Methods We developed a rule-based information extraction framework for linking GenBank records to the latitude/longitudes of the LOIH. Our system first extracts existing geospatial metadata from GenBank records and attempts to improve it by seeking additional, relevant geographic information from text and tables in related full-text PubMed Central articles. The final extracted locations of the records, based on data assimilated from these sources, are then disambiguated and mapped to their respective geo-coordinates. We evaluated our approach on a manually annotated dataset comprising 5728 GenBank records for the influenza A virus. Results We found the precision, recall, and f-measure of our system for linking GenBank records to the latitude/longitudes of their LOIH to be 0.832, 0.967, and 0.894, respectively. Discussion Our system had a high level of accuracy for linking GenBank records to the geo-coordinates of the LOIH. However, it can be further improved by expanding our database of geospatial data, incorporating spell correction, and enhancing the rules used for extraction. Conclusion Our system performs reasonably well for linking GenBank records for the influenza A virus to the geo-coordinates of their LOIH based on record metadata and information extracted from related full-text articles. PMID:26911818
A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records.
Tahsin, Tasnia; Weissenbacher, Davy; Rivera, Robert; Beard, Rachel; Firago, Mari; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela
2016-09-01
The metadata reflecting the location of the infected host (LOIH) of virus sequences in GenBank often lacks specificity. This work seeks to enhance this metadata by extracting more specific geographic information from related full-text articles and mapping them to their latitude/longitudes using knowledge derived from external geographical databases. We developed a rule-based information extraction framework for linking GenBank records to the latitude/longitudes of the LOIH. Our system first extracts existing geospatial metadata from GenBank records and attempts to improve it by seeking additional, relevant geographic information from text and tables in related full-text PubMed Central articles. The final extracted locations of the records, based on data assimilated from these sources, are then disambiguated and mapped to their respective geo-coordinates. We evaluated our approach on a manually annotated dataset comprising 5728 GenBank records for the influenza A virus. We found the precision, recall, and f-measure of our system for linking GenBank records to the latitude/longitudes of their LOIH to be 0.832, 0.967, and 0.894, respectively. Our system had a high level of accuracy for linking GenBank records to the geo-coordinates of the LOIH. However, it can be further improved by expanding our database of geospatial data, incorporating spell correction, and enhancing the rules used for extraction. Our system performs reasonably well for linking GenBank records for the influenza A virus to the geo-coordinates of their LOIH based on record metadata and information extracted from related full-text articles. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
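The three evaluation figures reported in this abstract can be cross-checked against one another. The sketch below assumes that the reported f-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "f-measure" is F1; this is not stated there.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract above.
reported_precision = 0.832
reported_recall = 0.967

print(round(f_measure(reported_precision, reported_recall), 3))  # 0.894
```

Under that assumption, the reported precision and recall reproduce the reported f-measure of 0.894.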
DNA sequence database as a tool to identify decapod crustaceans on the São Paulo coastline.
Mantelatto, Fernando L; Terossi, Mariana; Negri, Mariana; Buranelli, Raquel C; Robles, Rafael; Magalhães, Tatiana; Tamburus, Ana Francisca; Rossi, Natália; Miyazaki, Mayara J
2017-09-05
DNA barcoding has emerged as an efficient tool for taxonomy and other biodiversity fields. The vast and speciose group of decapod crustaceans is no exception, and comparing short DNA fragments has enabled researchers to overcome some taxonomic impediments and broaden knowledge of the diversity of this group. Brazil is considered an important area in terms of global marine biodiversity, and some regions stand out in terms of decapod fauna, such as the São Paulo coastline. Thus, the aim of this study is to obtain sequences of the mitochondrial markers (COI and 16S) for decapod crustaceans distributed along the São Paulo coastline and to test the accuracy of these markers for species identification in this region by comparing our sequences to those already present in the GenBank database. We sampled along almost the entire 300 km of the São Paulo coastline, from estuaries to offshore islands, during a 5-year multidisciplinary research project. All the species were processed to obtain the DNA sequences. The diversity of the decapod fauna on the São Paulo coastline comprises at least 404 species. We were able to collect 256 of those species and sequence at least one of the target genes from 221. By testing the accuracy of these two DNA markers as a tool for identification, we were able to check our own identifications, including new records in GenBank, spot potential mistakes in GenBank, and detect potential new species.
Molecular characterization of Echinococcus granulosus isolated from sheep in Palestine.
Adwan, Ghaleb; Adwan, Kamel; Bdir, Sami; Abuseir, Sameh
2013-06-01
A total of twenty-three Echinococcus granulosus hydatid cysts were collected from infected sheep slaughtered in Nablus abattoir, Nablus - Palestine. Protoscoleces or germinal membranes were used for DNA extraction followed by PCR amplification. Amplified products were analyzed for the presence of a 444-bp fragment of the mitochondrial cytochrome c oxidase subunit 1 (cox1) gene, followed by nucleotide sequencing. Overall, 21 hydatid cysts were positive compared to a negative control. The partial sequences of the cox1 gene of E. granulosus strains indicated that the sheep in Palestine were infected with genotype 1 (G1), genotype 2 (G2) and genotype 3 (G3). The prevalence of these genotypes was (14/21) 66.7%, (4/21) 19.0% and (3/21) 14.3% for G1, G2 and G3, respectively. Our results showed that twelve strains of G1 belonged to the common haplotype EG01, which is the major haplotype in all the geographic populations. Phylogenetic analysis also showed that two sequences of the G1 genotype, with GenBank accession Nos. KC109657 and KC109659, corresponded to G1.4 micro-variants. Only the sequence with GenBank accession No. KC109652, identified in our study as G2, was found to have complete identity to the original sequence described for the cox1 gene (GenBank accession No. M84662). It is concluded that the G1 genotype is the predominant genotype in sheep in Palestine. Therefore, these findings should be taken into consideration in developing prevention strategies and control programs for hydatidosis in Palestine. Copyright © 2013 Elsevier Inc. All rights reserved.
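The genotype prevalences in this abstract are simple proportions of the 21 PCR-positive cysts. A minimal sketch of that arithmetic follows; the dictionary and function names are illustrative and not taken from the study.

```python
def prevalence_percent(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the total, as a percentage to one decimal."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no positive samples to compute prevalence from")
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Genotype counts among the 21 positive cysts, as reported in the abstract.
genotype_counts = {"G1": 14, "G2": 4, "G3": 3}

print(prevalence_percent(genotype_counts))
# {'G1': 66.7, 'G2': 19.0, 'G3': 14.3}
```

The three counts sum to the 21 positive cysts, and the computed percentages match those reported.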
Genetic recombination of tick-borne flaviviruses among wild-type strains.
Norberg, Peter; Roth, Anette; Bergström, Tomas
2013-06-05
Genetic recombination has been suggested to occur in mosquito-borne flaviviruses. In contrast, tick-borne flaviviruses have been thought to evolve in a clonal manner, although recent studies suggest that recombination also occurs in these viruses. We re-analyzed the data and found that previous conclusions on wild-type recombination were probably falsely drawn due to misalignments of nucleotide sequences, ambiguities in GenBank sequences, or different laboratory culture histories suggestive of recombination events in the laboratory. To evaluate if reliable predictions of wild-type recombination of tick-borne flaviviruses can be made, we analyzed viral strains sequenced exclusively for this study, and other flavivirus sequences retrieved from GenBank. We detected genetic signals supporting recombination between viruses within the three clades of TBEV-Eu, TBEV-Sib and TBEV-Fe, respectively. Our results suggest that the tick-borne encephalitis viruses may undergo recombination under natural conditions, but that geographic barriers restrict most recombination events to involve only closely genetically related viruses. Copyright © 2013 Elsevier Inc. All rights reserved.
Yuan, Ying; Dego, Oudessa Kerro; Chen, Xueyan; Abadin, Eurife; Chan, Shangfeng; Jory, Lauren; Kovacevic, Steven; Almeida, Raul A; Oliver, Stephen P
2014-12-01
The objective was to identify and sequence the sua gene (GenBank no. DQ232760; http://www.ncbi.nlm.nih.gov/genbank/) and detect Streptococcus uberis adhesion molecule (SUAM) expression by Western blot using serum from naturally S. uberis-infected cows in strains of S. uberis isolated in milk from cows with mastitis from geographically diverse areas of the world. All strains evaluated yielded a 4.4-kb sua-containing PCR fragment that was subsequently sequenced. Deduced SUAM AA sequences from those S. uberis strains evaluated shared >97% identity. The pepSUAM sequence located at the N terminus of SUAM was >99% identical among strains of S. uberis. Streptococcus uberis adhesion molecule expression was detected in all strains of S. uberis tested. These results suggest that sua is ubiquitous among strains of S. uberis isolated from diverse geographic locations and that SUAM is immunogenic. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Arbefeville, S; Harris, A; Ferrieri, P
2017-09-01
Fungal infections cause considerable morbidity and mortality in immunocompromised patients. Rapid and accurate identification of fungi is essential to guide accurately targeted antifungal therapy. With the advent of molecular methods, clinical laboratories can use new technologies to supplement traditional phenotypic identification of fungi. The aims of the study were to evaluate the sole commercially available MicroSEQ® D2 LSU rDNA Fungal Identification Kit compared to the in-house developed internal transcribed spacer (ITS) regions assay in identifying moulds, using two well-known online public databases to analyze sequenced data. 85 common and uncommon clinically relevant fungi isolated from clinical specimens were sequenced for the D2 region of the large subunit (LSU) of ribosomal RNA (rRNA) gene with the MicroSEQ® Kit and the ITS regions with the in house developed assay. The generated sequenced data were analyzed with the online GenBank and MycoBank public databases. The D2 region of the LSU rRNA gene identified 89.4% or 92.9% of the 85 isolates to the genus level and the full ITS region (f-ITS) 96.5% or 100%, using GenBank or MycoBank, respectively, when compared to the consensus ID. When comparing species-level designations to the consensus ID, D2 region of the LSU rRNA gene aligned with 44.7% (38/85) or 52.9% (45/85) of these isolates in GenBank or MycoBank, respectively. By comparison, f-ITS possessed greater specificity, followed by ITS1, then ITS2 regions using GenBank or MycoBank. Using GenBank or MycoBank, D2 region of the LSU rRNA gene outperformed phenotypic based ID at the genus level. Comparing rates of ID between D2 region of the LSU rRNA gene and the ITS regions in GenBank or MycoBank at the species level against the consensus ID, f-ITS and ITS2 exceeded performance of the D2 region of the LSU rRNA gene, but ITS1 had similar performance to the D2 region of the LSU rRNA gene using MycoBank. 
Our results indicated that the MicroSEQ® D2 LSU rDNA Fungal Identification Kit was equivalent to the in-house developed ITS regions assay to identify fungi at the genus level. The MycoBank database gave a better curated database and thus allowed a better genus and species identification for both D2 region of the LSU rRNA gene and ITS regions. Copyright © 2017 Elsevier B.V. All rights reserved.
Moustafa, Mohamed Abdallah Mohamed; Shimozuru, Michito; Mohamed, Wessam; Taylor, Kyle Rueben; Nakao, Ryo; Sashika, Mariko; Tsubota, Toshio
2017-08-01
Sarcocystis and Hepatozoon species are protozoan parasites that are frequently detected in domestic and wild animals. Rodents are considered common intermediate and paratenic hosts for several Sarcocystis and Hepatozoon species. Here, blood DNA samples from a total of six rodents, including one Myodes rutilus, one Myodes rufocanus, and four Apodemus speciosus, collected from Hokkaido, Japan, were shown by conventional PCR of the 18S ribosomal RNA (rRNA) gene to contain Sarcocystis and Hepatozoon DNA. Sequencing of the DNA detected one Sarcocystis sp. in the M. rufocanus sample and two different Hepatozoon spp. in the M. rutilus and A. speciosus samples. Phylogenetic analysis showed that the detected Sarcocystis sp. sequence grouped with GenBank Sarcocystis sequences from rodents, snakes, and raccoons from Japan and China. The 18S rRNA partial gene sequences of both detected Hepatozoon spp. clustered with GenBank Hepatozoon sequences from snakes, geckos and voles in Europe, Africa, and Asia. This study provides evidence that wild rodents have a role in the maintenance of Sarcocystis and Hepatozoon species on the island of Hokkaido.
Pohuang, Tawatchai; Chansiripornchai, Niwat; Tawatsin, Achara; Sasipreeyajan, Jiroj
2009-09-01
Thirteen field isolates of infectious bronchitis virus (IBV) were isolated from broiler flocks in Thailand between January and June 2008. An 878-bp fragment of the S1 gene covering a hypervariable region was amplified and sequenced. Phylogenetic analysis based on that region revealed that these viruses were separated into two groups (I and II). IBV isolates in group I were not related to other IBV strains published in the GenBank database. Group I isolates shared nucleotide sequence identities of less than 85% and amino acid sequence identities of less than 84% with IBVs published in the GenBank database. This group likely represents strains indigenous to Thailand. The isolates in group II showed a close relationship with Chinese IBVs. They shared nucleotide sequence identities of 97-98% and amino acid sequence identities of 96-98% with Chinese IBVs (strains A2, SH and QXIBV). This finding indicated that the recent Thai IBVs evolved separately and that at least two groups of viruses are circulating in Thailand.
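Sequence identity figures such as those quoted above are typically computed from a pairwise alignment. The sketch below shows one common convention, counting matches over aligned columns where neither sequence has a gap. This is an assumption for illustration, not the method used by the authors, and the example sequences are invented toy data rather than real S1 gene fragments.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped (one of several
    conventions in use; other tools count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments differing at one of ten positions.
print(percent_identity("ACGTACGTAC", "ACGTACGAAC"))  # 90.0
```

Because gap handling changes the denominator, identity values from different tools are only comparable when the same convention is used.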
NASA Astrophysics Data System (ADS)
Zhan, Wenbin; Li, Yongqin; Sheng, Xiuzhen; Xing, Jing; Tang, Xiaoqian
2010-11-01
We isolated a strain of lymphocystis disease virus (LCDV) from Japanese flounder (Paralichthys olivaceus) cultured in northern China. Based on published sequences of the major capsid protein (MCP) gene of LCDV-cn (GenBank: AF126405), we designed two primer sets, P1/P2 and P3/P4. We then used one-step or nested PCR and in-situ hybridization (ISH) to detect LCDV and identify the target tissues or cells in infected Japanese flounder. The PCR products were positive in purified viral supernatant, skin nodules, gut, gill, kidney, spleen, stomach, heart, and liver of Japanese flounder. We compared the DNA sequence with 14 MCP nucleotide sequences from GenBank, including Megalocytivirus (OFIV and RSIV), Iridovirus (CzIV and WIV), Ranavirus (TFV and FV3), and Lymphocystivirus (8 LCDV). Based on the alignment, we confirmed the PCR product was from Lymphocystivirus (GenBank accession number DQ279090 (LCDV-HD)). Using ISH, we noted the presence of LCDV in the skin nodules, gut, gill, spleen, stomach, and heart of spontaneously infected Japanese flounders. We successfully amplified LCDV fragments from Schlegel's black rockfish (Sebastes schlegeli Hilgendorf), redwing sea robin (Lepidotrigla microptera Günther) and turbot (Scophthalmus maximus) using the one-step and nested PCR, suggesting the target genes can be widely detected in fish using this method.
Reevaluating the serotype II capsular locus of Streptococcus agalactiae.
Martins, E R; Melo-Cristino, J; Ramirez, M
2007-10-01
We report a novel sequence of the serotype II capsular locus of group B streptococcus that resolves inconsistencies among the results of various groups and the sequence in GenBank. This locus was found in diverse lineages and presents genes consistent with the complete synthesis of the type II polysaccharide.
Phytophthora siskiyouensis, a new species from soil and water in southwest Oregon
Paul Reeser; Everett Hansen; Wendy Sutton
2008-01-01
An unknown Phytophthora species was recovered from rhododendron and tanoak leaf baits used for monitoring streams and soils in Southwestern Oregon for the presence of Phytophthora ramorum. Isolates of this species yielded ITS-DNA sequences that differed substantially from other Phytophthora sequences in GenBank....
Three Decades of Recombinant DNA.
ERIC Educational Resources Information Center
Palmer, Jackie
1985-01-01
Discusses highlights in the development of genetic engineering, examining techniques with recombinant DNA, legal and ethical issues, GenBank (a national database of nucleic acid sequences), and other topics. (JN)
Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin
2011-01-01
The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Bioactive endophytes warrant intensified exploration and conservation.
Smith, Stephen A; Tank, David C; Boulanger, Lori-Ann; Bascom-Slack, Carol A; Eisenman, Kaury; Kingery, David; Babbs, Beatrice; Fenn, Kathleen; Greene, Joshua S; Hann, Bradley D; Keehner, Jocelyn; Kelley-Swift, Elizabeth G; Kembaiyan, Vivek; Lee, Sun Jin; Li, Puyao; Light, David Y; Lin, Emily H; Ma, Cong; Moore, Emily; Schorn, Michelle A; Vekhter, Daniel; Nunez, Percy V; Strobel, Gary A; Donoghue, Michael J; Strobel, Scott A
2008-08-25
A key argument in favor of conserving biodiversity is that as yet undiscovered biodiversity will yield products of great use to humans. However, the link between undiscovered biodiversity and useful products is largely conjectural. Here we provide direct evidence from bioassays of endophytes isolated from tropical plants and bioinformatic analyses that novel biology will indeed yield novel chemistry of potential value. We isolated and cultured 135 endophytic fungi and bacteria from plants collected in Peru. nrDNAs were compared to samples deposited in GenBank to ascertain the genetic novelty of cultured specimens. Ten endophytes were found to be as much as 15-30% different than any sequence in GenBank. Phylogenetic trees, using the most similar sequences in GenBank, were constructed for each endophyte to measure phylogenetic distance. Assays were also conducted on each cultured endophyte to record bioactivity, of which 65 were found to be bioactive. The novelty of our contribution is that we have combined bioinformatic analyses that document the diversity found in environmental samples with culturing and bioassays. These results highlight the hidden hyperdiversity of endophytic fungi and the urgent need to explore and conserve hidden microbial diversity. This study also showcases how undergraduate students can obtain data of great scientific significance.
NASA Astrophysics Data System (ADS)
von Beeren, Christoph; Stoeckle, Mark Y.; Xia, Joyce; Burke, Griffin; Kronauer, Daniel J. C.
2015-02-01
DNA barcoding promises to be a useful tool to identify pest species assuming adequate representation of genetic variants in a reference library. Here we examined mitochondrial DNA barcodes in a global urban pest, the American cockroach (Periplaneta americana). Our sampling effort generated 284 cockroach specimens, most from New York City, plus 15 additional U.S. states and six other countries, enabling the first large-scale survey of P. americana barcode variation. Periplaneta americana barcode sequences (n = 247, including 24 GenBank records) formed a monophyletic lineage separate from other Periplaneta species. We found three distinct P. americana haplogroups with relatively small differences within (<=0.6%) and larger differences among groups (2.4%-4.7%). This could be interpreted as indicative of multiple cryptic species. However, nuclear DNA sequences (n = 77 specimens) revealed extensive gene flow among mitochondrial haplogroups, confirming a single species. This unusual genetic pattern likely reflects multiple introductions from genetically divergent source populations, followed by interbreeding in the invasive range. Our findings highlight the need for comprehensive reference databases in DNA barcoding studies, especially when dealing with invasive populations that might be derived from multiple genetically distinct source populations.
Vuta, Vlad; Picard-Meyer, Evelyne; Robardet, Emmanuelle; Barboi, Gheorghe; Motiu, Razvan; Barbuceanu, Florica; Vlagioiu, Constantin; Cliquet, Florence
2016-09-22
Rabies is a fatal neuropathogenic zoonosis caused by the rabies virus of the Lyssavirus genus, Rhabdoviridae family. The oral vaccination of foxes - the main reservoir of rabies in Europe - using a live attenuated rabies virus vaccine was successfully conducted in many Western European countries. In July 2015, a rabies vaccine strain was isolated from the brain tissues of a clinically suspect cow (Bos taurus) in Romania. The nucleotide analysis of both N and G gene sequences showed 100% identity between the rabid animal, the GenBank reference SAD B19 strain and five rabies vaccine batches used for the national oral vaccination campaign targeting foxes. Copyright © 2016 Elsevier Ltd. All rights reserved.
Coding Complete Genome for the Mogiana Tick Virus, a Jingmenvirus Isolated from Ticks in Brazil
2017-05-04
sequences for all four genome segments. We downloaded the raw Illumina sequence reads from the NCBI Short Read Archive (GenBank...MGTV genome segments through sequence similarity (BLASTN) to the published genome of Jingmen tick virus (JMTV) isolate SY84 (GenBank: KJ001579-KJ001582...2014. Standards for sequencing viral genomes in the era of high-throughput sequencing . MBio 5:e01360–14. 8. Bankevich A, Nurk S, Antipov
Valdez-Moreno, Martha; Quintal-Lizama, Carolina; Gómez-Lozano, Ricardo; García-Rivas, María Del Carmen
2012-01-01
In the Mexican Caribbean, the exotic lionfish Pterois volitans has become a species of great concern because of its predatory habits and rapid expansion onto the Mesoamerican coral reef, the second largest continuous reef system in the world. This is the first report of DNA identification of stomach contents of lionfish using the barcode of life reference database (BOLD). We confirm with barcoding that only Pterois volitans is apparently present in the Mexican Caribbean. We analyzed the stomach contents of 157 specimens of P. volitans from various locations in the region. Based on DNA matches in the Barcode of Life Database (BOLD) and GenBank, we identified fishes from five orders, 14 families, 22 genera and 34 species in the stomach contents. The families with the most species represented were Gobiidae and Apogonidae. Some prey taxa are commercially important species. Seven species were new records for the Mexican Caribbean: Apogon mosavi, Coryphopterus venezuelae, C. thrix, C. tortugae, Lythrypnus minimus, Starksia langi and S. ocellata. DNA matches, as well as the presence of intact lionfish in the stomach contents, indicate some degree of cannibalism, a behavior confirmed in this species for the first time. We obtained 45 distinct crustacean prey sequences, from which only 20 taxa could be identified from the BOLD and GenBank databases. The matches were primarily to Decapoda but only a single taxon could be identified to the species level, Euphausia americana. This technique proved to be an efficient and useful method, especially since prey species could be identified from partially-digested remains. The primary limitation is the lack of comprehensive coverage of potential prey species in the region in the BOLD and GenBank databases, especially among invertebrates.
Valdez-Moreno, Martha; Quintal-Lizama, Carolina; Gómez-Lozano, Ricardo; García-Rivas, María del Carmen
2012-01-01
Background In the Mexican Caribbean, the exotic lionfish Pterois volitans has become a species of great concern because of its predatory habits and rapid expansion onto the Mesoamerican coral reef, the second largest continuous reef system in the world. This is the first report of DNA identification of stomach contents of lionfish using the barcode of life reference database (BOLD). Methodology/Principal Findings We confirm with barcoding that only Pterois volitans is apparently present in the Mexican Caribbean. We analyzed the stomach contents of 157 specimens of P. volitans from various locations in the region. Based on DNA matches in the Barcode of Life Database (BOLD) and GenBank, we identified fishes from five orders, 14 families, 22 genera and 34 species in the stomach contents. The families with the most species represented were Gobiidae and Apogonidae. Some prey taxa are commercially important species. Seven species were new records for the Mexican Caribbean: Apogon mosavi, Coryphopterus venezuelae, C. thrix, C. tortugae, Lythrypnus minimus, Starksia langi and S. ocellata. DNA matches, as well as the presence of intact lionfish in the stomach contents, indicate some degree of cannibalism, a behavior confirmed in this species for the first time. We obtained 45 distinct crustacean prey sequences, from which only 20 taxa could be identified from the BOLD and GenBank databases. The matches were primarily to Decapoda but only a single taxon could be identified to the species level, Euphausia americana. Conclusions/Significance This technique proved to be an efficient and useful method, especially since prey species could be identified from partially-digested remains. The primary limitation is the lack of comprehensive coverage of potential prey species in the region in the BOLD and GenBank databases, especially among invertebrates. PMID:22675470
Li, Jing; Yu, Yong-Xin; Dong, Guan-Mu
2009-04-01
To compare the molecular characteristics of the Chinese attenuated yellow fever 17D vaccine strain and the WHO reference yellow fever 17D vaccine strain. The primers were designed according to the published nucleotide sequences of YFV 17D strains in GenBank. Total RNA was extracted with Trizol and reverse transcribed. Each fragment of the YFV genome was amplified by PCR and subsequently sequenced. The fragments of the 5' and 3' ends of the two strains were cloned into the pGEM T-easy vector and then sequenced. The nucleotide and amino acid sequences of the two strains shared 99% homology. No obvious nucleotide changes were found in the entire genome sequences of the two 17D strains. Moreover, there were no obvious changes in the E protein genes. However, E173 of YF17D Tiantan, which is associated with virulence, had mutations. The two live attenuated yellow fever 17D vaccine strains fell into the same lineage in the phylogenetic analysis. The results indicated that the two attenuated yellow fever 17D vaccine viruses accumulate mutations at a very low frequency and that their genomes are relatively stable.
Borges, Juliana N; Cunha, Luiz F G; Miranda, Daniele F; Monteiro-Neto, Cassiano; Santos, Cláudia P
2015-12-01
Pseudoterranova larvae parasitizing cutlassfish Trichiurus lepturus and bluefish Pomatomus saltatrix from the Southwest Atlantic coast of Brazil were studied by morphological, ultrastructural and molecular approaches. The genetic analyses targeted the ITS2 intergenic region specific for Pseudoterranova decipiens, the partial 28S (LSU) of ribosomal DNA and the mtDNA cox-1 region. We obtained results for the 28S region and mtDNA cox-1, which were amplified using the polymerase chain reaction and sequenced to evaluate the phylogenetic relationships between the sequences of this study and sequences from GenBank. The morphological profile indicated that all nine specimens collected from both fish were L3 larvae of Pseudoterranova sp. The genetic profile confirmed the generic level, but due to the absence of similar sequences for adult parasites in GenBank for the regions amplified, it was not possible to identify them to the species level. The sequences obtained presented 89% similarity with Pseudoterranova decipiens (28S sequences) and Contracaecum osculatum B (mtDNA cox-1). The low similarity, together with the fact that amplification with the primer specific for P. decipiens did not occur, leads us to conclude that our sequences do not belong to the P. decipiens complex.
Ying, Jianchao; Wang, Huifeng; Bao, Bokan; Zhang, Ying; Zhang, Jinfang; Zhang, Cheng; Li, Aifang; Lu, Junwan; Li, Peizhen; Ying, Jun; Liu, Qi; Xu, Teng; Yi, Huiguang; Li, Jinsong; Zhou, Li; Zhou, Tieli; Xu, Zuyuan; Ni, Liyan; Bao, Qiyu
2015-01-01
The homocysteine methyltransferase encoded by mmuM is widely distributed among microbial organisms. It is the key enzyme that catalyzes the last step in methionine biosynthesis and plays an important role in the metabolism process. It also enables the microbial organisms to tolerate high concentrations of selenium in the environment. In this research, 533 mmuM gene sequences covering 70 genera of bacteria were selected from the GenBank database. The distribution frequency of mmuM differs among the investigated genera of bacteria. The mapping results of 160 mmuM reference sequences showed that the mmuM genes were found in 7 species of pathogen genomes sequenced in this work. The polymerase chain reaction products of one mmuM genotype (NC_013951 as the reference) were sequenced and the sequencing results confirmed the mapping results. Furthermore, 144 representative sequences were chosen for phylogenetic analysis, and some mmuM genes from totally different genera (such as the genes between Escherichia and Klebsiella and between Enterobacter and Kosakonia) shared a closer phylogenetic relationship than those from the same genus. Comparative genomic analysis of the mmuM encoding regions on plasmids and bacterial chromosomes showed that the pKF3-140 and pIP1206 plasmids shared a 21 kb homology region and that a 4.9 kb fragment in this region in fact originated from the Escherichia coli chromosome. These results further suggested that the mmuM gene has undergone horizontal gene transfer among different species or genera of bacteria. High-throughput sequencing combined with comparative genomics analysis can be used to explore the distribution and dissemination of the mmuM gene among bacteria and its evolution at a molecular level.
USDA-ARS's Scientific Manuscript database
This study used 1321 base pair 16S rRNA gene sequence methods to confirm the phylogenetic position of a soil isolate as a bacterium belonging to the genus Pseudomonas sp. Morphological, biochemical characteristics, and fatty acid profiles are consistent with the 16S rRNA gene sequence identification...
Simon, J W; Slabas, A R
1998-09-18
The GenBank database was searched using the E. coli malonyl CoA:ACP transacylase (MCAT) sequence for plant protein/cDNA sequences corresponding to MCAT, a component of plant fatty acid synthetase (FAS) for which the plant cDNA had not been isolated. A 272-bp Zea mays EST sequence (GenBank accession number: AA030706) was identified which has strong homology to the E. coli MCAT. A PCR-derived cDNA probe from Zea mays was used to screen a Brassica napus (rape) cDNA library. This resulted in the isolation of a 1200-bp cDNA clone which encodes an open reading frame corresponding to a protein of 351 amino acids. The protein shows 47% homology to the E. coli MCAT amino acid sequence in the coding region for the mature protein. Expression of a plasmid (pMCATrap2) containing the plant cDNA sequence in Fab D89, an E. coli mutant in MCAT activity, restores growth, demonstrating functional complementation and direct function of the cloned cDNA. This is the first functional evidence supporting the identification of a plant cDNA for MCAT.
Transterm—extended search facilities and improved integration with other databases
Jacobs, Grant H.; Stockwell, Peter A.; Tate, Warren P.; Brown, Chris M.
2006-01-01
Transterm has now been publicly available for >10 years. Major changes have been made since its last description in this database issue in 2002. The current database provides data for key regions of mRNA sequences, a curated database of mRNA motifs and tools to allow users to investigate their own motifs or mRNA sequences. The key mRNA regions database is derived computationally from Genbank. It contains 3′ and 5′ flanking regions, the initiation and termination signal context and coding sequence for annotated CDS features from Genbank and RefSeq. The database is non-redundant, enabling summary files and statistics to be prepared for each species. Advances include providing extended search facilities, the database may now be searched by BLAST in addition to regular expressions (patterns) allowing users to search for motifs such as known miRNA sequences, and the inclusion of RefSeq data. The database contains >40 motifs or structural patterns important for translational control. In this release, patterns from UTRsite and Rfam are also incorporated with cross-referencing. Users may search their sequence data with Transterm or user-defined patterns. The system is accessible at . PMID:16381889
Choe, Se-Eun; Nguyen, Thuy Thi-Dieu; Kang, Tae-Gyu; Kweon, Chang-Hee; Kang, Seung-Won
2011-09-01
Nuclear ribosomal DNA sequence of the second internal transcribed spacer (ITS-2) has been used efficiently to identify the liver fluke species collected from different hosts and various geographic regions. ITS-2 sequences of 19 Fasciola samples collected from Korean native cattle were determined and compared. Sequence comparison including ITS-2 sequences of isolates from this study and reference sequences from Fasciola hepatica and Fasciola gigantica and intermediate Fasciola in Genbank revealed seven identical variable sites of investigated isolates. Among 19 samples, 12 individuals had ITS-2 sequences completely identical to that of pure F. hepatica, five possessed the sequences identical to F. gigantica type, whereas two shared the sequence of both F. hepatica and F. gigantica. No variations in length and nucleotide composition of ITS-2 sequence were observed within isolates that belonged to F. hepatica or F. gigantica. At the position of 218, five Fasciola containing a single-base substitution (C>T) formed a distinct branch inside the F. gigantica-type group which was similar to those of Asian-origin isolates. The phylogenetic tree of the Fasciola spp. based on complete ITS-2 sequences from this study and other representative isolates in different locations clearly showed that pure F. hepatica, F. gigantica type and intermediate Fasciola were observed. The result also provided additional genetic evidence for the existence of three forms of Fasciola isolated from native cattle in Korea by genetic approach using ITS-2 sequence.
[Study of three ciguatera fish poisoning cases in Xiamen city, in 2005].
Luo, He-dong; Bai, Yan-yan; Zhou, Na
2011-06-01
The aim was to determine the cause of three ciguatera fish poisoning cases in Xiamen in 2005 and to identify the fish species. The grouper implicated in the food poisoning and seven other coral reef fishes collected from the market were tested by mouse bioassay and a ciguatoxin test kit. mtDNA was extracted from the toxic grouper meat, a Cyt b gene segment was amplified, and the PCR products were sequenced. The sequences were compared with those in GenBank. The ciguatoxin test kit gave a positive result, and the toxicity of the grouper implicated in the food poisoning was 0.11 mouse unit (MU)/g by mouse bioassay. A 475 bp segment of the Cyt b gene was amplified by PCR, and the sequence was 99% homologous with Epinephelus fuscoguttatus (GenBank: AY950695). No ciguatoxin was detected in the six grouper species collected from the market. All three food poisoning cases were caused by consumption of ciguatoxin-carrying groupers.
Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung
2017-06-26
Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from the GenBank to screen and correct the inconsistencies. It also supports users in detection of new novel mutation profiles and assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com . It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detection of inconsistencies in GenBank and helps identifying new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and to reconcile existing annotation of HV1 582 bp sequences.
Object-oriented parsing of biological databases with Python.
Ramu, C; Gemünd, C; Gibson, T J
2000-07-01
While database activities in the biological area are increasing rapidly, rather little is done in the area of parsing them in a simple and object-oriented way. We present here an elegant, simple yet powerful way of parsing biological flat-file databases. We have taken EMBL, SWISSPROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in the format structure. GENBANK has a very different format structure than EMBL and SWISS-PROT. Extracting the desired fields in an entry (for example a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community: this is illustrated with tools to make new splice-site databases. The interface to the parser is abstract in the sense that the access to all the databases is independent from their different formats, since parsing instructions are hidden.
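The kind of field extraction described above can be sketched in Python. This is a minimal illustration only, not the authors' parser: the class name GenBankEntry, the function parse_genbank and the sample record are hypothetical, and only four fields of the GenBank flat-file layout (LOCUS, DEFINITION, ACCESSION, ORIGIN) are handled.

```python
from dataclasses import dataclass


@dataclass
class GenBankEntry:
    """Hypothetical container for a few fields of one GenBank record."""
    locus: str = ""
    definition: str = ""
    accession: str = ""
    sequence: str = ""


def parse_genbank(text):
    """Yield one GenBankEntry per record found in GenBank flat-file text."""
    entry, in_origin, last_key = GenBankEntry(), False, None
    for line in text.splitlines():
        if line.startswith("//"):            # record terminator
            yield entry
            entry, in_origin, last_key = GenBankEntry(), False, None
        elif in_origin:
            # Sequence lines: a position number followed by blocks of bases.
            entry.sequence += "".join(line.split()[1:]).upper()
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("LOCUS"):
            entry.locus = line.split()[1]
            last_key = None
        elif line.startswith("DEFINITION"):
            entry.definition = line[12:].strip()
            last_key = "definition"
        elif line.startswith("ACCESSION"):
            entry.accession = line.split()[1]
            last_key = None
        elif line.startswith(" ") and last_key == "definition":
            entry.definition += " " + line.strip()   # continuation line
        else:
            last_key = None                  # any field this sketch ignores


# Made-up record for demonstration; not a real GenBank entry.
SAMPLE = """\
LOCUS       DEMO0001                  12 bp    DNA     linear   UNA 01-JAN-2000
DEFINITION  Made-up demonstration record,
            not a real entry.
ACCESSION   DEMO0001
ORIGIN
        1 atggcatgca ta
//
"""

entries = list(parse_genbank(SAMPLE))
print(entries[0].accession, entries[0].sequence)
```

Callers work with entry attributes rather than with line layouts, which is the sense in which the format details stay hidden; supporting EMBL or SWISS-PROT would mean adding another parser that yields the same kind of object.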
GenSeq: An updated nomenclature and ranking for genetic sequences from type and non-type sources
Chakrabarty, Prosanta; Warren, Melanie; Page, Lawrence M.; Baldwin, Carole C.
2013-01-01
Abstract An improved and expanded nomenclature for genetic sequences is introduced that corresponds with a ranking of the reliability of the taxonomic identification of the source specimens. This nomenclature is an advancement of the “Genetypes” naming system, which some have been reluctant to adopt because of the use of the “type” suffix in the terminology. In the new nomenclature, genetic sequences are labeled “genseq,” followed by a reliability ranking (e.g., 1 if the sequence is from a primary type), followed by the name of the genes from which the sequences were derived (e.g., genseq-1 16S, COI). The numbered suffix provides an indication of the likely reliability of taxonomic identification of the voucher. Included in this ranking system, in descending order of taxonomic reliability, are the following: sequences from primary types – “genseq-1,” secondary types – “genseq-2,” collection-vouchered topotypes – “genseq-3,” collection-vouchered non-types – “genseq-4,” and non-types that lack specimen vouchers but have photo vouchers – “genseq-5.” To demonstrate use of the new nomenclature, we review recently published new-species descriptions in the ichthyological literature that include DNA data and apply the GenSeq nomenclature to sequences referenced in those publications. We encourage authors to adopt the GenSeq nomenclature (note capital “G” and “S” when referring to the nomenclatural program) to provide a searchable tag (e.g., “genseq”; note lowercase “g” and “s” when referring to sequences) for genetic sequences from types and other vouchered specimens. Use of the new nomenclature and ranking system will improve integration of molecular phylogenetics and biological taxonomy and enhance the ability of researchers to assess the reliability of sequence data. We further encourage authors to update sequence information on databases such as GenBank whenever nomenclatural changes are made. PMID:24223486
The practical evaluation of DNA barcode efficacy.
Spouge, John L; Mariño-Ramírez, Leonardo
2012-01-01
This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
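The averaging step stated above (overall PCI as the mean of the species PCIs) can be sketched in Python together with nearest-neighbor assignment. The chapter's exact procedure is not given in this abstract, so several choices here are assumptions: sequences are taken to be globally aligned to equal length, distance is a plain mismatch count, each specimen is held out in turn, and a tie for nearest neighbor between different species counts as a failure. The names hamming and overall_pci and the toy data are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def overall_pci(specimens):
    """specimens: list of (species, aligned_sequence) pairs, at least two.

    Returns (overall PCI, per-species PCI). A specimen counts as correctly
    identified only if every nearest neighbour belongs to its own species
    (assumed tie rule).
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for i, (species, seq) in enumerate(specimens):
        others = [(hamming(seq, s), sp)
                  for j, (sp, s) in enumerate(specimens) if j != i]
        best = min(d for d, _ in others)
        nearest_species = {sp for d, sp in others if d == best}
        totals[species] += 1
        if nearest_species == {species}:
            hits[species] += 1
    per_species = {sp: hits[sp] / totals[sp] for sp in totals}
    # Overall PCI: unweighted average of the species PCIs.
    return sum(per_species.values()) / len(per_species), per_species


# Toy data, invented for illustration.
toy = [("A", "AAAA"), ("A", "AAAT"), ("B", "GGGG"), ("B", "GGGC"), ("C", "AATT")]
pci, by_species = overall_pci(toy)
print(pci, by_species)
```

Averaging over species rather than over specimens keeps a heavily sampled species from dominating the score; in the toy data the single-specimen species C can never be identified and pulls the average down.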
Virus Database and Online Inquiry System Based on Natural Vectors.
Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St
2017-01-01
We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and back-end processes for automatic and manual updating of database content to synchronize with GenBank. Submitted genomes data in FASTA format will be carried out and the prediction results with 5 closest neighbors and their classifications will be returned by email. Considering the one-to-one correspondence between sequence and natural vector, time efficiency, and high accuracy, natural vector is a significant advance compared with alignment methods, which makes VirusDB a useful database in further research.
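The abstract does not define the natural vector, so the following Python sketch rests on an assumption: it uses the 12-dimensional form commonly given in the natural vector literature (for each of A, C, G, T: the count, the mean 1-based position, and the second central moment of position divided by count times sequence length). Euclidean distance and the function names natural_vector and closest are likewise assumptions, not VirusDB's implementation.

```python
import math


def natural_vector(seq):
    """Assumed 12-dimensional natural vector: counts, mean positions and
    normalised second central moments for A, C, G, T, in that order."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]  # 1-based
        n_k = len(positions)
        mu_k = sum(positions) / n_k if n_k else 0.0
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n) if n_k else 0.0
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def closest(query, references, k=5):
    """Names of the k references nearest to query (Euclidean distance, assumed)."""
    q = natural_vector(query)
    scored = sorted((math.dist(q, natural_vector(s)), name)
                    for name, s in references.items())
    return [name for _, name in scored[:k]]


# Invented toy references; real use would involve whole viral genomes.
refs = {"x": "ACGT", "y": "TTTT", "z": "ACGA"}
print(closest("ACGT", refs, k=2))
```

Because each genome is reduced to a fixed-length vector once, a query needs only vector distances rather than pairwise alignments, which is the time-efficiency argument made in the abstract.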
Dees, Merete Wiken; Brurberg, May Bente; Lysøe, Erik
2016-12-01
Here, we present the 3,795,952 bp complete genome sequence of the biofilm-forming Curtobacterium sp. strain BH-2-1-1, isolated from conventionally grown lettuce (Lactuca sativa) from a field in Vestfold, Norway. The nucleotide sequence of this genome was deposited into NCBI GenBank under the accession CP017580.
Nucleotide sequencing and identification of some wild mushrooms.
Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari
2013-01-01
The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits.
HIV Type 1 Transmission Networks Among Men Having Sex with Men and Heterosexuals in Kenya
Faria, Nuno Rodrigues; Hassan, Amin; Hamers, Raph L.; Mutua, Gaudensia; Anzala, Omu; Mandaliya, Kishor; Cane, Patricia; Berkley, James A.; Rinke de Wit, Tobias F.; Wallis, Carole; Graham, Susan M.; Price, Matthew A.; Coutinho, Roel A.; Sanders, Eduard J.
2014-01-01
Abstract We performed a molecular phylogenetic study on HIV-1 polymerase sequences of men who have sex with men (MSM) and heterosexual patient samples in Kenya to characterize any observed HIV-1 transmission networks. HIV-1 polymerase sequences were obtained from samples in Nairobi and coastal Kenya from 84 MSM, 226 other men, and 364 women from 2005 to 2010. Using Bayesian phylogenetics, we tested whether sequences clustered by sexual orientation and geographic location. In addition, we used trait diffusion analyses to identify significant epidemiological links and to quantify the number of transmissions between risk groups. Finally, we compared 84 MSM sequences with all HIV-1 sequences available online at GenBank. Significant clustering of sequences from MSM at both coastal Kenya and Nairobi was found, with evidence of HIV-1 transmission between both locations. Although a transmission pair between a coastal MSM and woman was confirmed, no significant HIV-1 transmission was evident between MSM and the comparison population for the predominant subtype A (60%). However, a weak but significant link was evident when studying all subtypes together. GenBank comparison did not reveal other important transmission links. Our data suggest infrequent intermingling of MSM and heterosexual HIV-1 epidemics in Kenya. PMID:23947948
USDA-ARS's Scientific Manuscript database
Single-nucleotide Polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean comparing sequences from coding and non-coding regions obtained from Genbank and genomic DNA and to compare sequencing resu...
Regulatory sequence analysis tools.
van Helden, Jacques
2003-07-01
The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.
Probiotic Candidates from Fish Pond Water in Central Java Indonesia
NASA Astrophysics Data System (ADS)
Harjuno Condro Haditomo, Alfabetian; Desrina; Sarjito; Budi Prayitno, S.
2018-02-01
Aeromonas hydrophila is a major bacterial pathogen in intensive freshwater fish culture in Indonesia. An alternative method of controlling the pathogen is the use of probiotics. Probiotics usually consist of live microorganisms which, when administered in adequate amounts, confer a health benefit on the host. The aim of this research was to identify probiotic candidates against A. hydrophila based on 16S rDNA gene sequences. The research started with a field survey to obtain the probiotic candidates and continued with laboratory experiments. Probiotic candidates were isolated from fish pond water in Boyolali and Banjarnegara Regencies, Central Java, Indonesia. A total of 133 bacterial isolates were obtained and cultured on TSA, TSB and GSP media. Of the 133 isolates, only 30 inhibited A. hydrophila. Three promising isolates were identified by PCR using primers for 16S rDNA. Based on 16S rDNA sequence analysis, all three isolates belonged to the genus Bacillus. Isolates CKlA21, CKlA28, and CBA14 were closely related to Bacillus sp. 13843 (GenBank accession no. JN874760.1; 100% homology), Bacillus subtilis strain H13 (GenBank accession no. KT907045.1; 99% homology), and Bacillus sp. strain 22-4 (GenBank accession no. KX816417.1; 97% homology), respectively.
Hostnik, Peter; Picard-Meyer, Evelyne; Rihtarič, Danijela; Toplak, Ivan; Cliquet, Florence
2014-04-01
Oral vaccination campaigns to eliminate fox rabies were initiated in Slovenia in 1995. In May 2012, a young fox (Vulpes vulpes) with typical rabies signs was captured. Its brain and salivary gland tissues were found to contain vaccine strain SAD B19. In a Basic Local Alignment Search Tool (BLAST) alignment, 589 nucleotides determined from the N gene of the virus isolated from the brain and salivary glands of the affected fox were 100% identical to the GenBank reference SAD B19 strain. Sequence analysis of the N and M genes (4,351 nucleotides) showed two nucleotide modifications, at positions 1335 (N gene) and 3114 (M gene), in the KC522613 isolate identified in the fox compared to SAD B19.
Molecular confirmation of Hepatozoon canis in Mauritius.
Daskalaki, Aikaterini Alexandra; Ionică, Angela Monica; Jeetah, Keshav; Gherman, Călin Mircea; Mihalca, Andrei Daniel
2018-01-01
In this study, Hepatozoon species was molecularly identified and characterized for the first time on the Indian Ocean island of Mauritius. Partial sequences of the 18S rRNA gene of the Hepatozoon isolates were analysed from three naturally infected dogs. The sequences of H. canis were similar to the 18S rRNA partial sequences (JX112783, AB365071 99%) from dog blood samples from West Indies and Nigeria. Our sequences were deposited in the GenBank database. Copyright © 2017 Elsevier B.V. All rights reserved.
A 5.8S nuclear ribosomal RNA gene sequence database: applications to ecology and evolution
NASA Technical Reports Server (NTRS)
Cullings, K. W.; Vogler, D. R.
1998-01-01
We compiled a 5.8S nuclear ribosomal gene sequence database for animals, plants, and fungi using both newly generated and GenBank sequences. We demonstrate the utility of this database as an internal check to determine whether the target organism and not a contaminant has been sequenced, as a diagnostic tool for ecologists and evolutionary biologists to determine the placement of asexual fungi within larger taxonomic groups, and as a tool to help identify fungi that form ectomycorrhizae.
Identification of a Herbal Powder by Deoxyribonucleic Acid Barcoding and Structural Analyses.
Sheth, Bhavisha P; Thaker, Vrinda S
2015-10-01
Authentic identification of plants is essential for exploiting their medicinal properties and for stopping adulteration and malpractice in their trade. The objective was to identify a herbal powder obtained from a herbalist in the local vicinity of Rajkot, Gujarat, using deoxyribonucleic acid (DNA) barcoding and molecular tools. DNA was extracted from the herbal powder and selected Cassia species, followed by polymerase chain reaction (PCR) amplification and sequencing of the rbcL barcode locus. The sequences were then subjected to National Center for Biotechnology Information (NCBI) basic local alignment search tool (BLAST) analysis, followed by determination of the three-dimensional structure of the rbcL protein from the herbal powder and from the Cassia species Cassia fistula, Cassia tora and Cassia javanica (sequences obtained in the present study), and Cassia roxburghii and Cassia abbreviata (sequences retrieved from GenBank). Multiple and pairwise structural alignments were then carried out in order to identify the herbal powder. The nucleotide sequences obtained from the selected species of Cassia were submitted to GenBank (Accession No. JX141397, JX141405, JX141420). The NCBI BLAST analysis of the rbcL protein from the herbal powder showed equal sequence similarity (with reference to parameters such as E value, maximum identity, total score and query coverage) to C. javanica and C. roxburghii. To resolve the ambiguity of the BLAST result, a protein structural approach was implemented. The protein homology models obtained in the present study were submitted to the protein model database (PM0079748-PM0079753). The pairwise structural alignment of the herbal powder (as template) with C. javanica and C. roxburghii (each as an individual target) revealed a close similarity of the herbal powder with C. javanica.
A strategy as used here, incorporating the integrated use of DNA barcoding and protein structural analyses, could be adopted as a novel, rapid and economic procedure, especially in cases when protein coding loci are considered. A herbal powder was obtained from a herbalist in the local vicinity of Rajkot, Gujarat. An integrated approach using DNA barcoding and structural analyses was carried out to identify the herbal powder. The herbal powder was identified as Cassia javanica L.
The draft genome sequence and annotation of the desert woodrat Neotoma lepida.
Campbell, Michael; Oakeson, Kelly F; Yandell, Mark; Halpert, James R; Dearing, Denise
2016-09-01
We present the de novo draft genome sequence for a vertebrate mammalian herbivore, the desert woodrat (Neotoma lepida). This species is of ecological and evolutionary interest with respect to ingestion, microbial detoxification and hepatic metabolism of toxic plant secondary compounds from the highly toxic creosote bush (Larrea tridentata) and the juniper shrub (Juniperus monosperma). The draft genome sequence and annotation have been deposited at GenBank under the accession LZPO01000000.
Compressing DNA sequence databases with coil.
White, W Timothy J; Hendy, Michael D
2008-05-20
Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
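The gzip baseline mentioned above can be measured with a few lines of Python. This sketch only shows how a compression ratio for a flat file might be computed with the standard library; it does not implement coil or edit-tree coding, and the function name gzip_ratio and the randomly generated EST-like file are invented for illustration.

```python
import gzip
import random


def gzip_ratio(data: bytes) -> float:
    """Uncompressed size divided by gzip-compressed size."""
    return len(data) / len(gzip.compress(data))


# Build a made-up FASTA-style flat file of short, similar-length sequences.
rng = random.Random(0)
records = []
for i in range(200):
    length = rng.randint(380, 420)          # narrow length distribution
    bases = "".join(rng.choice("ACGT") for _ in range(length))
    records.append(f">demo_{i}\n{bases}\n")
flat_file = "".join(records).encode("ascii")

print(f"gzip ratio: {gzip_ratio(flat_file):.2f}")
```

A four-letter alphabet needs only two bits per base, so a general-purpose compressor has a limited ceiling on unrelated sequences; a database-aware scheme would have to exploit similarity between entries to do better, which is the motivation the abstract gives.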
A novel gene, RSD-3/HSD-3.1, encodes a meiotic-related protein expressed in rat and human testis.
Zhang, Xiaodong; Liu, Huixian; Zhang, Yan; Qiao, Yuan; Miao, Shiying; Wang, Linfang; Zhang, Jianchao; Zong, Shudong; Koide, S S
2003-06-01
The expression of stage-specific genes during spermatogenesis was determined by isolating two segments of rat seminiferous tubule at different stages of the germinal epithelium cycle delineated by transillumination-delineated microdissection, combined with differential display polymerase chain reaction to identify the differential transcripts formed. A total of 22 cDNAs were identified and accepted by GenBank as new expressed sequence tags. One of the expressed sequence tags was radiolabeled and used as a probe to screen a rat testis cDNA library. A novel full-length cDNA composed of 2228 bp, designated as RSD-3 (rat sperm DNA no.3, GenBank accession no. AF094609) was isolated and characterized. The reading frame encodes a polypeptide consisting of 526 amino acid residues, containing a number of DNA binding motifs and phosphorylation sites for PKC, CK-II, and p34cdc2. Northern blot of mRNA prepared from various tissues of adult rats showed that RSD-3 is expressed only in the testis. The initial expression of the RSD-3 gene was detected in the testis on the 30th postnatal day and attained adult level on the 60th postnatal day. Immunolocalization of RSD-3 in germ cells of rat testis showed that its expression is restricted to primary spermatocytes, undergoing meiosis division I. A human testis homologue of RSD-3 cDNA, designated as HSD-3.1 (GenBank accession no. AF144487) was isolated by screening the Human Testis Rapid-Screen arrayed cDNA library panels by RT-PCR. The exon-intron boundaries of HSD-3.1 gene were determined by aligning the cDNA sequence with the corresponding genome sequence. The cDNA consisted of 12 exons that span approximately 52.8 kb of the genome sequence and was mapped to chromosome 14q31.3.
Fuchs, O; Kostecka, A; Provazníková, D; Krásná, B; Kotlín, R; Stanková, M; Kobylka, P; Dostálová, G; Zeman, M; Chochola, M
2010-01-01
The CCAAT/enhancer-binding protein alpha, encoded by the intronless CEBPA gene, is a transcription factor that induces expression of genes involved in differentiation of granulocytes, monocytes, adipocytes and hepatocytes. Both mono- and bi-allelic CEBPA mutations were detected in acute myeloid leukaemia and myelodysplastic syndrome. In this study we also identified CEBPA mutations in healthy individuals and in patients with peripheral artery disease, ischaemic heart disease and hyperlipidaemia. We found 16 different deletions with the presence of two direct repeats in CEBPA by analysis of 431 individuals. The three most frequent repeats included in these deletions in the CEBPA gene are CGCGAG (493-498_865-870), GG (486-487_885-886), and GCCAAGCAGC (508-517_907-916), all according to GenBank Accession No. NM_004364.2. In one case we identified that a father with ischaemic heart disease and his healthy son had two identical deletions (493_864del and 508_906del, both according to GenBank Accession No. NM_004364.2) in CEBPA. The occurrence of deletions between two repetitive sequences may be caused by recombination events in the repair process. A double-stranded cut in DNA may initiate these recombination events in adjacent DNA sequences. Four types of polymorphisms in the CEBPA gene were also detected in the screened individuals. The polymorphism 690 G>T in the CEBPA gene (according to GenBank Accession No. NM_004364.2) was the most frequent type in our analysis. Statistical analysis did not find significant differences in the frequency of CEBPA polymorphisms between patients and healthy individuals, with the exception of the P4 polymorphism (580_585dup according to GenBank Accession No. NM_004364.2). The P4 polymorphism was significantly increased in ischaemic heart disease patients.
Morphological and molecular characterization of fungal pathogen, Magnaphorthe oryzae
NASA Astrophysics Data System (ADS)
Hasan, Nor'Aishah; Rafii, Mohd Y.; Rahim, Harun A.; Ali, Nusaibah Syd; Mazlan, Norida; Abdullah, Shamsiah
2016-02-01
Rice is arguably the most crucial food crop, supplying a quarter of calorie intake. The fungal pathogen Magnaporthe oryzae causes blast disease in gramineous hosts, including rice species. The disease has spurred outbreaks and poses a constant threat to cereal production. Global rice yield is declining by almost 10-30%, including in Malaysia. Because Magnaporthe oryzae and its host are a model for plant disease studies, the rice blast pathosystem has been the subject of intense interest, given the importance of the disease to world agriculture. Therefore, in this study, our prime objective was to isolate samples of Magnaporthe oryzae from diseased leaves obtained from MARDI Seberang Perai, Penang, Malaysia. Molecular identification was performed by sequence analysis of the internal transcribed spacer (ITS) region of the nuclear ribosomal RNA genes. The phylogenetic affiliation of the isolated samples was analyzed by comparing the ITS sequences with those deposited in the GenBank database. The sequence of the isolate showed at least 99% nucleotide identity with the corresponding sequence in GenBank for Magnaporthe oryzae. Morphological observation under the microscope showed that the structure of the conidia had characteristics similar to those of M. oryzae. The findings of this study provide useful information for breeding programs, epidemiology studies and improved disease management.
Morphological and molecular characterization of fungal pathogen, Magnaphorthe oryzae
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasan, Nor’Aishah, E-mail: aishahnh@ns.uitm.edu.my; Rafii, Mohd Y., E-mail: mrafii@upm.edu.my; Department of Crop Science, Universiti Putra Malaysia
2016-02-01
Rice is arguably the most crucial food crop, supplying a quarter of calorie intake. The fungal pathogen Magnaporthe oryzae causes blast disease in gramineous hosts, including rice species. The disease has spurred outbreaks and poses a constant threat to cereal production. Global rice yield is declining by almost 10-30%, including in Malaysia. Because Magnaporthe oryzae and its host are a model for plant disease studies, the rice blast pathosystem has been the subject of intense interest, given the importance of the disease to world agriculture. Therefore, in this study, our prime objective was to isolate samples of Magnaporthe oryzae from diseased leaves obtained from MARDI Seberang Perai, Penang, Malaysia. Molecular identification was performed by sequence analysis of the internal transcribed spacer (ITS) region of the nuclear ribosomal RNA genes. The phylogenetic affiliation of the isolated samples was analyzed by comparing the ITS sequences with those deposited in the GenBank database. The sequence of the isolate showed at least 99% nucleotide identity with the corresponding sequence in GenBank for Magnaporthe oryzae. Morphological observation under the microscope showed that the structure of the conidia had characteristics similar to those of M. oryzae. The findings of this study provide useful information for breeding programs, epidemiology studies and improved disease management.
Contamination of sequence databases with adaptor sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D.
Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.
The Apis mellifera Filamentous Virus Genome
Gauthier, Laurent; Cornman, Scott; Hartmann, Ulrike; Cousserans, François; Evans, Jay D.; de Miranda, Joachim R.; Neumann, Peter
2015-01-01
A complete reference genome of the Apis mellifera Filamentous virus (AmFV) was determined using Illumina Hiseq sequencing. The AmFV genome is a double stranded DNA molecule of approximately 498,500 nucleotides with a GC content of 50.8%. It encompasses 247 non-overlapping open reading frames (ORFs), equally distributed on both strands, which cover 65% of the genome. While most of the ORFs lacked threshold sequence alignments to reference protein databases, twenty-eight were found to display significant homologies with proteins present in other large double stranded DNA viruses. Remarkably, 13 ORFs had strong similarity with typical baculovirus domains such as PIFs (per os infectivity factor genes: pif-1, pif-2, pif-3 and p74) and BRO (Baculovirus Repeated Open Reading Frame). The putative AmFV DNA polymerase is of type B, but is only distantly related to those of the baculoviruses. The ORFs encoding proteins involved in nucleotide metabolism had the highest percent identity to viral proteins in GenBank. Other notable features include the presence of several collagen-like, chitin-binding, kinesin and pacifastin domains. Due to the large size of the AmFV genome and the inconsistent affiliation with other large double stranded DNA virus families infecting invertebrates, AmFV may belong to a new virus family. PMID:26184284
The Apis mellifera Filamentous Virus Genome.
Gauthier, Laurent; Cornman, Scott; Hartmann, Ulrike; Cousserans, François; Evans, Jay D; de Miranda, Joachim R; Neumann, Peter
2015-07-09
A complete reference genome of the Apis mellifera Filamentous virus (AmFV) was determined using Illumina Hiseq sequencing. The AmFV genome is a double stranded DNA molecule of approximately 498,500 nucleotides with a GC content of 50.8%. It encompasses 247 non-overlapping open reading frames (ORFs), equally distributed on both strands, which cover 65% of the genome. While most of the ORFs lacked threshold sequence alignments to reference protein databases, twenty-eight were found to display significant homologies with proteins present in other large double stranded DNA viruses. Remarkably, 13 ORFs had strong similarity with typical baculovirus domains such as PIFs (per os infectivity factor genes: pif-1, pif-2, pif-3 and p74) and BRO (Baculovirus Repeated Open Reading Frame). The putative AmFV DNA polymerase is of type B, but is only distantly related to those of the baculoviruses. The ORFs encoding proteins involved in nucleotide metabolism had the highest percent identity to viral proteins in GenBank. Other notable features include the presence of several collagen-like, chitin-binding, kinesin and pacifastin domains. Due to the large size of the AmFV genome and the inconsistent affiliation with other large double stranded DNA virus families infecting invertebrates, AmFV may belong to a new virus family.
De Cremer, Koen; Piérard, Denis; Hendrickx, Marijke
2016-01-01
Recently, the Fusarium genus has been narrowed based upon phylogenetic analyses and a Fusarium-like clade was adopted. The few species of the Fusarium-like clade were moved to new, re-installed or existing genera or provisionally retained as "Fusarium." Only a limited number of reference strains and DNA marker sequences are available for this clade and not much is known about its actual species diversity. Here, we report six strains, preserved by the Belgian fungal culture collection BCCM/IHEM as a Fusarium species, that belong to the Fusarium-like clade. They showed slow growth and produced pionnotes, typical morphological characteristics of many Fusarium-like species. Multilocus sequencing with comparative sequence analyses in GenBank and phylogenetic analyses, using reference sequences of type material, confirmed that they were indeed members of the Fusarium-like clade. One strain was identified as "Fusarium" ciliatum whereas another strain was identified as Fusicolla merismoides. The four remaining strains were shown to represent a unique phylogenetic lineage in the Fusarium-like clade and were also found to be morphologically distinct from other members of the Fusarium-like clade. Based upon phylogenetic considerations, a new genus, Pseudofusicolla gen. nov., and a new species, Pseudofusicolla belgica sp. nov., were installed for this lineage. A formal description is provided in this study. Additional sampling will be required to gather isolates other than the historical strains presented in the present study as well as to further reveal the actual species diversity in the Fusarium-like clade. PMID:27790062
Bioinformatic mining of EST-SSR loci in the Pacific oyster, Crassostrea gigas.
Wang, Y; Ren, R; Yu, Z
2008-06-01
A set of expressed sequence tag-simple sequence repeat (EST-SSR) markers of the Pacific oyster, Crassostrea gigas, was developed through bioinformatic mining of the GenBank public database. As of June 30, 2007, a total of 5132 EST sequences from GenBank were downloaded and screened for di-, tri- and tetra-nucleotide repeats, with criteria set at a minimum of 5, 4 and 4 repeats for the three categories of SSRs respectively. Seventeen polymorphic microsatellite markers were characterized. Allele numbers ranged from 3 to 10, and the observed and expected heterozygosity values varied from 0.125 to 0.770 and from 0.113 to 0.732 respectively. Eleven loci were at Hardy-Weinberg equilibrium (HWE); the other six loci showed significant departure from HWE (P < 0.01), suggesting possible presence of null alleles. Pairwise check of linkage disequilibrium (LD) indicated that 11 of 136 pairs of loci showed significant LD (P < 0.01), likely due to HWE present in single markers. Cross-species amplification was examined for five other Crassostrea species and reasonable results were obtained, promising usefulness of these markers in oyster genetics.
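The screening criteria described above (at least 5 repeats for di-nucleotide motifs, and at least 4 for tri- and tetra-nucleotide motifs) can be sketched with a regular-expression scan. This is a minimal illustration and not the authors' pipeline: it finds perfect repeats only, and the names `MIN_REPEATS` and `find_ssrs` are hypothetical.

```python
import re

# Minimum repeat counts taken from the abstract: di >= 5, tri >= 4, tetra >= 4.
MIN_REPEATS = {2: 5, 3: 4, 4: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # Group 1 captures one motif; \1{n-1,} demands at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are a shorter unit repeated (e.g. "AA" or "ATAT").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence: (AT)x6 and (CAG)x4 pass the thresholds; (ACGT)x3 does not.
example = "GGC" + "AT" * 6 + "CCT" + "CAG" * 4 + "TT" + "ACGT" * 3
ssrs = find_ssrs(example)
print(ssrs)
```

Because the scan reports the leftmost match, the motif that is reported depends on the reading frame in which the repeat is first encountered; a production tool would normally canonicalize motifs (for example, treating CAG, AGC and GCA as one class).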
Hazes, Bart
2014-02-28
Protein-coding DNA sequences and their corresponding amino acid sequences are routinely used to study relationships between sequence, structure, function, and evolution. The rapidly growing size of sequence databases increases the power of such comparative analyses but it makes it more challenging to prepare high quality sequence data sets with control over redundancy, quality, completeness, formatting, and labeling. Software tools for some individual steps in this process exist but manual intervention remains a common and time-consuming necessity. CDSbank is a database that stores both the protein-coding DNA sequence (CDS) and amino acid sequence for each protein annotated in GenBank. CDSbank also stores GenBank feature annotation, a flag to indicate incomplete 5' and 3' ends, full taxonomic data, and a heuristic to rank the scientific interest of each species. This rich information allows fully automated data set preparation with a level of sophistication that aims to meet or exceed manual processing. Defaults ensure ease of use for typical scenarios while allowing great flexibility when needed. Access is via a free web server at http://hazeslab.med.ualberta.ca/CDSbank/. CDSbank presents a user-friendly web server to download, filter, format, and name large sequence data sets. Common usage scenarios can be accessed via pre-programmed default choices, while optional sections give full control over the processing pipeline. Particular strengths are: extraction of protein-coding DNA sequences just as easily as amino acid sequences, full access to taxonomy for labeling and filtering, awareness of incomplete sequences, and the ability to take one protein sequence and extract all synonymous CDS or identical protein sequences in other species. Finally, CDSbank can also create labeled property files to, for instance, annotate or re-label phylogenetic trees.
Dolz, Roser; Pujols, Joan; Ordóñez, German; Porta, Ramon; Majó, Natàlia
2006-04-01
As part of an epidemiological surveillance of infectious bronchitis virus (IBV) in Spain, four Spanish field isolates showed high S1 spike sequence similarities with an IBV sequence from the GenBank database named Italy 02. Given that little was known about this new emergent IBV strain we have characterized the four isolates by sequencing the entire S1 part of the spike protein gene and have compared them with many reference IBV serotypes. In addition, cross-virus neutralization assays were conducted with the main IBV serotypes present in Europe. The four Spanish field strains and the Italy 02 S1 sequence from the NCBI database were established as a new genotype that showed maximum amino acid identities with the 4/91 serotype (81.7% to 83.7%), the D274 group that included D207, D274 and D3896 strains (79.8% to 81.7%), and the B1648 serotype (79.3% to 80%). Furthermore, on the basis of these results, it was demonstrated that the Italy 02 genotype had been circulating in Spain since as early as 1997. Based on the average ratio of synonymous:non-synonymous (dS/dN) amino acid substitutions within Italy 02 sequences, no positive selection pressures were related with changes observed in the S1 gene. Moreover, phylogenetic analysis of the S1 gene suggested that the Italy 02 genotype has undergone a recombination event. Virus neutralization assays demonstrated that little antigenic relatedness (less than 35%) exists between Italy 02 and some of the reference IBV serotypes, and indicated that Italy 02 is likely to be a new serotype.
Huang, Fengying; Meng, Qiuping; Tan, Guanghong; Huang, Yonghao; Wang, Hua; Mei, Wenli; Dai, Haofu
2011-06-01
To analyze and identify a bacterial strain isolated from a laboratory breeding mouse far away from a hospital. The phenotype of the isolate was investigated by conventional microbiological methods, including Gram-staining, colony morphology, tests for haemolysis, catalase, coagulase, and antimicrobial susceptibility testing. The mecA and 16S rRNA genes were amplified by the polymerase chain reaction (PCR) and sequenced. The base sequence of the PCR product was compared with known 16S rRNA gene sequences in the GenBank database by phylogenetic analysis and multiple sequence alignment. The isolate in this study was a gram-positive, coagulase-negative, and catalase-positive coccus. The isolate was resistant to oxacillin, methicillin, penicillin, ampicillin, cefazolin, ciprofloxacin, erythromycin, and others. PCR results indicated that the isolate was mecA gene positive and its 16S rRNA was 1,465 bp. Phylogenetic analysis of the resultant 16S rRNA indicated the isolate belonged to the genus Staphylococcus, and multiple sequence alignment showed that the isolate was Staphylococcus haemolyticus with only one base difference from the corresponding 16S rRNA deposited in GenBank. 16S rRNA gene sequencing is a suitable technique for non-specialist researchers. Laboratory animals are possible sources of lethal pathogens, and researchers must adopt protective measures when they manipulate animals. Copyright © 2011 Hainan Medical College. Published by Elsevier B.V. All rights reserved.
Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe; Avarre, Jean-Christophe
2016-01-01
Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.
Hammoumi, Saliha; Vallaeys, Tatiana; Santika, Ayi; Leleux, Philippe; Borzym, Ewa; Klopp, Christophe
2016-01-01
Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10^7. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3. PMID:27703859
GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes.
Hallin, Peter F; Stærfeldt, Hans-Henrik; Rotenberg, Eva; Binnewies, Tim T; Benham, Craig J; Ussery, David W
2009-09-25
We present an interactive web application for visualizing genomic data of prokaryotic chromosomes. The tool (GeneWiz browser) allows users to carry out various analyses such as mapping alignments of homologous genes to other genomes, mapping of short sequencing reads to a reference chromosome, and calculating DNA properties such as curvature or stacking energy along the chromosome. The GeneWiz browser produces an interactive graphic that enables zooming from a global scale down to single nucleotides, without changing the size of the plot. Its ability to disproportionally zoom provides optimal readability and increased functionality compared to other browsers. The tool allows the user to select the display of various genomic features, color setting and data ranges. Custom numerical data can be added to the plot allowing, for example, visualization of gene expression and regulation data. Further, standard atlases are pre-generated for all prokaryotic genomes available in GenBank, providing a fast overview of all available genomes, including recently deposited genome sequences. The tool is available online from http://www.cbs.dtu.dk/services/gwBrowser. Supplemental material including interactive atlases is available online at http://www.cbs.dtu.dk/services/gwBrowser/suppl/.
Mitochondrial DNA variation of indigenous goats in Narok and Isiolo counties of Kenya.
Kibegwa, F M; Githui, K E; Jung'a, J O; Badamana, M S; Nyamu, M N
2016-06-01
Phylogenetic relationships among and genetic variability within 60 goats from two different indigenous breeds in Narok and Isiolo counties in Kenya and 22 published goat samples were analysed using mitochondrial control region sequences. The results showed that there were 54 polymorphic sites in a 481-bp sequence and 29 haplotypes were determined. The mean haplotype diversity and nucleotide diversity were 0.981 ± 0.006 and 0.019 ± 0.001, respectively. The phylogenetic analysis in combination with goat haplogroup reference sequences from GenBank showed that all goat sequences were clustered into two haplogroups (A and G), of which haplogroup A was the commonest in the two populations. A very high percentage (99.90%) of the genetic variation was distributed within the regions, and a smaller percentage (0.10%) distributed among regions as revealed by the analysis of molecular variance (amova). This amova results showed that the divergence between regions was not statistically significant. We concluded that the high levels of intrapopulation diversity in Isiolo and Narok goats and the weak phylogeographic structuring suggested that there existed strong gene flow among goat populations probably caused by extensive transportation of goats in history. © 2015 Blackwell Verlag GmbH.
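The two summary statistics reported above, haplotype diversity and nucleotide diversity, can be sketched as follows. This is a minimal illustration assuming the standard Nei estimators (h = n/(n-1) · (1 − Σ p_i²), and π as the mean pairwise proportion of differing sites) applied to aligned, equal-length sequences; the abstract does not state which software or estimator the authors used, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity for a list of aligned sequences."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment (not data from the study): 4 sequences, 3 haplotypes, 10 sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"h = {h:.3f}, pi = {pi:.4f}")
```

A haplotype diversity close to 1, as in the reported 0.981, means that two sequences drawn at random are almost always different haplotypes, while the much smaller nucleotide diversity reflects that those haplotypes differ at only a few sites.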
Bustamante, Carlos; Ovenden, Jennifer R
2016-01-01
The silver gemfish Rexea solandri is an important economic resource but Vulnerable to overfishing in Australian waters. The complete mitochondrial genome sequence is described from 1.6 million reads obtained via next generation sequencing. The total length of the mitogenome is 16,350 bp comprising 2 rRNA, 13 protein-coding genes, 22 tRNA and 2 non-coding regions. The mitogenome sequence was validated against sequences of PCR fragments and BLAST queries of GenBank. Gene order was equivalent to that found in marine fishes.
Sequence search on a supercomputer.
Gotoh, O; Tagashira, Y
1986-01-10
A set of programs was developed for searching nucleic acid and protein sequence data bases for sequences similar to a given sequence. The programs, written in FORTRAN 77, were optimized for vector processing on a Hitachi S810-20 supercomputer. A search of a 500-residue protein sequence against the entire PIR data base Ver. 1.0 (1) (0.5 M residues) is carried out in a CPU time of 45 sec. About 4 min is required for an exhaustive search of a 1500-base nucleotide sequence against all mammalian sequences (1.2 M bases) in GenBank Ver. 29.0. The CPU time is reduced to about a quarter with a faster version.
ESTuber db: an online database for Tuber borchii EST sequences.
Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo
2007-03-08
The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping
2007-01-01
Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
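The counts quoted above can be cross-checked with a little arithmetic. The sketch below assumes that each accession is a two-letter prefix followed by a number and that both ranges are inclusive; the helper name `accession_range_size` is hypothetical.

```python
def accession_range_size(first: str, last: str) -> int:
    """Count accessions in an inclusive range such as EH669363-EH672369.

    Assumes a two-letter prefix followed by digits (an illustrative assumption).
    """
    if first[:2] != last[:2]:
        raise ValueError("accession prefixes differ")
    return int(last[2:]) - int(first[2:]) + 1


# Two dbEST ranges quoted in the abstract.
total_ests = (
    accession_range_size("EH669363", "EH672369")
    + accession_range_size("EL515444", "EL515580")
)

# Contigs plus singletons should equal the number of unique sequences.
unique_sequences = 448 + 1783

# Share of unique sequences with high similarity to GenBank nr entries.
similar_fraction = 743 / unique_sequences

print(total_ests, unique_sequences, round(100 * similar_fraction))
```

Under these assumptions the two ranges contain 3007 and 137 accessions, which sum to the 3144 good-quality ESTs reported, and 743 of 2231 unique sequences rounds to the stated 33%.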
Complete genome sequence analysis of a duck circovirus from Guangxi pockmark ducks.
Xie, Liji; Xie, Zhixun; Zhao, Guangyuan; Liu, Jiabo; Pang, Yaoshan; Deng, Xianwen; Xie, Zhiqin; Fan, Qing
2012-12-01
We report here the complete genomic sequence of a novel duck circovirus (DuCV) strain, GX1104, isolated from Guangxi pockmark ducks in Guangxi, China. The whole nucleotide sequence had the highest homology (97.2%) with the sequence of strain TC/2002 (GenBank accession number AY394721.1) and had a low homology (76.8% to 78.6%) with the sequences of other strains isolated from China, Germany, and the United States. This report will help to understand the epidemiology and molecular characteristics of Guangxi pockmark duck circovirus in southern China.
Lee, Jennifer F.; Hesselberth, Jay R.; Meyers, Lauren Ancel; Ellington, Andrew D.
2004-01-01
The aptamer database is designed to contain comprehensive sequence information on aptamers and unnatural ribozymes that have been generated by in vitro selection methods. Such data are not normally collected in ‘natural’ sequence databases, such as GenBank. Besides serving as a storehouse of sequences that may have diagnostic or therapeutic utility, the database serves as a valuable resource for theoretical biologists who describe and explore fitness landscapes. The database is updated monthly and is publicly available at http://aptamer.icmb.utexas.edu/. PMID:14681367
Shur, K V; Zaychikova, M V; Mikheecheva, N E; Klimina, K M; Bekker, O B; Zhdanova, S N; Ogarkov, O B; Danilenko, V N
2016-12-01
We report a draft genome sequence of Mycobacterium tuberculosis strain B9741 belonging to Beijing B0/W lineage isolated from a HIV patient from Siberia, Russia. This clinical isolate showed MDR phenotype and resistance to isoniazid, rifampin, streptomycin and pyrazinamide. We analyzed SNPs associated with virulence and resistance. The draft genome sequence and annotation have been deposited at GenBank under the accession NZ_LVJJ00000000.
Ormeño-Orrillo, Ernesto; Rey, Luis; Durán, David; Canchaya, Carlos A; Zúñiga-Dávila, Doris; Imperial, Juan; Martínez-Romero, Esperanza; Ruiz-Argüeso, Tomás
2017-09-01
Bradyrhizobium sp. LMTR 3 is a representative strain of one of the geno(species) of diazotrophic symbionts associated with Lima bean ( Phaseolus lunatus ) in Peru. Its 7.83 Mb genome was sequenced using the Illumina technology and found to encode a complete set of genes required for nodulation and nitrogen fixation, and additional genes putatively involved in root colonization. Its draft genome sequence and annotation have been deposited at GenBank under the accession number MAXC00000000.
2012-01-01
Background The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from the Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch, respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tool. The edited annotations can then be uploaded to CPGAVAS for repeated updating and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920
Bioengineering recombinant diacylglycerol acyltransferases
USDA-ARS's Scientific Manuscript database
Diacylglycerol acyltransferases (DGATs) catalyze the last and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. At least 115 DGAT sequences are identified from 69 organisms in the GenBank databases. Only a few papers have been published in the last 28 years on the exp...
Evidence for contemporary plant mitoviruses
USDA-ARS's Scientific Manuscript database
Mitoviruses have small RNA(+) genomes, replicate in mitochondria, and have to date been directly shown to infect only fungi. For this report, sequences that appear to represent approximately complete mitovirus genomes were discovered in plant transcriptome data at GenBank. At least 17 of the refined...
Sequence verification as quality-control step for production of cDNA microarrays.
Taylor, E; Cogdell, D; Coombes, K; Hu, L; Ramdas, L; Tabor, A; Hamilton, S; Zhang, W
2001-07-01
To generate cDNA arrays in our core laboratory, we amplified about 2300 PCR products from a human, sequence-verified cDNA clone library. As a quality-control step, we sequenced the PCR products immediately before printing. The sequence information was used to search the GenBank database to confirm the identities. Although these clones were previously sequence verified by the company, we found that only 79% of the clones matched the original database after handling. Our experience strongly indicates the necessity to sequence verify the clones at the final stage before printing on microarray slides and to modify the gene list accordingly.
Mimosa caesalpiniifolia rhizobial isolates from different origins of the Brazilian Northeast.
Martins, Paulo Geovani Silva; Junior, Mario Andrade Lira; Fracetto, Giselle Gomes Monteiro; da Silva, Maria Luiza Ribeiro Bastos; Vincentin, Rayssa Pereira; de Lyra, Maria do Carmo Catanho Pereira
2015-04-01
Biological nitrogen fixation from the legume-rhizobia symbiosis is one of the main sources of fixed nitrogen in land environments. Diazotrophic bacteria taxonomy has been substantially modified by the joint use of phenotypic, physiological and molecular aspects. Among these molecular tools, sequencing and genotyping of genomic regions such as 16S rDNA and repetitive conserved DNA regions have boosted the accuracy of species identification. This research is a phylogenetic study of diazotrophic bacteria from sabiá (Mimosa caesalpiniifolia Benth.), inoculated with soils from five municipalities of the Brazilian Northeast. After bacterial isolation and morphophysiological characterization, genotyping was performed using REP, ERIC and BOX oligonucleotides and 16S rDNA sequencing for genetic diversity identification. A 1.5 kb fragment of the 16S rDNA was amplified from each isolate. Morphophysiological characterization of the 47 isolates created a dendrogram, where isolate PE-GR02 formed a monophyletic branch. The fingerprinting conducted with BOX, ERIC and REP showed distinct patterns, and their compilation created a dendrogram with diverse groups; blasting in GenBank resulted in genetic identities ranging from 77 to 99% with Burkholderia strains. The 16S rDNA phylogenetic tree constructed with these isolates and GenBank deposits of strains recommended for inoculant production confirms that these isolates are distinct from the previously deposited strains, whereas isolates PE-CR02, PE-CR4, PE-CR07, PE-CR09 and PE-GE06 were the most distinct within the group. Morphophysiological characterization and BOX, ERIC and REP compilation enhanced the discrimination of the isolates, and the 16S rDNA sequences compared with GenBank confirmed the preference of Mimosa for Burkholderia diazotrophic bacteria.
Ali, M. Rahmat; Alam, A. S. M. Rubayet Ul; Amin, M. Al; Ullah, Huzzat; Siddique, Mohammad Anwar; Momtaz, Samina; Sultana, Munawar
2017-01-01
ABSTRACT The complete genome sequence of foot-and-mouth disease virus (FMDV) serotype Asia1 isolated from Bangladesh is reported here. Genome analysis revealed amino acid substitutions in the VP1 antigenic region and deletions in both the 5′ and 3′ untranslated regions (UTRs) compared to the genome of the existing vaccine strain (GenBank accession no. AY304994). PMID:29074654
Karaushu, E. V.; Kravzova, T. R.; Vorobey, N. A.; Kiriziy, D. A.; Olkhovich, O. P.; Taran, N. Yu.; Kots, S. Ya.; Omarova, E.
2015-01-01
Seed inoculation with bacterial consortium was found to increase legume yield, providing a higher growth than the standard nitrogen treatment methods. Alfalfa plants were inoculated by mono- and binary compositions of nitrogen-fixing microorganisms. Their physiological and biochemical properties were estimated. Inoculation by microbial consortium of Sinorhizobium meliloti T17 together with a new cyanobacterial isolate Nostoc PTV was more efficient than the single-rhizobium strain inoculation. This treatment provides an intensification of the processes of biological nitrogen fixation by rhizobia bacteria in the root nodules and an intensification of plant photosynthesis. Inoculation by bacterial consortium stimulates growth of plant mass and rhizogenesis and leads to increased productivity of alfalfa and to improving the amino acid composition of plant leaves. The full nucleotide sequence of the rRNA gene cluster and partial sequence of the dinitrogenase reductase (nifH) gene of Nostoc PTV were deposited to GenBank (JQ259185.1, JQ259186.1). Comparison of these gene sequences of Nostoc PTV with all sequences present at the GenBank shows that this cyanobacterial strain does not have 100% identity with any organisms investigated previously. Phylogenetic analysis showed that this cyanobacterium clustered with high credibility values with Nostoc muscorum. PMID:26114100
Karaushu, E V; Lazebnaya, I V; Kravzova, T R; Vorobey, N A; Lazebny, O E; Kiriziy, D A; Olkhovich, O P; Taran, N Yu; Kots, S Ya; Popova, A A; Omarova, E; Koksharova, O A
2015-01-01
Seed inoculation with bacterial consortium was found to increase legume yield, providing a higher growth than the standard nitrogen treatment methods. Alfalfa plants were inoculated by mono- and binary compositions of nitrogen-fixing microorganisms. Their physiological and biochemical properties were estimated. Inoculation by microbial consortium of Sinorhizobium meliloti T17 together with a new cyanobacterial isolate Nostoc PTV was more efficient than the single-rhizobium strain inoculation. This treatment provides an intensification of the processes of biological nitrogen fixation by rhizobia bacteria in the root nodules and an intensification of plant photosynthesis. Inoculation by bacterial consortium stimulates growth of plant mass and rhizogenesis and leads to increased productivity of alfalfa and to improving the amino acid composition of plant leaves. The full nucleotide sequence of the rRNA gene cluster and partial sequence of the dinitrogenase reductase (nifH) gene of Nostoc PTV were deposited to GenBank (JQ259185.1, JQ259186.1). Comparison of these gene sequences of Nostoc PTV with all sequences present at the GenBank shows that this cyanobacterial strain does not have 100% identity with any organisms investigated previously. Phylogenetic analysis showed that this cyanobacterium clustered with high credibility values with Nostoc muscorum.
Inventory of high-abundance mRNAs in skeletal muscle of normal men.
Welle, S; Bhatt, K; Thornton, C A
1999-05-01
The serial analysis of gene expression (SAGE) method was used to generate a catalog of 53,875 short (14 base) expressed sequence tags from polyadenylated RNA obtained from vastus lateralis muscle of healthy young men. Over 12,000 unique tags were detected. The frequency of occurrence of each tag reflects the relative abundance of the corresponding mRNA. The mRNA species that were detected 10 or more times, each comprising ≥0.02% of the mRNA population, accounted for 64% of the mRNA mass but <10% of the total number of mRNA species detected. Almost all of the abundant tags matched mRNA or EST sequences cataloged in GenBank. Mitochondrial transcripts accounted for approximately 20% of the polyadenylated RNA. Transcripts encoding proteins of the myofibrils were the most abundant nuclear-encoded mRNAs. Transcripts encoding ribosomal proteins, and those encoding proteins involved in energy metabolism, also were very abundant. The database can be used as a reference for investigations of alterations in gene expression associated with conditions that influence muscle function, such as muscular dystrophies, aging, and exercise.
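The tag-counting step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy tags are hypothetical and are not taken from the study, and the real SAGE pipeline involves tag extraction and error filtering that are not shown here.

```python
from collections import Counter

def tag_abundance(tags):
    """Return {tag: (count, percent_of_all_tags)} for a list of SAGE tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: (n, 100.0 * n / total) for tag, n in counts.items()}

def abundant_tags(tags, min_count=10):
    """Keep tags seen at least min_count times (the abstract uses a cutoff of 10)."""
    return {t: v for t, v in tag_abundance(tags).items() if v[0] >= min_count}

# Toy data: hypothetical 14-base tags, not taken from the study.
tags = (["ACGTACGTACGTAC"] * 12
        + ["TTTTGGGGCCCCAA"] * 3
        + ["GATTACAGATTACA"] * 5)
result = abundant_tags(tags)

# A tag seen 10 times among 53,875 tags is about 0.0186% of the total,
# which is consistent with the abstract's ">= 0.02%" after rounding.
threshold_percent = 100.0 * 10 / 53875
```

In the toy data only the first tag passes the cutoff, and the last line shows why a count of 10 corresponds to roughly 0.02% of a 53,875-tag catalog.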
The Diversity Present in 5140 Human Mitochondrial Genomes
Pereira, Luísa; Freitas, Fernando; Fernandes, Verónica; Pereira, Joana B.; Costa, Marta D.; Costa, Stephanie; Máximo, Valdemar; Macaulay, Vincent; Rocha, Ricardo; Samuels, David C.
2009-01-01
We analyzed the current status (as of the end of August 2008) of human mitochondrial genomes deposited in GenBank, amounting to 5140 complete or coding-region sequences, in order to present an overall picture of the diversity present in the mitochondrial DNA of the global human population. To perform this task, we developed mtDNA-GeneSyn, a computer tool that identifies and exhaustively classifies the diversity present in large genetic data sets. The diversity observed in the 5140 human mitochondrial genomes was compared with all possible transitions and transversions from the standard human mitochondrial reference genome. This comparison showed that tRNA and rRNA secondary structures have a large effect in limiting the diversity of the human mitochondrial sequences, whereas for the protein-coding genes there is a bias toward less variation at the second codon positions. The analysis of the observed amino acid variations showed a tolerance of variations that convert between the amino acids V, I, A, M, and T. This defines a group of amino acids with similar chemical properties that can interconvert by a single transition. PMID:19426953
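The comparison of observed diversity against all possible transitions and transversions can be illustrated with a minimal Python sketch. It is not the mtDNA-GeneSyn implementation; the function names and the `(position, alt)` input format are assumptions made for this example, and the four-base reference is a toy sequence.

```python
BASES = set("ACGT")
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or ref not in BASES or alt not in BASES:
        raise ValueError("expected two different bases from A, C, G, T")
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def all_possible_substitutions(reference):
    """Yield (position, ref, alt, kind) for every single-base change (1-based)."""
    for pos, ref in enumerate(reference.upper(), start=1):
        if ref not in BASES:
            continue  # skip ambiguity codes such as N
        for alt in "ACGT":
            if alt != ref:
                yield pos, ref, alt, classify_substitution(ref, alt)

def observed_fraction(reference, observed):
    """Fraction of possible changes of each kind that appear in `observed`.

    `observed` is a set of (position, alt) pairs (a hypothetical input format).
    """
    possible = {"transition": 0, "transversion": 0}
    seen = {"transition": 0, "transversion": 0}
    for pos, _ref, alt, kind in all_possible_substitutions(reference):
        possible[kind] += 1
        if (pos, alt) in observed:
            seen[kind] += 1
    return {k: (seen[k] / possible[k] if possible[k] else 0.0) for k in possible}

# Toy example: A1G is a transition, C2A is a transversion.
fractions = observed_fraction("ACGT", {(1, "G"), (2, "A")})
```

Each position allows one transition and two transversions, so the toy reference has 4 possible transitions and 8 possible transversions, of which one each is observed.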
Harnessing mtDNA variation to resolve ambiguity in ‘Redfish’ sold in Europe
Moore, Lauren; Pampoulie, Christophe; Di Muri, Cristina; Vandamme, Sara; Mariani, Stefano
2017-01-01
Morphology-based identification of North Atlantic Sebastes has long been controversial and misidentification may produce misleading data, with cascading consequences that negatively affect fisheries management and seafood labelling. North Atlantic Sebastes comprises four species, commonly known as ‘redfish’, but little is known about the number, identity and labelling accuracy of redfish species sold across Europe. We used a molecular approach to identify redfish species from ‘blind’ specimens to evaluate the performance of the Barcode of Life (BOLD) and GenBank databases, and carried out a market product accuracy survey of retailers across Europe. The conventional BOLD approach proved ambiguous, and phylogenetic analysis based on mtDNA control region sequences provided a higher resolution for species identification. By sampling market products from four countries, we found the presence of two species of redfish (S. norvegicus and S. mentella) and one unidentified Pacific rockfish marketed in Europe. Furthermore, public databases revealed the existence of inaccurate reference sequences, likely stemming from species misidentification in previous studies, which currently hinders the efficacy of DNA methods for the identification of Sebastes market samples. PMID:29018597
Database resources of the National Center for Biotechnology
Wheeler, David L.; Church, Deanna M.; Federhen, Scott; Lash, Alex E.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Tatusova, Tatiana A.; Wagner, Lukas
2003-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, Reference Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:12519941
Purushe, Janaki; Fouts, Derrick E; Morrison, Mark; White, Bryan A; Mackie, Roderick I; Coutinho, Pedro M; Henrissat, Bernard; Nelson, Karen E
2010-11-01
The Prevotellas comprise a diverse group of bacteria that has received surprisingly limited attention at the whole genome-sequencing level. In this communication, we present the comparative analysis of the genomes of Prevotella ruminicola 23 (GenBank: CP002006) and Prevotella bryantii B(1)4 (GenBank: ADWO00000000), two gastrointestinal isolates. Both P. ruminicola and P. bryantii have acquired an extensive repertoire of glycoside hydrolases that are targeted towards non-cellulosic polysaccharides, especially GH43 bifunctional enzymes. Our analysis demonstrates the diversity of this genus. The results from these analyses highlight their role in the gastrointestinal tract, and provide a template for additional work on genetic characterization of these species.
Shayan, P; Jafari, S; Fattahi, R; Ebrahimzade, E; Amininia, N; Changizi, E
2016-05-01
Ovine theileriosis is an important hemoprotozoal disease of sheep and goats in tropical and subtropical regions which causes high economic losses in the livestock industry. Theileria annulata surface protein (TaSp) was used previously as a tool for serological analysis in livestock. Since the amino acid sequence of TaSp is, at least in part, highly conserved in T. annulata, Theileria lestoquardi and Theileria china I and II, it is very important to determine the amino acid sequence of this protein in Theileria ovis as well, to avoid false interpretation of serological data based on this protein in small animals. In the present study, the nucleotide sequence and amino acid sequence of T. ovis surface protein (ToSp) were determined. The comparison of the nucleotide sequence of ToSp showed 96, 96, 99, and 86% homology to the corresponding nucleotide sequences of TaSp genes of T. annulata, T. China I, T. China II and T. lestoquardi, previously registered in GenBank under accession nos. AJ316260.1, AY274329.1, DQ120058.1, and EF092924.1, respectively. The amino acid sequence analysis showed 95, 81, 98 and 70% homology to the corresponding amino acid sequences of T. annulata, T. China I, T. China II and T. lestoquardi, registered in GenBank under accession nos. CAC87478.1, AAP36993.1, AAZ30365.1 and AAP36999.11, respectively. Interestingly, in contrast to the C terminus, a significant difference in amino acid sequence in the N terminus of the ToSp protein could be determined compared to the other known corresponding TaSp sequences, which makes this region attractive for designing a suitable tool for serological diagnosis.
Hause, Anne M.; Henke, David M.; Avadhanula, Vasanthi; Shaw, Chad A.; Tapia, Lorena I.
2017-01-01
Background The fusion (F) protein of RSV is the major vaccine target. This protein undergoes a conformational change from pre-fusion to post-fusion. Both conformations share antigenic sites II and IV. Pre-fusion F has unique antigenic sites p27, ø, α2α3β3β4, and MPE8; whereas, post-fusion F has unique antigenic site I. Our objective was to determine the antigenic variability for RSV/A and RSV/B isolates from contemporary and historical genotypes compared to a historical RSV/A strain. Methods The F sequences of isolates from GenBank, Houston, and Chile (N = 1,090) were used for this analysis. Sequences were compared pair-wise to a reference sequence, a historical RSV/A Long strain. Variability (calculated as %) was defined as changes at each amino acid (aa) position when compared to the reference sequence. Only aa at antigenic sites with variability ≥5% were reported. Results A total of 1,090 sequences (822 RSV/A and 268 RSV/B) were analyzed. When compared to the reference F, those domains with the greatest number of non-synonymous changes included the signal peptide, p27, heptad repeat domain 2, antigenic site ø, and the transmembrane domain. RSV/A subgroup had 7 aa changes in the antigenic sites: site I (N = 1), II (N = 1), p27 (N = 4), α2α3β3β4(AM14) (N = 1), ranging in frequency from 7–91%. In comparison, RSV/B had 19 aa changes in antigenic sites: I (N = 3), II (N = 1), p27 (N = 9), ø (N = 4), α2α3β3β4(AM14) (N = 1), and MPE8 (N = 1), ranging in frequency from 79–100%. Discussion Although antigenic sites of RSV F are generally well conserved, differences are observed when comparing the two subgroups to the reference RSV/A Long strain. Further, these discrepancies are accented in the antigenic sites in pre-fusion F of RSV/B isolates, often occurring with a frequency of 100%. This could be of importance if a monovalent F protein from the historical GA1 genotype of RSV/A is used for vaccine development. PMID:28414749
Hause, Anne M; Henke, David M; Avadhanula, Vasanthi; Shaw, Chad A; Tapia, Lorena I; Piedra, Pedro A
2017-01-01
The fusion (F) protein of RSV is the major vaccine target. This protein undergoes a conformational change from pre-fusion to post-fusion. Both conformations share antigenic sites II and IV. Pre-fusion F has unique antigenic sites p27, ø, α2α3β3β4, and MPE8; whereas, post-fusion F has unique antigenic site I. Our objective was to determine the antigenic variability for RSV/A and RSV/B isolates from contemporary and historical genotypes compared to a historical RSV/A strain. The F sequences of isolates from GenBank, Houston, and Chile (N = 1,090) were used for this analysis. Sequences were compared pair-wise to a reference sequence, a historical RSV/A Long strain. Variability (calculated as %) was defined as changes at each amino acid (aa) position when compared to the reference sequence. Only aa at antigenic sites with variability ≥5% were reported. A total of 1,090 sequences (822 RSV/A and 268 RSV/B) were analyzed. When compared to the reference F, those domains with the greatest number of non-synonymous changes included the signal peptide, p27, heptad repeat domain 2, antigenic site ø, and the transmembrane domain. RSV/A subgroup had 7 aa changes in the antigenic sites: site I (N = 1), II (N = 1), p27 (N = 4), α2α3β3β4(AM14) (N = 1), ranging in frequency from 7-91%. In comparison, RSV/B had 19 aa changes in antigenic sites: I (N = 3), II (N = 1), p27 (N = 9), ø (N = 4), α2α3β3β4(AM14) (N = 1), and MPE8 (N = 1), ranging in frequency from 79-100%. Although antigenic sites of RSV F are generally well conserved, differences are observed when comparing the two subgroups to the reference RSV/A Long strain. Further, these discrepancies are accented in the antigenic sites in pre-fusion F of RSV/B isolates, often occurring with a frequency of 100%. This could be of importance if a monovalent F protein from the historical GA1 genotype of RSV/A is used for vaccine development.
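The variability measure defined in this abstract (percentage of sequences that differ from the reference at each amino acid position, reported only at antigenic sites with variability ≥5%) can be sketched as follows. This is a minimal illustration, assuming the sequences are already aligned to the reference and equal in length; the function names, the toy peptide and the site positions are hypothetical and do not correspond to real RSV F coordinates.

```python
def positional_variability(reference, sequences):
    """Percent of sequences differing from the reference at each position.

    Assumes every sequence is pre-aligned to the reference (same length).
    Returns {1-based position: percent of sequences that differ}.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference length")
    n = len(sequences)
    variability = {}
    for i, ref_aa in enumerate(reference):
        changed = sum(1 for seq in sequences if seq[i] != ref_aa)
        variability[i + 1] = 100.0 * changed / n
    return variability

def variable_sites(reference, sequences, sites, cutoff=5.0):
    """Keep positions inside `sites` whose variability is >= cutoff percent."""
    var = positional_variability(reference, sequences)
    return {pos: pct for pos, pct in var.items()
            if pos in sites and pct >= cutoff}

# Toy data: 20 hypothetical isolates of a 7-residue peptide.
reference = "MELLIHK"
isolates = ["MELLIHK"] * 18 + ["MELPIHK", "MDLPIHK"]
# Positions 2, 4 and 5 stand in for antigenic-site residues.
hits = variable_sites(reference, isolates, sites={2, 4, 5})
```

With these toy inputs, position 2 differs in 1 of 20 isolates (5%) and position 4 in 2 of 20 (10%), so both are reported, while the invariant position 5 is dropped.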
Microsatellite Markers for Raspberries and Blackberries
USDA-ARS's Scientific Manuscript database
Twelve microsatellites were isolated from SSR-enriched genomic libraries of Rubus idaeus L.‘Meeker’ red raspberry (diploid) and R. loganobaccus L. H. Bailey ‘Marion’ blackberry-raspberry hybrid (hexaploid). These primer pairs, with the addition of one developed from a GenBank R. idaeus sequence, w...
Submission of nucleotide sequence of Eimeria acervulina profilin to GenBank database
USDA-ARS's Scientific Manuscript database
Poultry coccidiosis, caused by intestinal protozoa Eimeria, is a severe problem for the poultry industry, leading to a substantial economic burden of over three billion dollars worldwide. Conventional vaccines including live vaccines and attenuated vaccines could cause mild to severe reactions Numer...
Amelia, Tan Suet May; Amirul, Al-Ashraf Abdullah; Bhubalan, Kesaven
2018-02-01
We report data associated with the identification of three polyhydroxyalkanoate synthase genes (phaC) isolated from the marine bacteria metagenome of Aaptos aaptos marine sponge in the waters of Bidong Island, Terengganu, Malaysia. Our data describe the extraction of bacterial metagenome from sponge tissue, measurement of purity and concentration of extracted metagenome, polymerase chain reaction (PCR)-mediated amplification using degenerate primers targeting Class I and II phaC genes, sequencing at First BASE Laboratories Sdn Bhd, and phylogenetic analysis of identified and known phaC genes. The partial nucleotide sequences were aligned, refined, compared with the Basic Local Alignment Search Tool (BLAST) databases, and released online in GenBank. The data include the identified partial putative phaC and their GenBank accession numbers, which are Rhodocista sp. phaC (MF457754), Pseudomonas sp. phaC (MF437016), and an uncultured bacterium AR5-9d_16 phaC (MF457753).
Zhang, Ziqi; Sun, Tong; Kang, Chunlan; Liu, Yang; Liu, Shaoying; Yue, Bisong; Zeng, Tao
2016-01-01
The complete mitochondrial genome sequence of Cricetulus longicaudatus (Rodentia: Cricetidae: Cricetinae) was determined and deposited in GenBank (GenBank accession no. KM067270). The mitochondrial genome of C. longicaudatus was 16,302 bp in length and contained 13 protein-coding genes, 2 ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes and one control region, in an order identical to that of other rodents' mitochondrial genomes. The phylogenetic analysis was performed with Bayesian inference based on the concatenated nucleotide sequence of 12 protein-coding genes on the heavy strand. The result showed that the species from Cricetidae and its two subfamilies (Cricetinae and Arvicolinae) each formed a solid monophyletic group. Cricetulus had a close phylogenetic relationship with Tscherskia among three genera (Cricetulus, Tscherskia and Mesocricetus). Neodon irene and Myodes regulus were embedded in Microtus and Eothenomys, respectively. The unusual phylogenetic positions of Neodon irene and Myodes regulus require further study.
Database resources of the National Center for Biotechnology Information: 2002 update
Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.
2002-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:11752242
Song, Yang; Zhang, Yong; Fan, Qin; Cui, Hui; Yan, Dongmei; Zhu, Shuangli; Tang, Haishu; Sun, Qiang; Wang, Dongyan; Xu, Wenbo
2017-02-23
Human enterovirus B106 (EV-B106) is a new member of the enterovirus B species. To date, only three nucleotide sequences of EV-B106 have been published, and only one full-length genome sequence (the Yunnan strain 148/YN/CHN/12) is available in the GenBank database. In this study, we conducted phylogenetic characterisation of four EV-B106 strains isolated in Xinjiang, China. Pairwise comparisons of the nucleotide sequences and the deduced amino acid sequences revealed that the four Xinjiang EV-B106 strains had only 80.5-80.8% nucleotide identity and 95.4-97.3% amino acid identity with the Yunnan EV-B106 strain, indicating high mutagenicity. Similarity plots and bootscanning analyses revealed that frequent intertypic recombination occurred in all four Xinjiang EV-B106 strains in the non-structural region. These four strains may share a donor sequence with the EV-B85 strain, which circulated in Xinjiang in 2011, indicating extensive genetic exchanges between these strains. All Xinjiang EV-B106 strains were temperature-sensitive. An antibody seroprevalence study against EV-B106 in two Xinjiang prefectures also showed low titres of neutralizing antibodies, suggesting limited exposure and transmission in the population. This study contributes the whole genome sequences of EV-B106 to the GenBank database and provides valuable information regarding the molecular epidemiology of EV-B106 in China.
Song, Yang; Zhang, Yong; Fan, Qin; Cui, Hui; Yan, Dongmei; Zhu, Shuangli; Tang, Haishu; Sun, Qiang; Wang, Dongyan; Xu, Wenbo
2017-01-01
Human enterovirus B106 (EV-B106) is a new member of the enterovirus B species. To date, only three nucleotide sequences of EV-B106 have been published, and only one full-length genome sequence (the Yunnan strain 148/YN/CHN/12) is available in the GenBank database. In this study, we conducted phylogenetic characterisation of four EV-B106 strains isolated in Xinjiang, China. Pairwise comparisons of the nucleotide sequences and the deduced amino acid sequences revealed that the four Xinjiang EV-B106 strains had only 80.5–80.8% nucleotide identity and 95.4–97.3% amino acid identity with the Yunnan EV-B106 strain, indicating high mutagenicity. Similarity plots and bootscanning analyses revealed that frequent intertypic recombination occurred in all four Xinjiang EV-B106 strains in the non-structural region. These four strains may share a donor sequence with the EV-B85 strain, which circulated in Xinjiang in 2011, indicating extensive genetic exchanges between these strains. All Xinjiang EV-B106 strains were temperature-sensitive. An antibody seroprevalence study against EV-B106 in two Xinjiang prefectures also showed low titres of neutralizing antibodies, suggesting limited exposure and transmission in the population. This study contributes the whole genome sequences of EV-B106 to the GenBank database and provides valuable information regarding the molecular epidemiology of EV-B106 in China. PMID:28230168
Detection of Bacillus anthracis DNA in Complex Soil and Air Samples Using Next-Generation Sequencing
Be, Nicholas A.; Thissen, James B.; Gardner, Shea N.; McLoughlin, Kevin S.; Fofanov, Viacheslav Y.; Koshinsky, Heather; Ellingson, Sally R.; Brettin, Thomas S.; Jackson, Paul J.; Jaing, Crystal J.
2013-01-01
Bacillus anthracis is the potentially lethal etiologic agent of anthrax disease, and is a significant concern in the realm of biodefense. One of the cornerstones of an effective biodefense strategy is the ability to detect infectious agents with a high degree of sensitivity and specificity in the context of a complex sample background. The nature of the B. anthracis genome, however, renders specific detection difficult, due to close homology with B. cereus and B. thuringiensis. We therefore elected to determine the efficacy of next-generation sequencing analysis and microarrays for detection of B. anthracis in an environmental background. We applied next-generation sequencing to titrated genome copy numbers of B. anthracis in the presence of background nucleic acid extracted from aerosol and soil samples. We found next-generation sequencing to be capable of detecting as few as 10 genomic equivalents of B. anthracis DNA per nanogram of background nucleic acid. Detection was accomplished by mapping reads to either a defined subset of reference genomes or to the full GenBank database. Moreover, sequence data obtained from B. anthracis could be reliably distinguished from sequence data mapping to either B. cereus or B. thuringiensis. We also demonstrated the efficacy of a microbial census microarray in detecting B. anthracis in the same samples, representing a cost-effective and high-throughput approach, complementary to next-generation sequencing. Our results, in combination with the capacity of sequencing for providing insights into the genomic characteristics of complex and novel organisms, suggest that these platforms should be considered important components of a biosurveillance strategy. PMID:24039948
Genetic Diversity and Phylogenetic Evolution of Tibetan Sheep Based on mtDNA D-Loop Sequences
Yue, Yaojing; Guo, Xian; Guo, Tingting; Chu, Min; Wang, Fan; Han, Jilong; Feng, Ruilin; Sun, Xiaoping; Niu, Chune; Yang, Bohui; Guo, Jian; Yuan, Chao
2016-01-01
The molecular and population genetic evidence of the phylogenetic status of the Tibetan sheep (Ovis aries) is not well understood, and little is known about this species’ genetic diversity. This knowledge gap is partly due to the difficulty of sample collection. This is the first work to address this question. Here, the genetic diversity and phylogenetic relationship of 636 individual Tibetan sheep from fifteen populations were assessed using 642 complete sequences of the mitochondrial DNA D-loop. Samples were collected from the Qinghai-Tibetan Plateau area in China, and reference data were obtained from the six reference breed sequences available in GenBank. The length of the sequences varied considerably, between 1031 and 1259 bp. The haplotype diversity and nucleotide diversity were 0.992±0.010 and 0.019±0.001, respectively. The average number of nucleotide differences was 19.635. The mean nucleotide composition of the 350 haplotypes was 32.961% A, 29.708% T, 22.892% C, 14.439% G, 62.669% A+T, and 37.331% G+C. Phylogenetic analysis showed that all four previously defined haplogroups (A, B, C, and D) were found in the 636 individuals of the fifteen Tibetan sheep populations but that only the D haplogroup was found in Linzhou sheep. Further, the clustering analysis divided the fifteen Tibetan sheep populations into at least two clusters. The estimation of the demographic parameters from the mismatch analyses showed that haplogroups A, B, and C had at least one demographic expansion in Tibetan sheep. These results contribute to the knowledge of Tibetan sheep populations and will help inform future conservation programs about the Tibetan sheep native to the Qinghai-Tibetan Plateau. PMID:27463976
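The haplotype diversity and nucleotide composition figures quoted above can be illustrated with a minimal Python sketch. It assumes Nei's standard estimator of haplotype diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies), and pools bases across all sequences; the abstract does not state which software or estimator the authors used, and its "mean nucleotide composition" may have been averaged per haplotype instead. The function names and the toy sequences are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum(p_i^2)).

    Assumption: each element is one individual's haplotype sequence,
    and identical strings count as the same haplotype.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def base_composition(sequences):
    """Percentage of A, T, C and G pooled over all sequences.

    Symbols other than A, T, C, G (gaps, ambiguity codes) are ignored.
    """
    counts = Counter(
        base for seq in sequences for base in seq.upper() if base in "ATCG"
    )
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


# Toy data for illustration only (not from the study).
toy = ["ATTA", "ATTA", "ATCA", "GTCA"]
print(haplotype_diversity(toy))
print(base_composition(toy))
```

As a consistency check on the reported values, the four base percentages in the abstract sum to 100% and A plus T gives the stated 62.669%.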
High-resolution phylogenetic microbial community profiling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Singer, Esther; Coleman-Derr, Devin; Bowman, Brett
2014-03-17
The representation of bacterial and archaeal genome sequences is strongly biased towards cultivated organisms, which belong to merely four phylogenetic groups. Functional information and inter-phylum level relationships are still largely underexplored for candidate phyla, which are often referred to as microbial dark matter. Furthermore, a large portion of the 16S rRNA gene records in the GenBank database are labeled as environmental samples and unclassified, which is in part due to low read accuracy, potential chimeric sequences produced during PCR amplifications and the low resolution of short amplicons. In order to improve the phylogenetic classification of novel species and advance our knowledge of the ecosystem function of uncultivated microorganisms, high-throughput full length 16S rRNA gene sequencing methodologies with reduced biases are needed. We evaluated the performance of PacBio single-molecule real-time (SMRT) sequencing in high-resolution phylogenetic microbial community profiling. For this purpose, we compared PacBio and Illumina metagenomic shotgun and 16S rRNA gene sequencing of a mock community as well as of an environmental sample from Sakinaw Lake, British Columbia. Sakinaw Lake is known to contain a large percentage of microbial species from candidate phyla. Sequencing results show that community structure based on PacBio shotgun and 16S rRNA gene sequences is highly similar in both the mock and the environmental communities. Resolution power and community representation accuracy from SMRT sequencing data appeared to be independent of GC content of microbial genomes and was higher when compared to Illumina-based metagenome shotgun and 16S rRNA gene (iTag) sequences, e.g. full-length sequencing resolved all 23 OTUs in the mock community, while iTags did not resolve closely related species. SMRT sequencing hence offers various potential benefits when characterizing uncharted microbial communities.
Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan
2016-06-01
Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids.
Viputtigul, Kwanjai; Tungpukdee, Noppadon; Ruangareerate, Toon; Luplertlop, Natthanej; Wilairatana, Polrat; Gaywee, Jariyanart; Krudsood, Srivicha
2013-01-01
This study was undertaken to ascertain the extent of polymorphism in the C-terminal region of Plasmodium falciparum merozoite surface protein (MSP-1) from 119 malaria patients in Tak Province on the western border of Thailand, who were admitted to the Bangkok Hospital for Tropical Diseases, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand. P. falciparum infection was confirmed by microscopic examination of peripheral blood smears. Clinical manifestations were categorized into 2 groups: uncomplicated (94 cases) and complicated/severe (25 cases). A 1,040 basepair fragment of P. falciparum MSP-1 gene was compared with MSP-1 of reference strains retrieved from GenBank. The consensus sequences of MSP-1 block 16 showed it belonged to MAD20 genotype, which is the major allele of falciparum malaria from the western border of Thailand. MSP-1 block 16 amino acid fragment could be separated into 2 groups: similar and dissimilar to reference sequence. Four variations in MSP-1 block 16 were -1494K, D1510G, D1556N, and K1696I. MSP-1 block 16 diversity is not significantly associated with clinical manifestation although MAD20 genotype is the predominant genotype in this area. The genetic data of MSP-1 gene of falciparum malaria isolated from western Thai border contribute to the existing genetic database of Thai P. falciparum strain.
Blastocystis phylogeny among various isolates from humans to insects.
Yoshikawa, Hisao; Koyama, Yukiko; Tsuchiya, Erika; Takami, Kazutoshi
2016-12-01
Blastocystis is a common unicellular eukaryotic parasite found not only in humans, but also in various kinds of animal species worldwide. Since Blastocystis isolates are morphologically indistinguishable, many molecular biological approaches have been applied to classify these isolates. The complete or partial sequences of the small subunit rRNA gene (SSU rDNA) are mainly used for comparisons and phylogenetic analyses among Blastocystis isolates. However, various lengths of the partial SSU rDNA sequence have been used for phylogenetic inference among genetically different isolates. Based on the complete SSU rDNA sequences, consensus terminology of nine subtypes (STs) of Blastocystis sp. that were supported by nine phylogenetically monophyletic clades was proposed in 2007. Thereafter, eight additional kinds of STs comprising non-human mammalian Blastocystis isolates have been reported based on the phylogeny of SSU rDNA sequences, while STs 11 and 12 were only proposed on the basis of partial sequences. Although many sequence data from mammalian and avian Blastocystis are registered in GenBank, only limited data on SSU rDNA are available for poikilotherm-derived Blastocystis isolates. Therefore, the phylogenetic positions of the reptilian/amphibian Blastocystis clades are unstable. The phylogenetic inference of various STs comprising mammalian and/or avian Blastocystis isolates was verified herein based on comparisons between partial and complete SSU rDNA sequences, and the phylogenetic positions of reptilian and amphibian Blastocystis isolates were also investigated using 14 new Blastocystis isolates from reptiles with all known isolates from other reptilians, amphibians, and insects registered in GenBank.
Submission of nucleotide sequence Clostridium perfringens NetB toxin to GenBank database
USDA-ARS's Scientific Manuscript database
Clostridium perfringens can cause gas gangrene and food poisoning in humans and causes several enterotoxemic diseases in animals including avian necrotic enteritis. This disease affects all chicken producing countries worldwide and is a considerable burden on the commercial chicken production indus...
Submission of nucleotide sequence chicken IL-7 to GenBank database
USDA-ARS's Scientific Manuscript database
Mammalian interleukin-7 (IL-7) is able to stimulate lymphocyte proliferation and maturation, and reverse immuno-suppression. However, whether poultry IL-7 has similar functions remains unclear. Chicken IL-7 promoted mouse B cell proliferation in vitro, and significantly reduced virus titer in bursal...
2013-01-01
identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a...tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85...improve effectiveness of pesticide application for control of the new world sand fly Lutzomyia longipalpis in chicken sheds [13]. Attempts to control
K.D. Jermstad; L.A. Sheppard; B.B. Kinloch; A. Delfino-Mix; E.S. Ersoz; K.V. Krutovsky; D.B Neale
2006-01-01
The nucleotide-binding-site and leucine-rich-repeat (NBS-LRR) class of R proteins is abundant and widely distributed in plants. By using degenerate primers designed on the NBS domain in lettuce, we amplified sequences in sugar pine that shared sequence identity with many of the NBS-LRR class resistance genes catalogued in GenBank. The polymerase chain reaction products...
Dees, Merete Wiken; Brurberg, May Bente; Lysøe, Erik
2017-03-01
The genus Microbacterium contains bacteria that are ubiquitously distributed in various environments and includes plant-associated bacteria that are able to colonize tissue of agricultural crop plants. Here, we report the 3,508,491 bp complete genome sequence of Microbacterium sp. strain BH-3-3-3, isolated from conventionally grown lettuce ( Lactuca sativa ) from a field in Vestfold, Norway. The nucleotide sequence of this genome was deposited into NCBI GenBank under the accession CP017674.
Attachment of Asaia bogorensis Originating in Fruit-Flavored Water to Packaging Materials
Otlewska, Anna; Antolak, Hubert
2014-01-01
The objective of this study was to investigate the adhesion of isolated spoilage bacteria to packaging materials used in the food industry. Microorganisms were isolated from commercial fruit-flavored mineral water in plastic bottles with flocks as a visual defect. The Gram-negative rods were identified using the molecular method through the amplification of a partial region of the 16S rRNA gene. Based on the sequence identity (99.6%) between the spoilage organism and a reference strain deposited in GenBank, the spoilage isolate was identified as Asaia bogorensis. Experiments on bacterial adhesion were conducted using plates made of glass and polystyrene (packaging materials commonly used in the beverage industry). Cell adhesion ability was determined using luminometry, plate count, and the microscopic method. The strain of A. bogorensis was characterized by strong adhesion properties which were dependent on the surface type, with the highest cell adhesion detected on polystyrene. PMID:25295262
Consistency of gene starts among Burkholderia genomes
2011-01-01
Background: Evolutionary divergence in the position of the translational start site among orthologous genes can have significant functional impacts. Divergence can alter the translation rate, degradation rate, subcellular location, and function of the encoded proteins. Results: Existing GenBank gene maps for Burkholderia genomes suggest that extensive divergence has occurred--53% of ortholog sets based on GenBank gene maps had inconsistent gene start sites. However, most of these inconsistencies appear to be gene-calling errors. Evolutionary divergence was the most plausible explanation for only 17% of the ortholog sets. Correcting probable errors in the GenBank gene maps decreased the percentage of ortholog sets with inconsistent starts by 68%, increased the percentage of ortholog sets with extractable upstream intergenic regions by 32%, increased the sequence similarity of intergenic regions and predicted proteins, and increased the number of proteins with identifiable signal peptides. Conclusions: Our findings highlight an emerging problem in comparative genomics: single-digit percent errors in gene predictions can lead to double-digit percentages of inconsistent ortholog sets. The work demonstrates a simple approach to evaluate and improve the quality of gene maps. PMID:21342528
Draft genome sequence of Aeromonas hydrophila TN97-08
USDA-ARS's Scientific Manuscript database
Aeromonas hydrophila is an opportunistic Gram-negative species causing disease in fish and mammals. The genus Aeromonas affects a variety of aquatic organisms and lives in diverse aquatic ecosystems (1). There are 39 A. hydrophila genomes currently available in GenBank. In the current study, we repo...
Submission of nucleotide sequence Eimeria tenella elongation factor 1-alpha to GenBank database
USDA-ARS's Scientific Manuscript database
Avian coccidiosis is caused by multiple species of the apicomplexan protozoan, Eimeria, and is one of the most economically devastating enteric diseases for the poultry industry worldwide. Host immunity to Eimeria infection, however, is relatively species-specific. The ability to immunize chickens a...
Submission of nucleotide sequence Clostridium perfringens alpha-toxin to GenBank database
USDA-ARS's Scientific Manuscript database
Clostridium perfringens (CP) is ubiquitous in nature, and a normal inhabitant in the intestinal tracts of animals and humans. However, pathogenic CP is also a causative agent of poultry disease necrotic enteritis (NE). Clostridium perfringens alpha toxin is a toxin produced by the bacterium Clo...
USDA-ARS's Scientific Manuscript database
Clostridium perfringens (CP) is ubiquitous in nature, and a normal inhabitant in the intestinal tracts of animals and humans. However, pathogenic CP is also a causative agent of poultry disease necrotic enteritis (NE). Clostridium-related poultry diseases such as necrotic enteritis (NE) and gang...
Submission of nucleotide sequence Clostridium perfringens elongation factor-Tu to GenBank database
USDA-ARS's Scientific Manuscript database
Clostridium perfringens (CP) is ubiquitous in nature, and a normal inhabitant in the intestinal tracts of animals and humans. However, pathogenic CP is also a causative agent of poultry disease necrotic enteritis (NE). Clostridium-related poultry diseases such as necrotic enteritis (NE) and gang...
Benmechernene, Zineb; Fernández-No, Inmaculada; Quintela-Baluja, Marcos; Kihal, Mebrouk; Calo-Mata, Pilar; Barros-Velázquez, Jorge
2014-01-01
Information on the microbiology of camel milk is very limited. In this work, the genetic characterization and proteomic identification of 13 putative bacteriocin-producing Leuconostoc strains exhibiting antilisterial activity and isolated from camel milk were performed. DNA sequencing of the 13 selected strains revealed high homology among the 16S rRNA genes for all strains. In addition, 99% homology with Leuconostoc mesenteroides was observed when these sequences were analysed by the BLAST tool against other sequences from reference strains deposited in GenBank. Furthermore, the isolates were characterized by matrix-assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF MS), which allowed for the identification of 2 mass peaks, 6242 m/z and 5118 m/z, that proved to be specific to the species L. mesenteroides. Remarkably, the phyloproteomic tree provided more intraspecific information on L. mesenteroides than phylogenetic analysis. Accordingly, phyloproteomic analysis grouped L. mesenteroides strains into different subbranches, while all L. mesenteroides isolates were grouped in the same branch according to phylogenetic analysis. This study represents, to our knowledge, the first report on the use of MALDI-TOF MS for the identification of LAB isolated from camel milk. PMID:24809059
WANG, ZHANG-YANG; HONG, WEI-LONG; ZHU, ZHE-HUI; CHEN, YUN-HAO; YE, WEN-LE; CHU, GUANG-YU; LI, JIA-LIN; CHEN, BI-CHENG; XIA, PENG
2015-01-01
BK polyomavirus (BKV) is an important pathogen for kidney transplant recipients, as it is frequently re-activated, leading to nephropathy. The aim of this study was to investigate the phylogenetic reconstruction and polymorphism of the VP2 gene in BKV isolated from Chinese kidney transplant recipients. Phylogenetic analysis was carried out in the VP2 region from 135 BKV-positive samples and 28 reference strains retrieved from GenBank. The unweighted pair-group method with arithmetic mean (UPGMA) grouped all strains into subtypes, but failed to subdivide strains into subgroups. Among the plasma and urine samples, all plasma (23/23) and 82 urine samples (82/95) were identified to contain subtype I; the other 10 urine samples contained subtype IV. An 86-bp fragment was identified as a highly conserved sequence. Following alignment with 36 published BKV sequences from China, 92 sites of polymorphism were identified, including 11 single nucleotide polymorphisms (SNPs) prevalent in Chinese individuals and 30 SNPs that were specific to the two predominant subtypes I and IV. The limitations of the VP2 gene segment in subgrouping were confirmed by phylogenetic analysis. The conserved sequence and polymorphism identified in this study may be helpful in the detection and genotyping of BKV. PMID:26640547
Whipps, Christopher M.; El-Matbouli, M.; Hedrick, R.P.; Blazer, V.; Kent, M.L.
2004-01-01
Molecular approaches for resolving relationships among the Myxozoa have relied mainly on small subunit (SSU) ribosomal DNA (rDNA) sequence analysis. This region of the gene is generally used for higher phylogenetic studies, and the conservative nature of this gene may make it inadequate for intraspecific comparisons. Previous intraspecific studies of Myxobolus cerebralis based on molecular analyses reported that the sequence of SSU rDNA and the internal transcribed spacer (ITS) were highly conserved in representatives of the parasite from North America and Europe. Considering that the ITS is usually a more variable region than the SSU, we reanalyzed available sequences on GenBank and obtained sequences from other M. cerebralis representatives from the states of California and West Virginia in the USA and from Germany and Russia. With the exception of 7 base pairs, most of the sequence designated as ITS-1 in GenBank was a highly conserved portion of the rDNA near the 3-prime end of the SSU region. Nonetheless, the additional ITS-1 sequences obtained from the available geographic representatives were well conserved. It is unlikely that we would have observed virtually identical ITS-1 sequences between European and American M. cerebralis samples had it spread naturally over time, particularly when compared to the variation seen between isolates of another myxozoan (Kudoa thyrsites) that has most likely spread naturally. These data further support the hypothesis that the current distribution of M. cerebralis in North America is a result of recent introductions followed by dispersal via anthropogenic means, largely through the stocking of infected trout for sport fishing.
Mioduchowska, Monika; Czyż, Michał Jan; Gołdyn, Bartłomiej; Kur, Jarosław; Sell, Jerzy
2018-01-01
The cytochrome c oxidase subunit I (cox1) gene is the main mitochondrial molecular marker playing a pivotal role in phylogenetic research and is a crucial barcode sequence. Folmer's "universal" primers designed to amplify this gene in metazoan invertebrates allowed quick and easy barcode and phylogenetic analysis. On the other hand, the increase in the number of studies on barcoding leads to more frequent publishing of incorrect sequences, due to amplification of non-target taxa, and insufficient analysis of the obtained sequences. Consequently, some sequences deposited in genetic databases are incorrectly described as obtained from invertebrates, while being in fact bacterial sequences. In our study, in which we used Folmer's primers to amplify COI sequences of the crustacean fairy shrimp Branchipus schaefferi (Fischer 1834), we also obtained COI sequences of microbial contaminants from Aeromonas sp. However, when we searched the GenBank database for sequences closely matching these contaminations we found entries described as representatives of Gastrotricha and Mollusca. When these entries were compared with other sequences bearing the same names in the database, the genetic distance between the incorrect and correct sequences amplified from the same species was ca. 65%. Although the responsibility for the correct molecular identification of species rests on researchers, the errors found in already published sequence data have not been re-evaluated so far. On the basis of the standard sampling technique we have estimated with 95% probability that the chances of finding incorrectly described metazoan sequences in GenBank depend on the systematic group, and vary from less than 1% (Mollusca and Arthropoda) up to 6.9% (Gastrotricha). Consequently, the increasing popularity of DNA barcoding and metabarcoding analysis may lead to overestimation of species diversity. Finally, the study also discusses the sources of the problems with amplification of non-target sequences.
Mulcahy, Daniel G.; Vanthomme, Hadrien; Tobi, Elie; Wynn, Addison H.; Zimkus, Breda M.; McDiarmid, Roy W.
2017-01-01
Development projects in west Central Africa are proceeding at an unprecedented rate, often with little concern for their effects on biodiversity. In an attempt to better understand potential impacts of a road development project on the anuran amphibian community, we conducted a biodiversity assessment employing multiple methodologies (visual encounter transects, auditory surveys, leaf litter plots and pitfall traps) to inventory species prior to construction of a new road within the buffer zone of Moukalaba-Doudou National Park, Gabon. Because of difficulties in morphological identification and taxonomic uncertainty of amphibian species observed in the area, we integrated a DNA barcoding analysis into the project to improve the overall quality and accuracy of the species inventory. Based on morphology alone, 48 species were recognized in the field and voucher specimens of each were collected. We used tissue samples from specimens collected at our field site and material available from amphibians collected in other parts of Gabon and the Republic of Congo to initiate a DNA barcode library for west Central African amphibians. We then compared our sequences with material in GenBank for the genera recorded at the study site to assist in identifications. The resulting COI and 16S barcode library allowed us to update the number of species documented at the study site to 28, thereby providing a more accurate assessment of diversity and distributions. We caution that because sequence data maintained in GenBank are often poorly curated by the original submitters and cannot be amended by third parties, these data have limited utility for identification purposes. Nevertheless, the use of DNA barcoding is likely to benefit biodiversity inventories and long-term monitoring, particularly for taxa that can be difficult to identify based on morphology alone; likewise, inventory and monitoring programs can contribute invaluable data to the DNA barcode library and the taxonomy of complex groups. Our methods provide an example of how non-taxonomists and parataxonomists working in understudied parts of the world with limited geographic sampling and comparative morphological material can use DNA barcoding and publicly available sequence data (GenBank) to rapidly identify the number of species and assign tentative names to aid in urgent conservation management actions and contribute to taxonomic resolution. PMID:29131846
Li, Chunhua; Lu, Ling; Wu, Xianghong; Wang, Chuanxi; Bennett, Phil; Lu, Teng; Murphy, Donald
2009-08-01
In this study, we characterized the full-length genomic sequences of 13 distinct hepatitis C virus (HCV) genotype 4 isolates/subtypes: QC264/4b, QC381/4c, QC382/4d, QC193/4g, QC383/4k, QC274/4l, QC249/4m, QC97/4n, QC93/4o, QC139/4p, QC262/4q, QC384/4r and QC155/4t. These were amplified, using RT-PCR, from the sera of patients now residing in Canada, 11 of which were African immigrants. The resulting genomes varied between 9421 and 9475 nt in length and each contains a single ORF of 9018-9069 nt. The sequences showed nucleotide similarities of 77.3-84.3 % in comparison with subtypes 4a (GenBank accession no. Y11604) and 4f (EF589160) and 70.6-72.8 % in comparison with genotype 1 (M62321/1a, M58335/1b, D14853/1c, and 1?/AJ851228) reference sequences. These similarities were often higher than those currently defined by HCV classification criteria for subtype (75.0-80.0 %) and genotype (67.0-70.0 %) division, respectively. Further analyses of the complete and partial E1 and partial NS5B sequences confirmed these 13 'provisionally assigned subtypes'.
Diversity of Basidiomycetes in Michigan Agricultural Soils
Lynch, Michael D. J.; Thorn, R. Greg
2006-01-01
We analyzed the communities of soil basidiomycetes in agroecosystems that differ in tillage history at the Kellogg Biological Station Long-Term Ecological Research site near Battle Creek, Michigan. The approach combined soil DNA extraction through a bead-beating method modified to increase recovery of fungal DNA, PCR amplification with basidiomycete-specific primers, cloning and restriction fragment length polymorphism screening of mixed PCR products, and sequencing of unique clones. Much greater diversity was detected than was anticipated in this habitat on the basis of culture-based methods or surveys of fruiting bodies. With “species” defined as organisms yielding PCR products with ≥99% identity in the 5′ 650 bases of the nuclear large-subunit ribosomal DNA, 241 “species” were detected among 409 unique basidiomycete sequences recovered. Almost all major clades of basidiomycetes from basidiomycetous yeasts and other heterobasidiomycetes through polypores and euagarics (gilled mushrooms and relatives) were represented, with a majority from the latter clade. Only 24 of 241 “species” had 99% or greater sequence similarity to named reference sequences in GenBank, and several clades with multiple “species” could not be identified at the genus level by phylogenetic comparisons with named sequences. The total estimated “species” richness for this 11.2-ha site was 367 “species” of basidiomycetes. Since >99% of the study area has not been sampled, the accuracy of our diversity estimate is uncertain. Replication in time and space is required to detect additional diversity and the underlying community structure. PMID:16950900
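The operational "species" definition above (sequences sharing at least 99% identity) can be sketched in Python as a simple greedy clustering. This is an illustrative stand-in, not the authors' procedure: the abstract does not say how sequences were grouped, and the sketch assumes the sequences are already aligned to equal length and that the first sequence in each group serves as its representative. All names and the toy data are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def greedy_cluster(sequences, threshold=99.0):
    """Group sequences into operational 'species'.

    Each sequence joins the first cluster whose representative it matches
    at or above the threshold; otherwise it founds a new cluster.
    The result depends on input order, a known limitation of greedy schemes.
    """
    clusters = []  # list of (representative, members) pairs
    for seq in sequences:
        for representative, members in clusters:
            if percent_identity(seq, representative) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))
    return clusters


# Toy data for illustration only: 200-base sequences with 0, 1 and 3 mismatches.
toy = ["A" * 200, "A" * 199 + "C", "A" * 197 + "CCC"]
for representative, members in greedy_cluster(toy):
    print(len(members), "sequence(s) in cluster")
```

With a 650-base region, as in the study, a 99% threshold tolerates at most 6 mismatches between a sequence and its cluster representative.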
Brassica ASTRA: an integrated database for Brassica genomic research.
Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David
2005-01-01
Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
Gan, Han Ming; Tan, Mun Hua; Lee, Yin Peng; Austin, Christopher M
2016-05-01
The mitogenome of the Australian freshwater blackfish, Gadopsis marmoratus, was recovered by genome skimming using the MiSeq sequencer (GenBank Accession Number: NC_024436). The blackfish mitogenome has 16,407 base pairs made up of 13 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and an 819 bp non-coding AT-rich region. This is the 5th mitogenome sequence to be reported for the family Percichthyidae.
Wang, Shuo
2016-01-01
We announce here the first complete chloroplast genome sequence of the tropical japonica rice, along with its genome structure and functional annotation. The plant was collected from Indonesia and deposited as a germplasm accession of the International Rice Genebank Collection (IRGC 66630) at the International Rice Research Institute (IRRI). This genome provides valuable data for the future utilization of the germplasm of rice. PMID:26893422
2003-12-01
populations. (ii) Characterization of Dehalococcoides sp. strain FL2. The isolate, designated Dehalococcoides sp. strain FL2, reductively...Pinellas group of the Dehalococcoides cluster, and demonstrated that strain FL2 shared an identical 16S rRNA gene sequence with Dehalococcoides sp...strain CBDB1, a chlorobenzene-dechlorinating strain. The 16S rRNA gene sequence of Dehalococcoides sp. strain FL2 was submitted to GenBank (AF357918.2
Susulovska, Solomia; Castillo, Pablo; Archidona-Yuste, Antonio
2017-01-01
Seven needle nematode species of the genus Longidorus have been reported in Ukraine. Nematological surveys for needle nematodes were carried out in Ukraine between 2016 and 2017 and two nematode species of Longidorus (L. caespiticola and L. poessneckensis) were collected from natural and anthropogenically altered habitats on the territory of Opillia and Zakarpattia in Ukraine. Nematodes were extracted from 500 cm3 of soil by a modified sieving and decanting method. Extracted specimens were processed to glycerol and mounted on permanent slides and subsequently identified morphologically and molecularly. Nematode DNA was extracted from single individuals and PCR assays were conducted as previously described for D2–D3 expansion segments of 28S rRNA. Sequence alignments for D2–D3 from L. caespiticola showed 97%–99% similarity to other sequences of L. caespiticola deposited in GenBank from Belgium, Bulgaria, Czech Republic, Russia, Slovenia, and Scotland. Similarly, D2–D3 sequence alignments from L. poessneckensis showed 99% similarity to other sequences of L. poessneckensis deposited in GenBank from Slovakia and Czech Republic. Morphology, morphometry, and molecular data obtained from these samples were consistent with L. caespiticola and L. poessneckensis identification. To our knowledge, these are the first reports of L. caespiticola and L. poessneckensis in Ukraine, extending the geographical distribution of these species. PMID:29353928
Detection of signals in mRNAs that influence translation.
Brown, Chris M; Jacobs, Grant; Stockwell, Peter; Schreiber, Mark
2003-01-01
Genome sequencing efforts mean that we now have extensive data from a wide range of organisms to study. Understanding the differing natures of the biology of these organisms is an important aim of genome analysis. We are interested in signals that affect translation of mRNAs. Some signals in the mRNA influence how efficiently it is translated into protein. Previous studies have indicated that many important signals are located around the initiation and termination codons. We have developed tools described here to extract the relevant sequence regions from GenBank. To create databases organised by species, or higher taxonomic groupings (eg planta), a program was developed to dynamically view and edit the taxonomy database. Data from relevant species were then extracted using our GenBank feature table parser. We analysed all available sequences, particularly those from complete genomes. Patterns were then identified using information theory. The software is available from http://transterm.otago.ac.nz. Patterns around the initiation codons for most of the organisms fall into two groups, containing the previously known Shine-Dalgarno and Kozak efficiency signals. However, we have identified several organisms that appear to utilise novel systems. Our analysis indicates that some organisms with extremely high GC% genomes do not have a strong dependence on base pairing ribosome binding sites, as the complementary sequence is absent from many genes.
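The information-theoretic pattern detection mentioned above can be illustrated with a minimal Python sketch that computes per-position information content (in bits) for a set of sequence windows aligned on the initiation codon. It assumes the common definition, log2 of the alphabet size minus the Shannon entropy of each column, with no small-sample correction; the abstract does not specify the exact measure the authors' software uses, and the function name and toy windows are hypothetical.

```python
import math
from collections import Counter


def information_content(aligned_sites, alphabet="ACGU"):
    """Per-column information in bits: log2(len(alphabet)) - Shannon entropy.

    Assumptions: all windows have equal length, are aligned on the same
    reference point (for example the start codon), and contain only
    symbols from the given alphabet.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all windows must have the same length")
    max_bits = math.log2(len(alphabet))
    bits = []
    for col in range(length):
        counts = Counter(site[col] for site in aligned_sites)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Toy windows for illustration only: a conserved AUG followed by a variable base.
toy = ["AUGA", "AUGC", "AUGG", "AUGU"]
print(information_content(toy))
```

Columns with high information content mark conserved positions, which is how a signal such as a ribosome binding site would stand out against variable flanking sequence.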
Karaulov, Alexander; Aleshkin, Vladimir; Slobodenyuk, Vladimir; Grechishnikova, Olga; Afanasyev, Stanislav; Lapin, Boris; Dzhikidze, Eteri; Nesvizhsky, Yuriy; Evsegneeva, Irina; Voropayeva, Elena; Afanasyev, Maxim; Aleshkin, Andrei; Metelskaya, Valeria; Yegorova, Ekaterina; Bayrakova, Alexandra
2010-01-01
Based on a comparative analysis of the relatedness and evolutionary divergence of the 16S-23S nucleotide sequences of the middle ribosomal cluster and the 23S rRNA I domain, and on identification of the phylogenetic position of Chlamydophila pneumoniae and Chlamydia trachomatis strains isolated from monkeys, relatedness of these isolates to similar strains isolated from humans and to strains with nucleotide sequences in the GenBank electronic database has been detected for the first time. The position of these isolates in the Chlamydiaceae family phylogenetic tree has been identified. The evolutionary position of the investigated original Chlamydia and Chlamydophila strains close to analogous strains from the GenBank electronic database has been demonstrated. Differences in the 16S-23S nucleotide sequence of the middle ribosomal cluster and the 23S rRNA I domain of plasmid and nonplasmid Chlamydia trachomatis strains isolated from humans and monkeys relative to different genotype groups (group B-B, Ba, D, Da, E, L1, L2, L2a; intermediate group-F, G, Ga) have been revealed for the first time. Abnormal expression of the incA chromosomal gene, resulting in disruption of the Chlamydia developmental cycle and a decrease in Chlamydia virulence, may be related to probable changes in the nucleotide sequence of this gene.
DNA Barcoding Identifies Illegal Parrot Trade.
Gonçalves, Priscila F M; Oliveira-Marques, Adriana R; Matsumoto, Tania E; Miyaki, Cristina Y
2015-01-01
Illegal trade threatens the survival of many wild species, and molecular forensics can shed light on various questions raised during the investigation of cases of illegal trade. Among these questions is the identity of the species involved. Here we report a case of a man who was caught in a Brazilian airport trying to travel with 58 avian eggs. He claimed they were quail eggs, but authorities suspected they were from parrots. The embryos never hatched and it was not possible to identify them based on morphology. As 29% of parrot species are endangered, the identity of the species involved was important to establish a stronger criminal case. Thus, we identified the embryos' species based on the analyses of mitochondrial DNA sequences (cytochrome c oxidase subunit I gene [COI] and 16S ribosomal DNA). Embryonic COI sequences were compared with those deposited in BOLD (The Barcode of Life Data System) while their 16S sequences were compared with GenBank sequences. Clustering analysis based on neighbor-joining was also performed using parrot COI and 16S sequences deposited in BOLD and GenBank. The results, based on both genes, indicated that 57 embryos were parrots (Alipiopsitta xanthops, Ara ararauna, and the [Amazona aestiva/A. ochrocephala] complex), and 1 was an owl. This kind of data can help criminal investigations and the design of species-specific anti-poaching strategies, and demonstrates that DNA sequence analysis for the identification of bird species is a powerful conservation tool.
Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis.
Buldyrev, S V; Goldberger, A L; Havlin, S; Mantegna, R N; Matsa, M E; Peng, C K; Simons, M; Stanley, H E
1995-05-01
An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of the GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^(-10). We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
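The abstract above names detrended fluctuation analysis (DFA) but does not describe it. The following is a minimal sketch of first-order DFA on a DNA sequence, under stated assumptions: the sequence is mapped to a purine/pyrimidine walk (+1 for A or G, -1 for C or T), a straight line is fitted in each non-overlapping box, and the exponent is the slope of log F(n) against log n. The mapping rule, the box sizes, and all function names are choices made for this example and are not taken from the abstract.

```python
import math
import random


def dna_walk(seq):
    """Mean-centred cumulative walk: +1 for purines, -1 for pyrimidines."""
    steps = [1 if base in "AG" else -1 for base in seq.upper()]
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:
        total += step - mean
        profile.append(total)
    return profile


def dfa_fluctuation(profile, box):
    """RMS residual around a least-squares line fitted in each box."""
    n_boxes = len(profile) // box
    xs = list(range(box))
    x_mean = sum(xs) / box
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum = 0.0
    for k in range(n_boxes):
        ys = profile[k * box:(k + 1) * box]
        y_mean = sum(ys) / box
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        sq_sum += sum(
            (y - y_mean - slope * (x - x_mean)) ** 2 for x, y in zip(xs, ys)
        )
    return math.sqrt(sq_sum / (n_boxes * box))


def dfa_exponent(seq, boxes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) versus log n over the given box sizes."""
    profile = dna_walk(seq)
    lx = [math.log(n) for n in boxes]
    ly = [math.log(dfa_fluctuation(profile, n)) for n in boxes]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return num / sum((a - mx) ** 2 for a in lx)


# Synthetic uncorrelated sequence (invented data, fixed seed).
random.seed(0)
random_seq = "".join(random.choice("ACGT") for _ in range(4096))
alpha = dfa_exponent(random_seq)
```

For an uncorrelated sequence such as this synthetic one, the exponent is expected to lie near 0.5. Values clearly above 0.5 would indicate long-range correlations of the kind the abstract reports for noncoding DNA.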
Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis
NASA Technical Reports Server (NTRS)
Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Matsa, M. E.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of the GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^(-10). We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
Barcoding and Border Biosecurity: Identifying Cyprinid Fishes in the Aquarium Trade
Collins, Rupert A.; Armstrong, Karen F.; Meier, Rudolf; Yi, Youguang; Brown, Samuel D. J.; Cruickshank, Robert H.; Keeling, Suzanne; Johnston, Colin
2012-01-01
Background: Poorly regulated international trade in ornamental fishes poses risks to both biodiversity and economic activity via invasive alien species and exotic pathogens. Border security officials need robust tools to confirm identifications, often requiring hard-to-obtain taxonomic literature and expertise. DNA barcoding offers a potentially attractive tool for quarantine inspection, but has yet to be scrutinised for aquarium fishes. Here, we present a barcoding approach for ornamental cyprinid fishes by: (1) expanding current barcode reference libraries; (2) assessing barcode congruence with morphological identifications under numerous scenarios (e.g. inclusion of GenBank data, presence of singleton species, choice of analytical method); and (3) providing supplementary information to identify difficult species. Methodology/Principal Findings: We sampled 172 ornamental cyprinid fish species from the international trade, and provide data for 91 species currently unrepresented in reference libraries (GenBank/BOLD). DNA barcodes were found to be highly congruent with our morphological assignments, achieving success rates of 90–99%, depending on the method used (neighbour-joining monophyly, bootstrap, nearest neighbour, GMYC, percent threshold). Inclusion of data from GenBank (additional 157 spp.) resulted in a more comprehensive library, but at a cost to success rate due to the increased number of singleton species. In addition to DNA barcodes, our study also provides supporting data in the form of specimen images, morphological characters, taxonomic bibliography, preserved vouchers, and nuclear rhodopsin sequences. Using this nuclear rhodopsin data we also uncovered evidence of interspecific hybridisation, and highlighted unrecognised diversity within popular aquarium species, including the endangered Indian barb Puntius denisonii.
Conclusions/Significance: We demonstrate that DNA barcoding provides a highly effective biosecurity tool for rapidly identifying ornamental fishes. In cases where DNA barcodes are unable to offer an identification, we improve on previous studies by consolidating supplementary information from multiple data sources, and empower biosecurity agencies to confidently identify high-risk fishes in the aquarium trade. PMID:22276096
Pope, Welkin H.; Weigele, Peter R.; Chang, Juan; Pedulla, Marisa L.; Ford, Michael E.; Houtz, Jennifer M.; Jiang, Wen; Chiu, Wah; Hatfull, Graham F.; Hendrix, Roger W.; King, Jonathan
2010-01-01
Marine Synechococcus spp and marine Prochlorococcus spp are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory and purified and concentrated retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214bp with a 237bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life-cycle. Assignment of eleven ORFs to structural proteins found within the phage virion was confirmed by mass-spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryoelectron micrographs of purified Syn5 virions revealed that the capsid has a single “horn”, a novel fibrous structure protruding from the opposing end of the capsid from the tail of the virion. The tail appendage displayed an apparent three-fold rather than six-fold symmetry. An 18Å-resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria. PMID:17383677
Hesamizadeh, Khashayar; Alavian, Seyed Moayed; Najafi Tireh Shabankareh, Azar; Sharafi, Heidar
2016-12-01
Hepatitis C virus (HCV) is characterized by a high degree of genetic heterogeneity and is classified into 7 genotypes and different subtypes. It is heterogeneously distributed across various risk groups and geographical regions. A well-established phylogenetic relationship can simplify the tracing of HCV hierarchical strata into geographical regions. The current study aimed to determine the phylogeny of subtypes 1a and 1b of HCV isolates based on NS5B nucleotide sequences in Iran and other members of the Eastern Mediterranean Regional Office of the World Health Organization, as well as other Middle Eastern countries, with a systematic review of available published and unpublished studies. The phylogenetic analyses were performed based on the nucleotide sequences of the NS5B gene of HCV genotype 1 (HCV-1) registered in the GenBank database. The literature review was performed in two steps: 1) searching PubMed, Scopus, and Web of Science for studies evaluating the NS5B sequences of HCV-1, and 2) searching for sequences from unpublished studies registered in the GenBank database. In this study, 442 sequences from HCV-1a and 232 from HCV-1b underwent phylogenetic analysis. Phylogenetic analysis of all sequences revealed different clusters in the phylogenetic trees. The results showed that a proportion of the HCV-1a and -1b isolates from Iranian patients probably originated from domestic sources. Moreover, the HCV-1b isolates from Iranian patients may have similarities with the European ones. In this study, phylogenetic reconstruction of HCV-1 sequences proved useful for molecular tracing and for inferring ancestral relationships of the HCV genotypes in Iran, and indicated a likely domestic origin for HCV-1a and various origins for HCV-1b.
GALAVANI, Hossein; GHOLIZADEH, Saber; HAZRATI TAPPEH, Khosrow
2016-01-01
Background: Fascioliasis, caused by Fasciola hepatica and F. gigantica, has medical and economic importance worldwide. Molecular approaches, compared with the traditional methods used for identification and characterization of Fasciola spp., are precise and reliable. The aims of the current study were the molecular characterization of Fasciola spp. in West Azerbaijan Province, Iran, and their comparative analysis using GenBank sequences. Methods: A total of 580 isolates were collected in 2014 from different hosts in five cities of West Azerbaijan Province, from 90 slaughtered cattle (n=50) and sheep (n=40). After morphological identification and DNA extraction, specifically designed primers were used to amplify the ITS1, 5.8S and ITS2 regions; 50 samples were randomly selected for sequencing. Results: Using morphometric characters, 99.14% and 0.86% of isolates were identified as F. hepatica and F. gigantica, respectively. PCR amplification of a 1081 bp fragment and the sequencing results showed 100% similarity with F. hepatica in the ITS1 (428 bp), 5.8S (158 bp), and ITS2 (366 bp) regions. Sequence comparison between the current study sequences and GenBank data showed 98% identity with 11 nucleotide mismatches. However, in the phylogenetic tree, F. hepatica sequences from West Azerbaijan Province, Iran, were closely related to Iranian, Asian, and African isolates. Conclusions: Only F. hepatica is distributed among sheep and cattle in West Azerbaijan Province, Iran. However, the 5 and 6 bp variation in the ITS1 and ITS2 regions, respectively, is not enough to separate Fasciola spp. Therefore, more studies are essential for designing new molecular markers for correct species identification. PMID:27095969
Sequence analysis of porcine kobuvirus VP1 region detected in pigs in Japan and Thailand.
Okitsu, Shoko; Khamrin, Pattara; Thongprachum, Aksara; Hidaka, Satoshi; Kongkaew, Sompreeya; Kongkaew, Apisek; Maneekarn, Niwat; Mizuguchi, Masashi; Hayakawa, Satoshi; Ushijima, Hiroshi
2012-04-01
Porcine kobuvirus is a new candidate species of the genus Kobuvirus in the family Picornaviridae, and information on it is still limited. Porcine kobuvirus has so far been identified by sequence analysis of the 3D region of the viruses. Therefore, the purpose of this study was to characterize the molecular properties of VP1 nucleotide sequences of the porcine kobuviruses isolated from porcine stool samples in Japan during 2009 and in Thailand between 2006 and 2008. In addition, a previously identified unique porcine kobuvirus, the Japanese strain H023/2009/JP, which is a bovine kobuvirus-like strain based on sequence analysis of the 3D region, was also included in this study. All of the strains were amplified with the VP1-specific primer pair; the amplicons were subjected to direct sequencing and compared with the VP1 nucleotide sequences of reference strains. The VP1 sequences of strains from the GenBank database showed high nucleotide sequence identity, at 84.3-100%. On the other hand, the nucleotide identities among the 15 porcine kobuvirus strains analyzed in this study ranged from 78.8 to 99.8%. The results revealed that the diversity of the strains in this study was higher than that of the strains in previous studies. Furthermore, it was found that the VP1 region of the bovine kobuvirus-like strain, H023/2009/JP, clustered with nine porcine kobuvirus strains that were isolated in Thailand and Japan. Since this strain was previously found to be closely related to bovine kobuviruses in the 3D gene region, it may be a natural recombinant.
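Identity ranges such as "78.8 to 99.8%" are typically the minimum and maximum over all pairwise comparisons in an alignment. The sketch below shows one common way to compute such a range, assuming the sequences are already aligned to equal length and that columns with a gap in either sequence are skipped. The function names and toy sequences are illustrative only; the abstract does not state how the authors handled gaps or which software they used.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identical positions between two aligned sequences.

    Columns with a gap in either sequence are ignored (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_range(aligned):
    """Minimum and maximum pairwise identity over all sequence pairs."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return min(values), max(values)


# Toy aligned sequences (invented data, 10 columns each).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACG-AT"]
lowest, highest = identity_range(toy)
```

Because the result depends on the gap convention and on the alignment itself, identity ranges from different studies are only comparable when those choices match.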
Distribution of hepatitis B virus subgenotype F2a in São Paulo, Brazil.
Alvarado-Mora, Mónica V; Botelho-Lima, Livia S; Santana, Rubia A; Sitnik, Roberta; Ferreira, Paulo Abrão; do Amaral Mello, Francisco; Mangueira, Cristovão P; Carrilho, Flair J; Rebello Pinho, João R
2013-10-21
HBV genotype F is primarily found in indigenous populations from South America and is classified into four subgenotypes (F1 to F4). Subgenotype F2a is the most common in Brazil among genotype F cases. The aim of this study was to characterize HBV genotype F2a circulating in 16 patients from São Paulo, Brazil. Samples were collected between 2006 and 2012 and sent to Hospital Israelita Albert Einstein. A 1306 bp fragment partially comprising the HBsAg and DNA polymerase coding regions was amplified and sequenced. Viral sequences were genotyped by phylogenetic analysis using reference sequences from GenBank (n=198), including 80 classified as subgenotype F2a. Bayesian Markov chain Monte Carlo simulation implemented in BEAST v.1.5.4 was applied to obtain the best possible estimates using the GTR+G+I model of nucleotide substitution. Three groups of subgenotype F2a sequences were identified: 1) 10 sequences from São Paulo state; 2) 3 sequences from Rio de Janeiro and one from São Paulo states; 3) 8 sequences from the West Amazon Basin. These results show for the first time the distribution of the F2a subgenotype in Brazil. Understanding the spread and dynamics of subgenotype F2a in Brazil requires the study of a larger number of samples from different regions, as it is found in almost all Brazilian populations studied so far. We cannot infer with certainty the origin of these different groups due to the lack of available sequences. Nevertheless, our data suggest that the common origin of these groups probably occurred a long time ago.
Calongea, a new genus of truffles in the Pezizaceae (Pezizales)
Rosanne A. Healy; Gregory Bonito; James M. Trappe
2009-01-01
Phylogenetic analysis of the ITS and LSU rDNA of Pachyphloeus species from Europe and North America revealed a new truffle genus. These molecular analyses plus sequences downloaded from a BLAST search in GenBank indicated that Pachyphloeus prieguensis is within the Pezizaceae but well outside of the genus Pachyphloeus...
USDA-ARS's Scientific Manuscript database
High density genotyping techniques are needed for investigating antimicrobial resistance especially in the case of multi-drug resistant (MDR) isolates. To achieve this all antimicrobial resistance genes in the NCBI GenBank database were identified by key word searches of sequence annotations and the...
Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools
ERIC Educational Resources Information Center
Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.
2011-01-01
Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…
Chen, Qilei; Hu, Youjia; Zhao, Wenjie; Zhu, Chunbao; Zhu, Baoquan
2010-01-01
A gene encoding a novel (S)-specific NADH-dependent alcohol dehydrogenase (LK-ADH) was isolated from the genomic DNA of Lactobacillus kefir DSM 20587 by thermal asymmetric interlaced-polymerase chain reaction. The nucleotide sequence of (S)-LK-ADH gene (adhS) was determined, which consists of an open reading frame of 1,044 bp, coding for 347 amino acids with a molecular mass of 37.065 kDa. After a BLAST similarity search in GenBank database, the amino acid sequence of (S)-LK-ADH showed some homologies to several zinc containing medium-chain alcohol dehydrogenases. This novel gene was deposited into GenBank with the accession number of EU877965. adhS gene was subcloned into plasmid pET-28a(+), and recombinant (S)-LK-ADH was successfully expressed in E. coli BL21(DE3) by isopropyl-beta-D-1-thiogalactopyranoside induction. Purified enzyme showed a high enantioselectivity in the reduction of acetophenone to (S)-phenylethanol with an ee value of 99.4%. The substrate specificity and cofactor preference of recombinant (S)-LK-ADH were also tested.
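Two figures in the abstract above can be checked with simple arithmetic. An open reading frame encoding 347 amino acids needs 347 × 3 = 1,041 coding bases plus a 3-base stop codon, which gives the reported 1,044 bp. An enantiomeric excess (ee) of 99.4% corresponds, by the usual definition ee = major% − minor%, to 99.7% of the major enantiomer. The helper names below are hypothetical and exist only to make these checks explicit.

```python
def orf_length_bp(n_amino_acids, include_stop=True):
    """Length of an ORF in base pairs: three bases per codon,
    plus one stop codon if requested."""
    return 3 * n_amino_acids + (3 if include_stop else 0)


def major_enantiomer_percent(ee_percent):
    """Share of the major enantiomer given ee = major% - minor%
    and major% + minor% = 100."""
    return (100.0 + ee_percent) / 2.0


orf_bp = orf_length_bp(347)                # 1044, matching the abstract
major = major_enantiomer_percent(99.4)     # about 99.7
mean_residue_mass = 37065.0 / 347          # about 106.8 Da per residue
```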
Yao, Li-Nong; Zhang, Ling-Ling; Ruan, Wei; Chen, Hua-Liang; Lu, Qiao-Yi; Yang, Ting-Ting
2013-06-01
To identify the species of malaria parasites in 5 imported cases previously diagnosed as vivax malaria. Epidemiological information and blood samples were collected from five patients who had returned from Africa and been diagnosed with vivax malaria. Detection was conducted by microscopy, the right VIEW rapid malaria test (RDT), and nested PCR with Plasmodium genus-specific and species-specific primers. The amplified products were sequenced and BLAST analysis was performed. Three of the 5 cases had a history of malaria attacks. Microscopically, 4 cases were confirmed as Plasmodium ovale infection, and 1 (case 1) was co-infected with P. vivax and P. ovale. All 5 cases showed negative RDT results. Nested PCR detection revealed that the 5 cases had a P. ovale-specific fragment (800 bp), while case 1 also had a P. vivax-specific fragment (120 bp). BLAST analysis showed that the amplified sequence of the 5 cases had high sequence homology (99%) with the P. ovale gene for small subunit ribosomal RNA from GenBank, and that of case 1 also shared 99% homology with the P. vivax isolate SV5 18S ribosomal RNA gene (GenBank accession number: JQ627157.1). Among the five cases, four were infected with Plasmodium ovale, and one was co-infected with both P. vivax and P. ovale.
Nanobacteria may be linked to calcification in placenta.
Lu, He; Guo, Ya-nan; Liu, Sheng-nan; Zhang, De-chun
2012-05-01
Placental calcification is a common pathologic condition in obstetrics. To investigate bacterial infection mechanisms of calcification, an experiment was performed to isolate, culture, and identify nanobacteria in placental calcification. Sixteen cases of placental calcification in pregnant women were collected for isolation and cultivation of nanobacteria and identification of the 16S rDNA sequence. Under the transmission electron microscope, novel oval-shaped nanobacteria-like particles (NLP), 50-500 nm in diameter, were found in the extracellular matrix of calcified placental tissues, and aggregation was observed among hydroxyapatite crystals. After about 4 weeks of culturing and isolating NLP from these calcified tissues, all calcified placental tissue samples and one sample of tissue adjacent to calcified placental tissue showed white granular depositions, which were firmly attached to the bottom of the culture tubes and visible to the naked eye. They could not be seen in the control group. After PCR amplification, a 1407-bp fragment was obtained, sequenced, and submitted to GenBank under accession number JN029830. The 16S rDNA sequence homology between the isolated strain and the nanobacteria strain in GenBank (X98418) was 92%. The isolation, culture, and identification of nanobacteria in placental calcification, reported here for the first time, indicate that nanobacteria infection is related to placental calcification.
Gao, Qian; Yang, Zhu L
2016-01-01
The diversity of root-associated fungi associated with four ectomycorrhizal herbaceous species, Kobresia capillifolia, Carex parva, Polygonum macrophyllum and Potentilla fallens, collected at three sites of alpine meadows in southwestern China, was estimated based on internal transcribed spacer (ITS) rDNA sequence analysis of root tips. Three hundred seventy-seven fungal sequences, sorted into 154 operational taxonomic units (OTUs; sequence similarity of ≥ 97% across the ITS), were obtained from the four plant species across all three sites. For most of the OTUs, similar taxa (≥ 97% similarity) were not found in GenBank and/or UNITE. Ectomycorrhiza made up 64% of the fungal OTUs, endophytes constituted 4% and the other 33% were unidentified root-associated fungi. Fungal OTUs were represented by 57% basidiomycetes and 43% ascomycetes. Inocybe, Tomentella/Thelephora, Sebacina, Hebeloma, Pezizomycotina, Cenococcum geophilum complex, Cortinarius, Lactarius and Helotiales were OTU-rich fungal lineages. Across the sites and host species the root-associated fungal communities generally exhibited low host and site specificity but high host and sampling site preference. Collectively our study revealed noteworthy diversity and endemism of root-associated fungi of alpine plants in this global biodiversity hotspot.
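The abstract above groups sequences into OTUs at ≥ 97% similarity but does not say which clustering procedure was used. As an illustration only, the sketch below applies a simple greedy scheme: each sequence joins the first existing OTU whose representative it matches at or above the threshold, and otherwise founds a new OTU. It assumes pre-aligned, equal-length sequences and uses invented test data. The names `greedy_otus` and `substitute` are hypothetical, and real studies would normally rely on dedicated alignment and clustering tools.

```python
def identity(a, b):
    """Fraction of identical positions in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold=0.97):
    """Greedy clustering; returns lists of sequence indices per OTU.

    The first member of each OTU acts as its representative, so the
    result depends on input order (a known property of greedy schemes).
    """
    otus = []
    for index, seq in enumerate(sequences):
        for members in otus:
            if identity(seq, sequences[members[0]]) >= threshold:
                members.append(index)
                break
        else:
            otus.append([index])
    return otus


def substitute(seq, positions):
    """Return seq with the base at each given position changed."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Invented 100-base reads: 0 differences, 2 differences, 10 differences.
ref = "ACGT" * 25
reads = [ref, substitute(ref, [3, 50]), substitute(ref, range(0, 100, 10))]
otus = greedy_otus(reads)
```

Here the read with two substitutions (98% identity) joins the reference OTU, while the read with ten substitutions (90% identity) founds a second OTU.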
Molecular detection and characterization of Theileria species in the Philippines.
Belotindos, Lawrence P; Lazaro, Jonathan V; Villanueva, Marvin A; Mingala, Claro N
2014-09-01
Theileriosis is a tick-borne disease of domestic and wild animals that causes devastating economic losses in livestock in tropical and subtropical regions. Theileriosis is not yet documented in the Philippines, in contrast to babesiosis and anaplasmosis, which are considered major tick-borne diseases that infect livestock in the country and contribute major losses to the livestock industry. The study aimed to detect Theileria sp. at the genus level in blood samples of cattle using a polymerase chain reaction (PCR) assay. Specifically, it determined the phylogenetic relationship of Theileria species affecting cattle in the Philippines to other Theileria sp. registered in GenBank. A total of 292 cattle blood samples collected from various provinces were used. Theileria sp. was detected in 43 of the 292 cattle blood samples using a PCR assay targeting the major piroplasm surface protein (MPSP) gene. DNA sequences showed high similarity (90-99%) between the Theileria sp. isolates reported in GenBank and the Philippine isolates of Theileria. Phylogenetic tree construction using nucleotide sequences classified the Philippine isolates of Theileria as benign. However, nucleotide polymorphism was observed in the new isolate based on nucleotide sequence alignment, suggesting that the new isolate may be a new species of Theileria.
Wang, Shuo; Gao, Li-Zhi
2016-02-18
We announce here the first complete chloroplast genome sequence of the tropical japonica rice, along with its genome structure and functional annotation. The plant was collected from Indonesia and deposited as a germplasm accession of the International Rice Genebank Collection (IRGC 66630) at the International Rice Research Institute (IRRI). This genome provides valuable data for the future utilization of the germplasm of rice.
Bacillus pumilus SAFR-032 isolate
NASA Technical Reports Server (NTRS)
Venkateswaran, Kasthuri J. (Inventor)
2007-01-01
The present invention relates to discovery and isolation of a biologically pure culture of a Bacillus pumilus SAFR-032 isolate with UV sterilization resistant properties. This novel strain has been characterized on the basis of phenotypic traits, 16S rDNA sequence analysis and DNA-DNA hybridization. According to the results of these analyses, this strain belongs to the genus Bacillus. The GenBank accession number for the 16S rDNA sequence of the Bacillus pumilus SAFR-032 isolate is AY167879.
DMTB: the magnetotactic bacteria database
NASA Astrophysics Data System (ADS)
Pan, Y.; Lin, W.
2012-12-01
Magnetotactic bacteria (MTB) are of interest in biogeomagnetism, rock magnetism, microbiology, biomineralization, and advanced magnetic materials because of their ability to synthesize highly ordered intracellular nano-sized magnetic minerals, magnetite or greigite. Great strides in MTB studies have been made in the past few decades. More than 600 articles concerning MTB have been published. These rapidly growing data are stimulating cross-disciplinary studies in fields such as biogeomagnetism. We have compiled the first online database for MTB, i.e., the Database of Magnetotactic Bacteria (DMTB, http://database.biomnsl.com). It contains useful information on 16S rRNA gene sequences, oligonucleotides, and magnetic properties of MTB, and the corresponding ecological metadata of sampling sites. The 16S rRNA gene sequences are collected from the GenBank database, while all other data are collected from the scientific literature. Rock magnetic properties for both uncultivated and cultivated MTB species are also included. In the DMTB database, data are accessible through four main interfaces: Site Sort, Phylo Sort, Oligonucleotides, and Magnetic Properties. References in each entry serve as links to specific pages within public databases. The comprehensive online DMTB will provide a very useful data resource for researchers from various disciplines, e.g., microbiology, rock magnetism and paleomagnetism, biogeomagnetism, magnetic material sciences and others.
Utilizing the Web in the Classroom: Linking Student Scientists with Professional Data.
ERIC Educational Resources Information Center
Seitz, Kristine; Leake, Devin
1999-01-01
Describes how information gathered from a computer database can be used as a springboard to scientific discovery. Specifies directions for studying the homeobox gene PAX-6 using GenBank, a database maintained by the National Center for Biotechnology Information (NCBI). Contains 16 references. (WRM)
Deshpande, J M; Nadkarni, S S; Siddiqui, Z A
2003-12-01
Significant progress has been made towards eradication of poliomyelitis in India. Surveillance for acute flaccid paralysis (AFP) has reached high standards. Among the 3 types of polioviruses, type 2 had been eliminated in India and eradicated globally as of October 1999. However, we isolated wild poliovirus type 2 from a small number of polio cases in northern India in 2000 and again during December 2002 to February 2003. The origin of the wild type 2 poliovirus was investigated using molecular tools. Polioviruses isolated from stool samples collected from patients with AFP were differentiated as wild virus or Sabin vaccine-like by ELISA and probe hybridization assays. Complete VP1 gene nucleotide sequences of the wild type 2 poliovirus isolates were determined by reverse transcriptase polymerase chain reaction (RT-PCR), followed by cycle sequencing. VP1 nucleotide sequences were compared with those of wild type 2 polioviruses that were indigenous in India in the past as well as prototype/laboratory strains and the GenBank database. Wild poliovirus type 2 was detected in stool samples from 6 patients with AFP in western Uttar Pradesh and 1 in Gujarat. In addition, the virus was isolated from one healthy contact child and from an environmental sewage sample in Moradabad, where three of these patients were reported. These isolates were identified as genetically closely related to the laboratory reference strain MEF-1. Molecular characterization of the isolates confirmed that there was no evidence of extensive person-to-person transmission of the virus in the community. Laboratory reference strain (MEF-1) of poliovirus type 2 caused paralytic poliomyelitis in 10 patients in September 2000 and November 2002 to February 2003. The virus originated from a laboratory that has not yet been identified.
This episode highlights the urgent need for stringent containment of wild poliovirus containing materials in the laboratories across the country in order to prevent recurrence of such incidents.
Wang, Xu-Hua; Wang, Yong; Zhang, De-Bao; Liu, A-Ke; Yao, Qin; Chen, Ke-Ping
2014-01-01
Basic helix-loop-helix (bHLH) proteins comprise a large superfamily of transcription factors, which are involved in the regulation of various developmental processes. bHLH family members are widely distributed in various eukaryotes including yeast, fruit fly, zebrafish, mouse, and human. In this study, we identified 55 bHLH motifs encoded in the genome sequence of the human body louse, Pediculus humanus corporis (Phthiraptera: Pediculidae). Phylogenetic analyses of the identified P. humanus corporis bHLH (PhcbHLH) motifs revealed that there are 23, 11, 9, 1, 10, and 1 member(s) in groups A, B, C, D, E, and F, respectively. Examination of the GenBank annotations of the 55 PhcbHLH members indicated that 29 PhcbHLH proteins were annotated consistently with our analytical results, 8 were annotated differently from our analytical results, 12 were annotated merely as hypothetical proteins, and the remaining 6 were not deposited in GenBank. A comparison of insect bHLH gene composition revealed that the human body louse possibly has more hairy and E(spl) genes than other insect species. Because hairy and E(spl) genes have been found to negatively regulate the differentiation of insect preneural cells, it is suggested that the existence of additional hairy and E(spl) genes in the human body louse is probably the consequence of its long-term adaptation to a relatively dark and stable environment. These data provide good references for further studies on the regulatory functions of bHLH proteins in the growth and development of the human body louse. PMID:25434030
Wang, Yan; Ma, Yan; Hao, Shuang; Xu, Xiaoting; Han, Yue; Yao, Wenqing; Zhao, Zhuo
2016-03-01
The aim was to analyze the genetic characteristics of epidemic mumps virus strains in Liaoning Province and to provide a basis for mumps control. A total of 32 mumps virus strains were isolated during 2008-2014. Fragments of the SH and HN genes were amplified by RT-PCR, and the PCR products were sequenced and analyzed. Based on 316 nucleotides of the SH gene, phylogenetic analyses were performed on the 32 mumps virus strains together with WHO mumps reference strains downloaded from GenBank. The analyses showed that 31 of the mumps virus strains belonged to genotype F; the exception, MuVi/Liaoning.CHN/16.11, belonged to genotype G. Compared with the genotype A reference strains (Jeryl-Lynn and S-79), genotype F MuV had mutations at 12 amino acid sites and 27 amino acid sites on the HN gene. Genotype F MuV gained one N-glycosylation site at amino acids 464-466. The antigenic sites on HN were mutated at positions 121, 123, 279, 287, 336, 356 and 442. These mutations may influence MuV antigenicity.
PRGdb: a bioinformatics platform for plant resistance gene analysis
Sanseverino, Walter; Roma, Guglielmo; De Simone, Marco; Faino, Luigi; Melito, Sara; Stupka, Elia; Frusciante, Luigi; Ercolano, Maria Raffaella
2010-01-01
PRGdb is a web accessible open-source (http://www.prgdb.org) database that represents the first bioinformatic resource providing a comprehensive overview of resistance genes (R-genes) in plants. PRGdb holds more than 16 000 known and putative R-genes belonging to 192 plant species challenged by 115 different pathogens and linked with useful biological information. The complete database includes a set of 73 manually curated reference R-genes, 6308 putative R-genes collected from NCBI and 10463 computationally predicted putative R-genes. Thanks to a user-friendly interface, data can be examined using different query tools. A home-made prediction pipeline called Disease Resistance Analysis and Gene Orthology (DRAGO), based on reference R-gene sequence data, was developed to search for plant resistance genes in public datasets such as Unigene and Genbank. New putative R-gene classes containing unknown domain combinations were discovered and characterized. The development of the PRG platform represents an important starting point to conduct various experimental tasks. The inferred cross-link between genomic and phenotypic information allows access to a large body of information to find answers to several biological questions. The database structure also permits easy integration with other data types and opens up prospects for future implementations. PMID:19906694
Rosenthal, Lisa M; Larsson, Karl-Henrik; Branco, Sara; Chung, Judy A; Glassman, Sydney I; Liao, Hui-Ling; Peay, Kabir G; Smith, Dylan P; Talbot, Jennifer M; Taylor, John W; Vellinga, Else C; Vilgalys, Rytas; Bruns, Thomas D
2017-01-01
The corticioid fungi are commonly encountered, highly diverse, ecologically important, and understudied. We collected specimens in 60 pine and spruce forests across North America to survey corticioid fungal frequency and distribution and to compile an internal transcribed spacer (ITS) database for the group. Sanger sequences from the ITS region of vouchered specimens were compared with sequences on GenBank and UNITE, and with high-throughput sequence data from soil and roots taken at the same sites. Out of 425 high-quality Sanger sequences from vouchered specimens, we recovered 223 distinct operational taxonomic units (OTUs), the majority of which could not be assigned to species by matching to the BLAST database. Corticioid fungi were found to be hyperdiverse, as supported by the observations that nearly two-thirds of our OTUs were represented by single collections and species estimator curves showed steep slopes with no plateaus. We estimate that 14.8-24.7% of our voucher-based OTUs are likely to be ectomycorrhizal (EM). Corticioid fungi recovered from the soil formed a different community assemblage, with EM taxa accounting for 40.5-58.6% of OTUs. We compared basidioma sequences with EM root tips from our data, GenBank, or UNITE, and with this approach, we reiterate existing speculations that Trechispora stellulata is EM. We found that corticioid fungi have a significant distance-decay pattern, adding to the literature supporting fungi as having geographically structured communities. This study provides a first view of the diversity of this important group across North American pine forests, but much of the biology and taxonomy of these diverse, important, and widespread fungi remains unknown.
Lei, Haiyan; Li, Tianwei; Hung, Guo-Chiuan; Li, Bingjie; Tsai, Shien; Lo, Shyh-Ching
2013-11-19
We conducted genomic sequencing to identify Epstein Barr Virus (EBV) genomes in 2 human peripheral blood B lymphocytes that underwent spontaneous immortalization promoted by mycoplasma infections in culture, using the high-throughput sequencing (HTS) Illumina MiSeq platform. The purpose of this study was to examine if rapid detection and characterization of a viral agent could be effectively achieved by HTS using a platform that has become readily available in general biology laboratories. Raw read sequences, averaging 175 bps in length, were mapped with DNA databases of human, bacteria, fungi and virus genomes using the CLC Genomics Workbench bioinformatics tool. Overall 37,757 out of 49,520,834 total reads in one lymphocyte line (# K4413-Mi) and 28,178 out of 45,335,960 reads in the other lymphocyte line (# K4123-Mi) were identified as EBV sequences. The two EBV genomes with estimated 35.22-fold and 31.06-fold sequence coverage respectively, designated K4413-Mi EBV and K4123-Mi EBV (GenBank accession number KC440852 and KC440851 respectively), are characteristic of type-1 EBV. Sequence comparison and phylogenetic analysis among K4413-Mi EBV, K4123-Mi EBV and the EBV genomes previously reported to GenBank as well as the NA12878 EBV genome assembled from database of the 1000 Genome Project showed that these 2 EBVs are most closely related to B95-8, an EBV previously isolated from a patient with infectious mononucleosis and WT-EBV. They are less similar to EBVs associated with nasopharyngeal carcinoma (NPC) from Hong Kong and China as well as the Akata strain of a case of Burkitt's lymphoma from Japan. They are most different from type 2 EBV found in Western African Burkitt's lymphoma.
Merino, Emilio F; Fernandez-Becerra, Carmen; Madeira, Alda M B N; Machado, Ariane L; Durham, Alan; Gruber, Arthur; Hall, Neil; del Portillo, Hernando A
2003-07-21
Plasmodium vivax is the most widely distributed human malaria, responsible for 70-80 million clinical cases each year and large socio-economical burdens for countries such as Brazil where it is the most prevalent species. Unfortunately, due to the impossibility of growing this parasite in continuous in vitro culture, research on P. vivax remains largely neglected. A pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of P. vivax was performed. To do so, 1,184 clones from a cDNA library constructed with parasites obtained from 10 different human patients in the Brazilian Amazon were sequenced. Sequences were automatically processed to remove contaminants and low quality reads. A total of 806 sequences with an average length of 586 bp met such criteria and their clustering revealed 666 distinct events. The consensus sequence of each cluster and the unique sequences of the singlets were used in similarity searches against different databases that included P. vivax, Plasmodium falciparum, Plasmodium yoelii, Plasmodium knowlesi, Apicomplexa and the GenBank non-redundant database. An E-value of <10^-30 was used to define a significant database match. ESTs were manually assigned gene ontology (GO) terms. A total of 769 ESTs could be assigned a putative identity based upon sequence similarity to known proteins in GenBank. Moreover, 292 ESTs were annotated and GO terms were assigned to 164 of them. These are the first ESTs reported for P. vivax and, as such, they represent a valuable resource to assist in the annotation of the P. vivax genome currently being sequenced. Moreover, since the GC-content of the P. vivax genome is strikingly different from that of P. falciparum, these ESTs will help in the validation of gene predictions for P. vivax and to create a gene index of this malaria parasite.
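The significance criterion in this abstract (an E-value below 10^-30) can be illustrated with a minimal Python sketch. It assumes the similarity-search results are available as BLAST tabular output with the default twelve columns, where the eleventh column is the E-value; the abstract does not say which output format the authors used. The function name `significant_hits` and the three example rows are hypothetical and exist only for illustration.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-30  # threshold stated in the abstract


def significant_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, evalue) tuples whose E-value is below the cutoff.

    Assumes BLAST tabular output with the default 12 columns, in which
    column 11 (index 10) holds the E-value. This format is an assumption,
    not something the abstract specifies.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Hypothetical rows, invented purely to exercise the filter.
EXAMPLE = (
    "est_001\tsubj_A\t98.5\t400\t6\t0\t1\t400\t10\t409\t1e-120\t650\n"
    "est_002\tsubj_B\t85.0\t120\t18\t0\t1\t120\t5\t124\t3e-12\t80\n"
    "est_003\tsubj_C\t92.0\t300\t24\t0\t1\t300\t1\t300\t2e-31\t210\n"
)

if __name__ == "__main__":
    for query, subject, evalue in significant_hits(EXAMPLE):
        print(query, subject, evalue)
```

A strict cutoff of this kind trades sensitivity for confidence: weak but genuine homologs are discarded along with chance matches.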
First Report of a New Isolate of Metarhizium rileyi from Maize Fields of Quivicán, Cuba.
Álvarez, Sandra Pérez; Guerrero, Amaury Méndez; Duarte, Bernardo Nayar Débora; Tapia, Marco Antonio Magallanes; Medina, Jesús Alicia Chávez; Domínguez Rodríguez, Yoannis
2018-06-01
Metarhizium rileyi (Farlow) Samson is an important entomopathogenic fungus of more than 30 species of Lepidoptera larvae. The aim of this research was to characterize an isolate of M. rileyi from Quivicán, Cuba, using morphological and molecular approaches. The fungus was isolated from samples of S. frugiperda larvae collected from maize fields of Quivicán municipality, Mayabeque province, Cuba, and it was cultured on PDA + ampicillin solid media for morphological characterization. The DNA was isolated using the CTAB method, and the internal transcribed spacer primers ITS1 and ITS4 were used for the amplification. The amplified products of 1335 bp were purified and sequenced at CINVESTAV-IPN in both directions using the above primers. A consensus sequence was obtained by alignment of the forward and reverse sequences for this region and deposited in GenBank (MG637450). The fungus produced a slightly cottony colony of pale green color, and dispersed conidia and septate mycelium were observed under the optical microscope. A BLAST search of the sequence in GenBank revealed 99% identity with several strains of N. rileyi (e.g., AF368501.1, AB268359.1 and EU553337.1) and M. rileyi (e.g., KY436756.1). This is the first report of an M. rileyi isolate from maize fields of Quivicán in Cuba; it is important for biodiversity studies and offers another possibility for Integrated Pest Management.
High prevalence of human parvovirus 4 infection in HBV and HCV infected individuals in Shanghai.
Yu, Xuelian; Zhang, Jing; Hong, Liang; Wang, Jiayu; Yuan, Zhengan; Zhang, Xi; Ghildyal, Reena
2012-01-01
Human parvovirus 4 (PARV4) has been detected in blood and diverse tissue samples from HIV/AIDS patients who are injecting drug users. Although B19 virus, the best characterized human parvovirus, has been shown to co-infect patients with hepatitis B or hepatitis C virus (HBV, HCV) infection, the association of PARV4 with HBV or HCV infections is still unknown. The aim of this study was to characterise the association of viruses belonging to PARV4 genotype 1 and 2 with chronic HBV and HCV infection in Shanghai. Serum samples of healthy controls, HCV infected subjects and HBV infected subjects were retrieved from Shanghai Center for Disease Control and Prevention (SCDC) Sample Bank. Parvovirus-specific nested-PCR was performed and results confirmed by sequencing. Sequences were compared with reference sequences obtained from GenBank to derive phylogenetic trees. The frequency of parvovirus molecular detection was 16-22%, 33% and 41% in healthy controls, HCV infected and HBV infected subjects respectively, with PARV4 being the only parvovirus detected. HCV infected and HBV infected subjects had a significantly higher PARV4 prevalence than the healthy population. No statistical difference was found in PARV4 prevalence between HBV or HCV infected subjects. PARV4 sequence divergence within study groups was similar in healthy subjects, HBV or HCV infected subjects. Our data clearly demonstrate that PARV4 infection is strongly associated with HCV and HBV infection in Shanghai but may not cause increased disease severity.
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.
Hiscock, D; Upton, C
2000-05-01
The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a mySQL database with a user-friendly JAVA GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
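Several of the per-gene statistics listed in this abstract (A + T%, dinucleotide frequency, codon use) are simple functions of the DNA sequence. The Python sketch below shows one way such values could be computed; it is not the VGDB implementation, and the function names are hypothetical. It assumes an unambiguous A/C/G/T alphabet and skips any other characters.

```python
from collections import Counter

BASES = set("ACGT")


def at_percent(seq):
    """Percentage of A and T among the unambiguous bases of seq."""
    counted = [b for b in seq.upper() if b in BASES]
    if not counted:
        return 0.0
    at = sum(1 for b in counted if b in "AT")
    return 100.0 * at / len(counted)


def dinucleotide_frequency(seq):
    """Relative frequency of each overlapping dinucleotide in seq."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= BASES]  # drop pairs with ambiguous bases
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


def codon_usage(cds):
    """Count codons in a coding sequence, reading in frame from position 0.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    example = "ATGAAATAA"  # hypothetical toy sequence
    print(at_percent(example))
    print(dinucleotide_frequency(example))
    print(codon_usage(example))
```

Storing such derived values alongside each gene, as the abstract describes, is what allows query results to be sorted by any single parameter.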
USDA-ARS's Scientific Manuscript database
Penicillium species cause postharvest blue mold decay of apple and pear fruits in the United States and around the world. This genus is responsible for severe economic losses and produces an array of mycotoxins that contaminate processed apple products. Among the species that cause blue mold, isolat...
USDA-ARS's Scientific Manuscript database
Penicillium species cause postharvest blue mold decay of apples and pears in the United States and in many countries worldwide. This genus is responsible for severe economic losses and produces an array of mycotoxins that contaminate processed apple products. Among the species that cause blue mold,...
Wang, Xu; Qiu, Yue; Wei, Cong
2016-03-02
One new species of the genus Hyalessa China, H. wangi sp. nov., from Yunnan, China is described. Partial mitochondrial COI gene (DNA barcoding) of this new species is sequenced and uploaded to GenBank. A key to all species of Hyalessa is provided.
Guinea Pig ID-Like Families of SINEs
Kass, David H.; Schaetz, Brian A.; Beitler, Lindsey; Bonney, Kevin M.; Jamison, Nicole; Wiesner, Cathy
2009-01-01
Previous studies have indicated a paucity of SINEs within the genomes of the guinea pig and nutria, representatives of the Hystricognathi suborder of rodents. More recent work has shown that the guinea pig genome contains a large number of B1 elements, expanding to various levels among different rodents. In this work we utilized A–B PCR and screened GenBank with sequences from isolated clones to identify potentially uncharacterized SINEs within the guinea pig genome, and identified numerous sequences with a high degree of similarity (>92%) specific to the guinea pig. The presence of A-tails and flanking direct repeats associated with these sequences supported the identification of a full-length SINE, with a consensus sequence notably distinct from other rodent SINEs. Although most similar to the ID SINE, it clearly was not derived from the known ID master gene (BC1), hence we refer to this element as guinea pig ID-like (GPIDL). Using the consensus to screen the guinea pig genomic database (Assembly CavPor2) with Ensembl BlastView, we estimated at least 100,000 copies, which contrasts markedly to just over 100 copies of ID elements. Additionally we provided evidence of recent integrations of GPIDL as two of seven analyzed conserved GPIDL-containing loci demonstrated presence/absence variants in Cavia porcellus and C. aperea. Using intra-IDL PCR and sequence analyses we also provide evidence that GPIDL is derived from a hystricognath-specific SINE family. These results demonstrate that this SINE family continues to contribute to the dynamics of genomes of hystricognath rodents. PMID:19232383
Guinea pig ID-like families of SINEs.
Kass, David H; Schaetz, Brian A; Beitler, Lindsey; Bonney, Kevin M; Jamison, Nicole; Wiesner, Cathy
2009-05-01
Previous studies have indicated a paucity of SINEs within the genomes of the guinea pig and nutria, representatives of the Hystricognathi suborder of rodents. More recent work has shown that the guinea pig genome contains a large number of B1 elements, expanding to various levels among different rodents. In this work we utilized A-B PCR and screened GenBank with sequences from isolated clones to identify potentially uncharacterized SINEs within the guinea pig genome, and identified numerous sequences with a high degree of similarity (>92%) specific to the guinea pig. The presence of A-tails and flanking direct repeats associated with these sequences supported the identification of a full-length SINE, with a consensus sequence notably distinct from other rodent SINEs. Although most similar to the ID SINE, it clearly was not derived from the known ID master gene (BC1), hence we refer to this element as guinea pig ID-like (GPIDL). Using the consensus to screen the guinea pig genomic database (Assembly CavPor2) with Ensembl BlastView, we estimated at least 100,000 copies, which contrasts markedly to just over 100 copies of ID elements. Additionally we provided evidence of recent integrations of GPIDL as two of seven analyzed conserved GPIDL-containing loci demonstrated presence/absence variants in Cavia porcellus and C. aperea. Using intra-IDL PCR and sequence analyses we also provide evidence that GPIDL is derived from a hystricognath-specific SINE family. These results demonstrate that this SINE family continues to contribute to the dynamics of genomes of hystricognath rodents.
Hasing, Maria E; Hazes, Bart; Lee, Bonita E; Preiksaitis, Jutta K; Pang, Xiaoli L
2014-10-01
Recombination is an important mechanism generating genetic diversity in norovirus (NoV) that occurs commonly at the NoV polymerase-capsid (ORF1/2) junction. The genotyping method based on partial ORF2 sequences currently used to characterize circulating NoV strains in gastroenteritis outbreaks in Alberta cannot detect such recombination events and provides only limited information on NoV genetic evolution. The objective of this study was to determine whether any NoV GII.4 strains causing outbreaks in Alberta are recombinants. Twenty stool samples collected during outbreaks occurring between July 2004 and January 2012 were selected to include the GII.4 variants Farmington Hills 2002, Hunter 2004, Yerseke 2006a, Den Haag 2006b, Apeldoorn 2007, New Orleans 2009, and Sydney 2012 based on previous NoV ORF2-genotyping results. Near full-length NoV genome sequences were obtained, aligned with reference sequences from GenBank and analyzed with RDPv4.13. Two sequences corresponding to Apeldoorn 2007, and Sydney 2012 were identified as recombinants with breakpoints near the ORF1/2 junction and putative parental strains as previously reported. We also identified, for the first time, a non-recombinant sequence resembling the ORF2-3 parent of the recombinant cluster Sydney 2012 responsible for the most recent pandemic. Our results confirmed the presence of recombinant NoV GII.4 strains in Alberta, and highlight the importance of including additional genomic regions in surveillance studies to trace the evolution of pandemic NoV GII.4 strains. Copyright © 2014 Elsevier B.V. All rights reserved.
Ehsan, Muhammad; Akhter, Nasreen; Bhutto, Bachal; Arijo, Abdullah; Ali Gadahi, Javaid
2017-05-30
Cystic echinococcosis is an important zoonotic disease that has serious impacts on animal as well as human health throughout the world. Genotypic characterization of Echinococcus granulosus (E. granulosus) in buffaloes has not been addressed in Pakistan. Therefore, the present study was conducted to evaluate the incidence and genotypic characterization of bovine E. granulosus. Out of 832 buffaloes examined, 112 (13.46%) were found infected. The most common site of hydatid cyst development was the liver (8.65%), followed by the lungs (4.80%). The rate of cystic echinococcosis was higher in females (14.43%) than in males (9.77%). Females older than seven years were more often infected than younger ones. The partial sequence of the mitochondrial cytochrome oxidase 1 (CO1) gene was used for identification and molecular analysis of buffalo E. granulosus isolates. The aligned redundant sequences were compared with the 10 previously identified genotypes available in National Centre for Biotechnology Information (NCBI) GenBank. Sequencing and phylogenetic analysis showed that all randomly selected buffalo isolates belonged to the G1-G3 complex (E. granulosus sensu stricto). All sequences differed from the reference sequence. None showed complete identity to the buffalo strain (G3), indicating substantial microsequence variability in the G1, G2 and G3 genotypes. We evaluated echinococcal infectivity and identified, for the first time, the genotypes present in buffaloes in Sindh, Pakistan. This study will help determine the accurate source of this zoonotic disease in humans in Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
Vesicular monoamine transporter-1 (VMAT-1) mRNA and immunoreactive proteins in mouse brain.
Ashe, Karen M; Chiu, Wan-Ling; Khalifa, Ahmed M; Nicolas, Antoine N; Brown, Bonnie L; De Martino, Randall R; Alexander, Clayton P; Waggener, Christopher T; Fischer-Stenger, Krista; Stewart, Jennifer K
2011-01-01
Vesicular monoamine transporter 1 (VMAT-1) mRNA and protein were examined (1) to determine whether adult mouse brain expresses full-length VMAT-1 mRNA that can be translated to functional transporter protein and (2) to compare immunoreactive VMAT-1 proteins in brain and adrenal. VMAT-1 mRNA was detected in mouse brain with RT-PCR. The cDNA was sequenced, cloned into an expression vector, transfected into COS-1 cells, and cell protein was assayed for VMAT-1 activity. Immunoreactive proteins were examined on western blots probed with four different antibodies to VMAT-1. Sequencing confirmed identity of the entire coding sequences of VMAT-1 cDNA from mouse medulla oblongata/pons and adrenal to a GenBank reference sequence. Transfection of the brain cDNA into COS-1 cells resulted in transporter activity that was blocked by the VMAT inhibitor reserpine and a proton ionophore, but not by tetrabenazine, which has a high affinity for VMAT-2. Antibodies to either the C- or N- terminus of VMAT-1 detected two proteins (73 and 55 kD) in transfected COS-1 cells. The C-terminal antibodies detected both proteins in extracts of mouse medulla/pons, cortex, hypothalamus, and cerebellum but only the 73 kD protein and higher molecular weight immunoreactive proteins in mouse adrenal and rat PC12 cells, which are positive controls for rodent VMAT-1. These findings demonstrate that a functional VMAT-1 mRNA coding sequence is expressed in mouse brain and suggest processing of VMAT-1 protein differs in mouse adrenal and brain.
Ren, Lipin; Chen, Wei; Shang, Yanjie; Meng, Fanming; Zha, Lagabaiyila; Wang, Yong; Guo, Yadong
2018-05-17
Muscid flies (Diptera: Muscidae) are of great forensic importance due to their wide distribution and their ubiquitous and synanthropic nature. They are frequently neglected as they tend to arrive at corpses later than the flesh flies and blow flies. Moreover, the lack of species-level identification also hinders investigation for medicolegal purposes. To overcome the difficulty of morphological identification, molecular methods have gained relevance. The cytochrome c oxidase subunit I (COI) gene has been widely utilized. Nonetheless, to achieve correct identification of an unknown sample, it is important to survey certain muscid taxa from across their geographic distribution range. Accordingly, the aim of this study is to contribute more geographically specific data. We sequenced the COI gene of 51 muscid specimens of 12 species, and added all correct sequences available in GenBank to yield a total data set of 125 COI sequences from 33 muscid species to evaluate the COI gene as a molecular diagnostic tool. The interspecific distances were extremely high (4.7-19.8%) in both the standard barcoding fragment (658 bp) and the long COI sequence (1,019-1,535 bp), demonstrating that these two genetic markers performed nearly identically in species identification. However, the intraspecific distances of the long COI sequences were significantly higher than those of the barcoding region for conspecific specimens from widely separated geographical locations. Therefore, the genetic diversity presented in this study provides a reference for species identification of muscid flies. Nevertheless, further investigation and data from more muscid species are required to enhance the efficacy of species-level identification using COI gene as a genetic marker.
Frías-De-León, María Guadalupe; Ramírez-Bárcenas, José Antonio; Rodríguez-Arellanes, Gabriela; Velasco-Castrejón, Oscar; Taylor, Maria Lucia; Reyes-Montes, María Del Rocío
2017-03-01
Histoplasmosis is considered the most important systemic mycosis in Mexico, and its diagnosis requires fast and reliable methodologies. The present study evaluated the usefulness of PCR using Hcp100 and 1281-1283 (220) molecular markers in detecting Histoplasma capsulatum in occupational and recreational outbreaks. Seven clinical serum samples of infected individuals from three different histoplasmosis outbreaks were processed by enzyme-linked immunosorbent assay (ELISA) to titre anti-H. capsulatum antibodies and to extract DNA. Fourteen environmental samples were also processed for H. capsulatum isolation and DNA extraction. Both clinical and environmental DNA samples were analysed by PCR with Hcp100 and 1281-1283 (220) markers. Antibodies to H. capsulatum were detected by ELISA in all serum samples using specific antigens, and in six of these samples, the PCR products of both molecular markers were amplified. Four environmental samples amplified one of the two markers, but only one sample amplified both markers and an isolate of H. capsulatum was cultured from this sample. All PCR products were sequenced, and the sequences for each marker were analysed using the Basic Local Alignment Search Tool (BLASTn), which revealed 95-98 and 98-100 % similarities with the reference sequences deposited in the GenBank for Hcp100 and 1281-1283 (220) , respectively. Both molecular markers proved to be useful in studying histoplasmosis outbreaks because they are matched for pathogen detection in either clinical or environmental samples.
Low, V L; Lim, P E; Chen, C D; Lim, Y A L; Tan, T K; Norma-Rashid, Y; Lee, H L; Sofian-Azirun, M
2014-06-01
The present study explored the intraspecific genetic diversity, dispersal patterns and phylogeographic relationships of Culex quinquefasciatus Say (Diptera: Culicidae) in Malaysia using reference data available in GenBank in order to reveal this species' phylogenetic relationships. A statistical parsimony network of 70 taxa aligned as 624 characters of the cytochrome c oxidase subunit I (COI) gene and 685 characters of the cytochrome c oxidase subunit II (COII) gene revealed three haplotypes (A1-A3) and four haplotypes (B1-B4), respectively. The concatenated sequences of both COI and COII genes with a total of 1309 characters revealed seven haplotypes (AB1-AB7). Analysis using tcs indicated that haplotype AB1 was the common ancestor and the most widespread haplotype in Malaysia. The genetic distance based on concatenated sequences of both COI and COII genes ranged from 0.00076 to 0.00229. Sequence alignment of Cx. quinquefasciatus from Malaysia and other countries revealed four haplotypes (AA1-AA4) by the COI gene and nine haplotypes (BB1-BB9) by the COII gene. Phylogenetic analyses demonstrated that Malaysian Cx. quinquefasciatus share the same genetic lineage as East African and Asian Cx. quinquefasciatus. This study has inferred the genetic lineages, dispersal patterns and hypothetical ancestral genotypes of Cx. quinquefasciatus. © 2013 The Royal Entomological Society.
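The genetic distances quoted in this abstract can be related to the alignment length with a short Python sketch. It assumes the simplest measure, the uncorrected p-distance (differing sites divided by aligned sites); the abstract does not state which distance model was used, so this is an illustration rather than a reconstruction of the authors' analysis. Under that assumption, one and three differences across 1309 aligned characters give values that round to 0.00076 and 0.00229, the same as the reported range. The toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: proportion of aligned sites that differ.

    Both sequences must already be aligned and of equal, non-zero length.
    Gap handling is deliberately omitted in this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical toy alignment of 1309 characters, matching the length of the
# concatenated COI + COII alignment mentioned in the abstract.
ALIGNMENT_LENGTH = 1309
reference = "A" * ALIGNMENT_LENGTH
one_change = "C" + "A" * (ALIGNMENT_LENGTH - 1)
three_changes = "CCC" + "A" * (ALIGNMENT_LENGTH - 3)

if __name__ == "__main__":
    print(round(p_distance(reference, one_change), 5))
    print(round(p_distance(reference, three_changes), 5))
```

Distances this small correspond to haplotypes separated by only a handful of substitutions, which fits the low intraspecific diversity the abstract describes.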
Aquatic environmental DNA detects seasonal fish abundance and habitat preference in an urban estuary
Soboleva, Lyubov; Charlop-Powers, Zachary
2017-01-01
The difficulty of censusing marine animal populations hampers effective ocean management. Analyzing water for DNA traces shed by organisms may aid assessment. Here we tested aquatic environmental DNA (eDNA) as an indicator of fish presence in the lower Hudson River estuary. A checklist of local marine fish and their relative abundance was prepared by compiling 12 traditional surveys conducted between 1988–2015. To improve eDNA identification success, 31 specimens representing 18 marine fish species were sequenced for two mitochondrial gene regions, boosting coverage of the 12S eDNA target sequence to 80% of local taxa. We collected 76 one-liter shoreline surface water samples at two contrasting estuary locations over six months beginning in January 2016. eDNA was amplified with vertebrate-specific 12S primers. Bioinformatic analysis of amplified DNA, using a reference library of GenBank and our newly generated 12S sequences, detected most (81%) locally abundant or common species and relatively few (23%) uncommon taxa, and corresponded to seasonal presence and habitat preference as determined by traditional surveys. Approximately 2% of fish reads were commonly consumed species that are rare or absent in local waters, consistent with wastewater input. Freshwater species were rarely detected despite Hudson River inflow. These results support further exploration and suggest eDNA will facilitate fine-scale geographic and temporal mapping of marine fish populations at relatively low cost. PMID:28403183
Subtype Distribution of Blastocystis Isolates in Sebha, Libya
Abdulsalam, Awatif M.; Ithoi, Init; Al-Mekhlafi, Hesham M.; Al-Mekhlafi, Abdulsalam M.; Ahmed, Abdulhamid; Surin, Johari
2013-01-01
Background: Blastocystis is a genetically diverse and common intestinal parasite of humans with a controversial pathogenic potential. This study was carried out to identify the Blastocystis subtypes and their association with demographic and socioeconomic factors among outpatients living in Sebha city, Libya. Methods/Findings: Blastocystis in stool samples were cultured, followed by isolation, PCR amplification of a partial SSU rDNA gene, cloning, and sequencing. The DNA sequences of isolated clones showed 98.3% to 100% identity with the reference Blastocystis isolates from GenBank. Multiple sequence alignment showed polymorphism of one to seven base substitutions and/or insertions/deletions in several groups of non-identical nucleotide clones. Phylogenetic analysis revealed three assemblage subtypes (ST) with ST1 as the most prevalent (51.1%) followed by ST2 (24.4%), ST3 (17.8%) and mixed infections of two concurrent subtypes (6.7%). Blastocystis ST1 infection was significantly associated with female gender (P = 0.009) and low educational level (P = 0.034). ST2 was also significantly associated with low educational level (P = 0.008) and ST3 with diarrhoea (P = 0.008). Conclusion: Phylogenetic analysis of Libyan Blastocystis isolates identified three different subtypes, with ST1 being the predominant subtype; ST1 infection was significantly associated with female gender and low educational level. More extensive studies are needed in order to relate each Blastocystis subtype with clinical symptoms and potential transmission sources in this community. PMID:24376805
Identification of a novel astrovirus in domestic sheep in Hungary.
Reuter, Gábor; Pankovics, Péter; Delwart, Eric; Boros, Ákos
2012-02-01
The family Astroviridae consists of two genera, Avastrovirus and Mamastrovirus, whose members are associated with gastroenteritis in avian and mammalian hosts, respectively. We serendipitously identified a novel ovine astrovirus in a fecal specimen from a domestic sheep (Ovis aries) in Hungary by viral metagenomic analysis. Sequencing of the fragment indicated that it was an ORF1b/ORF2/3'UTR sequence, and it has been submitted to the GenBank database as ovine astrovirus type 2 (OAstV-2/Hungary/2009) with accession number JN592482. The unique sequence characteristics and the phylogenetic position of OAstV-2 suggest that genetically divergent lineages of astroviruses exist in sheep.
Vera-Cabrera, L; Johnson, W M; Welsh, O; Resendiz-Uresti, F L; Salinas-Carmona, M C
1999-06-01
An immunodominant protein from Nocardia brasiliensis, P61, was subjected to amino-terminal and internal sequence analysis. Three sequences of 22, 17, and 38 residues, respectively, were obtained and compared with the protein database from GenBank by using the BLAST system. The sequences showed homology to some eukaryotic catalases and to a bromoperoxidase-catalase from Streptomyces violaceus. Its identity as a catalase was confirmed by analysis of its enzymatic activity on H2O2 and by a double-staining method on a nondenaturing polyacrylamide gel with 3,3'-diaminobenzidine and ferricyanide; the result showed only catalase activity, but no peroxidase. By using one of the internal amino acid sequences and a consensus catalase motif (VGNNTP), we were able to design a PCR assay that generated a 500-bp PCR product. The amplicon was analyzed, and the nucleotide sequence was compared to the GenBank database with the observation of high homology to other bacterial and eukaryotic catalases. A PCR assay based on this target sequence was performed with primers NB10 and NB11 to confirm the presence of the NB10-NB11 gene fragment in several N. brasiliensis strains isolated from mycetoma. The same assay was used to determine whether there were homologous sequences in several type strains from the genera Nocardia, Rhodococcus, Gordona, and Streptomyces. All of the N. brasiliensis strains presented a positive result but only some of the actinomycetes species tested were positive in the PCR assay. In order to confirm these findings, genomic DNA was subjected to Southern blot analysis. A 1.7-kbp band was observed in the N. brasiliensis strains, and bands of different molecular weight were observed in cross-reacting actinomycetes. Sequence analysis of the amplicons of selected actinomycetes showed high homology in this catalase fragment, thus demonstrating that this protein is highly conserved in this group of bacteria.
Wei, Feng; Song, Mingxin; Liu, Huanhuan; Wang, Bo; Wang, Shuchao; Wang, Zedong; Ma, Hongyu; Li, Zhongyu; Zeng, Zheng; Qian, Jun; Liu, Quan
2016-01-01
Tick-borne diseases are considered emerging infectious diseases in humans and animals in China. In this study, Ixodes persulcatus (n = 1699), Haemaphysalis concinna (n = 412), Haemaphysalis longicornis (n = 390), Dermacentor nuttalli (n = 253), and Dermacentor silvarum (n = 204) ticks were collected by flagging from northeastern China, and tested for infection with Anaplasma, Ehrlichia, Babesia, and Hepatozoon spp. by using nested polymerase chain reaction assays and sequencing analysis. Anaplasma phagocytophilum was detected in all tick species, i.e., I. persulcatus (9.4%), H. longicornis (1.9%), H. concinna (6.5%), D. nuttalli (1.7%), and D. silvarum (2.3%); Anaplasma bovis was detected in H. longicornis (0.3%) and H. concinna (0.2%); Ehrlichia muris was detected in I. persulcatus (2.5%) and H. concinna (0.2%); Candidatus Neoehrlichia mikurensis was only detected in I. persulcatus (0.4%). The Ehrlichia variant (GenBank accession number KU921424), closely related to Ehrlichia ewingii, was found in H. longicornis (0.8%) and H. concinna (0.2%). I. persulcatus was infected with Babesia venatorum (1.2%), Babesia microti (0.6%), and Babesia divergens (0.6%). Additionally, four Babesia sequence variants (GenBank accession numbers 862303–862306) were detected in I. persulcatus, H. longicornis, and H. concinna, which belonged to the clusters formed by the parasites of dogs, sheep, and cattle (B. gibsoni, B. motasi, and B. crassa). Two Hepatozoon spp. (GenBank accession numbers KX016028 and KX016029) associated with hepatozoonosis in Japanese martens were found in the collected ticks (0.1–3.1%). These findings showed the genetic variability of Anaplasma, Ehrlichia, Babesia, and Hepatozoon spp. circulating in ticks in northeastern China, highlighting the necessity for further research of these tick-associated pathogens and their role in human and animal diseases. PMID:27965644
Rousselet, Estelle; Stacy, Nicole I; Rotstein, David S; Waltzek, Tom B; Griffin, Matt J; Francis-Floyd, Ruth
2018-06-08
This report describes a case of systemic bacterial infection caused by Edwardsiella tarda in a Western African lungfish (Protopterus annectens) exposed to poor environmental and husbandry conditions. The fish presented with a large, external ulcerative lesion and died 2 weeks after developing anorexia. Histological evaluation revealed multifocal areas of necrosis and heterophilic and histiocytic inflammation throughout multiple tissues. Gram stain identified small numbers of intra- and extracellular monomorphic Gram-negative 1 to 2 μm rod-shaped bacilli. Cytology of lung granuloma, kidney and testes imprints identified heterophilic inflammation with phagocytosis of small monomorphic bacilli and some heterophils exhibiting cytoplasmic projections indicative of heterophil extracellular traps (HETs). Initial phenotypic analysis of isolates from coelomic fluid cultures identified E. tarda. Subsequent molecular analysis of spleen, liver and intestine DNA using an E. tarda-specific endpoint PCR assay targeting the bacterial fimbrial subunit yielded a 115 bp band. Sequencing and BLASTN search revealed the sequence was identical (76/76) to E. tarda strain FL95-01 (GenBank acc. CP011359) and displayed 93% sequence identity (66/71) to Edwardsiella hoshinae strain ATCC 35051 (GenBank acc. CP011359). This is the first report of systemic edwardsiellosis in a lungfish with concurrent cytologically identified structures suggestive of HETs. © 2018 John Wiley & Sons Ltd.
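The identity figures quoted above (76/76 and 66/71) are ratios of matching positions to alignment length. A minimal Python sketch of that arithmetic follows; the function name and the toy sequences are illustrative assumptions and not part of the report, and gap columns are counted as mismatches, which is one common convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> tuple[int, int, float]:
    """Return (matches, alignment_length, percent) for two aligned sequences.

    Both inputs must already be aligned to equal length. A column containing
    a gap ('-') counts toward the alignment length but never as a match
    (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    length = len(aligned_a)
    return matches, length, 100.0 * matches / length


# Toy alignment (hypothetical sequences): one mismatch in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))

# The reported 66/71 ratio rounds to the 93% quoted in the abstract.
print(round(100 * 66 / 71))
```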
Pilgrim, B L; Perry, R C; Barron, J L; Marshall, H D
2012-09-26
Levels and patterns of mitochondrial DNA (mtDNA) variation were examined to investigate the population structure and possible routes of postglacial recolonization of the world's northernmost native populations of brook trout (Salvelinus fontinalis), which are found in Labrador, Canada. We analyzed the sequence diversity of a 1960-bp portion of the mitochondrial genome (NADH dehydrogenase 1 gene and part of cytochrome oxidase 1) of 126 fish from 32 lakes distributed throughout seven regions of northeastern Canada. These populations were found to have low levels of mtDNA diversity, a characteristic trait of populations at northern extremes, with significant structuring at the level of the watershed. Upon comparison of northeastern brook trout sequences to the publicly available brook trout whole mitochondrial genome (GenBank AF154850), we infer that the GenBank sequence is from a fish whose mtDNA has recombined with that of Arctic charr (S. alpinus). The haplotype distribution provides evidence of two different postglacial founding groups contributing to present-day brook trout populations in the northernmost part of their range; the evolution of the majority of the haplotypes coincides with the timing of glacier retreat from Labrador. Our results exemplify the strong influence that historical processes such as glaciations have had on shaping the current genetic structure of northern species such as the brook trout.
Al-Hosary, Amira A T
2017-03-01
Ticks and tick-borne diseases are the main problems affecting livestock production in Egypt. Bovine babesiosis has adverse effects on animal health and production. Giemsa-stained blood smears, polymerase chain reaction (PCR) and nested PCR (nPCR) assays were compared for detection of Babesia bovis infection in Egyptian Baladi cattle (Bos taurus), with reverse line blot as the reference. The sensitivities of the PCR and nPCR assays were 65% and 100%, respectively. Giemsa-stained blood smears showed the lowest sensitivity (30%). According to these results, PCR and nPCR assays targeting the B. bovis BBOV-IV005650 (BV5650) gene are suitable for diagnosis of B. bovis infection. The 18S rRNA partial sequences confirmed that all the positive samples were Babesia bovis, and they were deposited in GenBank (Accession Nos. KM455548, KM455549 and KM455550).
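Sensitivity here is the fraction of reference-positive animals that an assay also calls positive. The Python sketch below shows the calculation; the abstract does not give the underlying counts, so the figure of 20 reference-positive animals and the per-assay counts are hypothetical values chosen only because they reproduce the reported percentages.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity = TP / (TP + FN), measured against a reference test."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no reference-positive samples")
    return true_positives / total


# Hypothetical counts (not from the study): 20 animals positive by the
# reference method, and the number of those each assay detected.
reference_positive = 20
detected = {"Giemsa smear": 6, "PCR": 13, "nPCR": 20}

for assay, tp in detected.items():
    fn = reference_positive - tp
    print(f"{assay}: {100 * sensitivity(tp, fn):.0f}%")
```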
Zoonotic Onchocerca lupi Infection in Dogs, Greece and Portugal, 2011–2012
Dantas-Torres, Filipe; Giannelli, Alessio; Latrofa, Maria Stefania; Papadopoulos, Elias; Cardoso, Luís; Cortes, Helder
2013-01-01
Onchocerca lupi infection is reported primarily in symptomatic dogs. We aimed to determine the occurrence of infection in dogs from areas of Greece and Portugal with reported cases. Of 107 dogs, 9 (8%) were skin snip–positive for the parasite. DNA sequences of parasites in specimens from distinct dog populations differed genetically from those in GenBank. PMID:24274145
USDA-ARS's Scientific Manuscript database
A growing interest in the biological control of locusts and grasshoppers (Acrididae) has led to the development of biopesticides based on naturally occurring pathogens, which offer an environmentally safe alternative to chemical pesticides. However, the fungal strains which are being sought for biop...
NCBI-compliant genome submissions: tips and tricks to save time and money.
Pirovano, Walter; Boetzer, Marten; Derks, Martijn F L; Smit, Sandra
2017-03-01
Genome sequences nowadays play a central role in molecular biology and bioinformatics. These sequences are shared with the scientific community through sequence databases. The sequence repositories of the International Nucleotide Sequence Database Collaboration (INSDC, comprising GenBank, ENA and DDBJ) are the largest in the world. Preparing an annotated sequence in such a way that it will be accepted by the database is challenging because many validation criteria apply. In our opinion, it is an undesirable situation that researchers who want to submit their sequence need either a lot of experience or help from partners to get the job done. To save valuable time and money, we list a number of recommendations for people who want to submit an annotated genome to a sequence database, as well as for tool developers, who could help to ease the process. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Van Neste, Christophe; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip
2016-01-01
It is difficult to predict if and when massively parallel sequencing of forensic STR loci will replace capillary electrophoresis as the new standard technology in forensic genetics. The main benefits of sequencing are increased multiplexing scales and SNP detection. There is not yet a consensus on how sequenced profiles should be reported. We present the Forensic Loci Allele Database (FLAD) service, made freely available on http://forensic.ugent.be/FLAD/. It offers permanent identifiers for sequenced forensic alleles (STR or SNP) and their microvariants for use in forensic allele nomenclature. Analogous to GenBank, its aim is to provide permanent identifiers for forensically relevant allele sequences. Researchers who are developing forensic sequencing kits or performing population studies can register on http://forensic.ugent.be/FLAD/ and add loci and allele sequences with a short and simple application programming interface (API). Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Genomes: At the edge of chaos with maximum information capacity
NASA Astrophysics Data System (ADS)
Kong, Sing-Guan; Chen, Hong-Da; Torda, Andrew; Lee, H. C.
2016-12-01
We propose an order index, ϕ, which quantifies the notion of “life at the edge of chaos” when applied to genome sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length and base composition. The 786 complete genomic sequences in GenBank were found to have ϕ values in a very narrow range, 0.037 ± 0.027. We show this implies that genomes are halfway towards being completely random, namely, at the edge of chaos. We argue that this narrow range represents the neighborhood of a fixed-point in the space of sequences, and genomes are driven there by the dynamics of a robust, predominantly neutral evolution process.
Study of infectious diseases in archaeological bone material - A dataset.
Pucu, Elisa; Cascardo, Paula; Chame, Marcia; Felice, Gisele; Guidon, Niéde; Cleonice Vergne, Maria; Campos, Guadalupe; Roberto Machado-Silva, José; Leles, Daniela
2017-08-01
Bones of human and ground sloth remains were analyzed for the presence of Trypanosoma cruzi by conventional PCR using primers TC, TC1 and TC2. PCR amplified fragments of the product sizes expected for the primers (300 and 350 bp). The amplified PCR product was sequenced and compared against GenBank using BLAST. Although these sequences did not match the parasite, they showed high amplification with species of bacteria. This article presents the methodology used and the alignment of the sequences. The display of this dataset will allow further analysis of our results and the discussion presented in the manuscript "Finding the unexpected: a critical view on molecular diagnosis of infectious diseases in archaeological samples" (Pucu et al. 2017) [1].
Clarification of the Concept of Ganoderma orbiforme with High Morphological Plasticity
Wang, Dong-Mei; Wu, Sheng-Hua; Yao, Yi-Jian
2014-01-01
Ganoderma has been considered a very difficult genus among the polypores to classify and is currently in a state of taxonomic chaos. In a study of Ganoderma collections including numerous type specimens, we found that six species, namely G. cupreum, G. densizonatum, G. limushanense, G. mastoporum, G. orbiforme, G. subtornatum, and records of G. fornicatum from Mainland China and Taiwan are very similar to one another in basidiocarp texture, pilear cuticle structure, context color, pore color and basidiospore characteristics. Further, we sequenced the nrDNA ITS region (ITS1 and ITS2) and partial mtDNA SSU region of the studied materials, and performed phylogenetic analyses based on these sequence data. The nrDNA ITS sequence analysis results show that the eight nrDNA ITS sequences derived from this study have single-nucleotide polymorphisms in ITS1 and/or ITS2 at inter- and intra-individual levels. In the nrDNA ITS phylogenetic trees, all the sequences from this study are grouped together with those of G. cupreum and G. mastoporum retrieved from GenBank to form a distinct clade. The mtDNA SSU sequence analysis results reveal that the five mtDNA SSU sequences derived from this study are clustered together with those of G. cupreum retrieved from GenBank and also form a distinct clade in the mtDNA SSU phylogenetic trees. Based on morphological and molecular data, we conclude that the studied taxa are conspecific. Among the names assigned to this species, G. fornicatum given to Asian collections has nomenclatural priority over the others. However, the type of G. fornicatum from Brazil is probably lost and a modern description based on the type is lacking. The identification of the Asian collections as G. fornicatum therefore cannot be confirmed. To the best of our knowledge, G. orbiforme is the earliest valid name for use. PMID:24875218
Cloning of K26 Hydrophilic Antigen from Iranian Strain of Leishmania infantum
HOSSEINI FARASH, Bibi Razieh; MOHEBALI, Mehdi; KAZEMI, Bahram; HAJJARAN, Homa; AKHOUNDI, Behnaz; RAOOFIAN, Reza; FATA, Abdolmajid; MOJARRAD, Majid; SHARIFI-YAZDI, Mohammad Kazem
2017-01-01
Background: Visceral leishmaniasis (VL) caused by Leishmania infantum is the most severe form of leishmaniasis in Iran, and it causes a high mortality rate in cases of inaccurate diagnosis and treatment. This study aimed to clone the K26 gene from an Iranian strain of L. infantum and register the sequencing results in GenBank to facilitate the preparation of a new K26 antigen for the detection of L. infantum infection. Methods: L. infantum was obtained from an infected domestic dog in the Meshkin-Shahr area of northwestern Iran in 2015. Canine visceral leishmaniasis was confirmed by direct agglutination test (DAT), rK39 dipstick and parasitological methods. L. infantum was confirmed by N-acetyl glucosamine-1-phosphate transferase (nagt)–PCR and its sequencing. The band of interest for k26 from the Iranian strain of L. infantum was purified with a gel extraction kit after PCR amplification and then ligated into pBluescript II SK (+) and pET-32a (+), respectively. The sequences of the recombinant plasmids were analyzed and submitted to GenBank. Results: The rk26 nucleotide sequence was submitted to the GenBank/NCBI database under accession number KY212883. The gene showed about 99% homology to the L. chagasi and L. infantum k26 genes, while the level of homology with different strains of L. donovani ranged from 84–94%. Conclusion: The successful cloning of rk26 into an expression vector performed in this study could help to produce a new recombinant antigen for serodiagnosis of VL, especially in areas where L. infantum is the main causative agent. PMID:29308379
WATTHANAKAIWAN, Vichan; SUKMAK, Manakorn; HAMARIT, Kriengsak; KAOLIM, Nongnid; WAJJWALKU, Worawidh; MUANGKRAM, Yuttamol
2017-01-01
Sarcocystis species are heteroxenous cyst-forming coccidian protozoan parasites with a wide host range, including rodents. In this study, Sarcocystis spp. samples were isolated from Bandicota indica, Rattus argentiventer, R. tiomanicus and R. norvegicus across five provinces of Thailand. Two major groups of Sarcocystis cysts were determined in this study: large and small cysts. By sequence comparisons and phylogenetic analyses based on the partial sequences of 28S ribosomal DNA, the large cysts showed the highest identity value (99%) with S. zamani in the GenBank database. The small cysts could be divided into 2 groups of Sarcocystis: S. singaporensis and presumed S. zuoi. Further analysis of 18S rDNA showed that the 2 isolates identified as S. singaporensis (S2 and B6 no.2) shared a high sequence identity with S. singaporensis in the GenBank database, and that the unidentified Sarcocystis (4 isolates, i.e., B6 no.10, B6 no.12, B10 no.4 and B10 no.7) showed 96.3–99.5% identity to S. zuoi and were clearly distinct from other Sarcocystis spp. (≤93%). The result indicated that these four samples should be S. zuoi. In this study, we provide the complete sequences of internal transcribed spacer 1 (ITS1), 5.8S rDNA and internal transcribed spacer 2 (ITS2) of these three Sarcocystis species, and our new primer set could be useful to study the evolution of Sarcocystis. PMID:28701623
Wang, Xu-Hua; Wang, Yong; Zhang, De-Bao; Liu, A-Ke; Yao, Qin; Chen, Ke-Ping
2014-01-01
Basic helix-loop-helix (bHLH) proteins comprise a large superfamily of transcription factors, which are involved in the regulation of various developmental processes. bHLH family members are widely distributed in various eukaryotes including yeast, fruit fly, zebrafish, mouse, and human. In this study, we identified 55 bHLH motifs encoded in the genome sequence of the human body louse, Pediculus humanus corporis (Phthiraptera: Pediculidae). Phylogenetic analyses of the identified P. humanus corporis bHLH (PhcbHLH) motifs revealed that there are 23, 11, 9, 1, 10, and 1 member(s) in groups A, B, C, D, E, and F, respectively. Examination of GenBank annotations of the 55 PhcbHLH members indicated that 29 PhcbHLH proteins were annotated consistently with our analytical result, 8 were annotated differently from our analytical result, 12 were merely annotated as hypothetical protein, and the remaining 6 were not deposited in GenBank. A comparison of insect bHLH gene composition revealed that the human body louse possibly has more hairy and E(spl) genes than other insect species. Because hairy and E(spl) genes have been found to negatively regulate the differentiation of insect preneural cells, it is suggested that the existence of additional hairy and E(spl) genes in the human body louse is probably the consequence of its long-term adaptation to a relatively dark and stable environment. These data provide good references for further studies on regulatory functions of bHLH proteins in the growth and development of the human body louse. © The Author 2014. Published by Oxford University Press on behalf of the Entomological Society of America.
[Symptomatic Black Queen Cell Virus infection of drone brood in Hessian apiaries].
Siede, Reinhold; Büchler, Ralph
2003-01-01
The Black Queen Cell Virus (BQCV) can affect brood of the honey bee (Apis mellifera). In general, queen cells are at risk, showing dark-coloured cell walls as a typical symptom. Worker and drone brood can be infected by BQCV, but normally without clinical symptoms. This paper describes for the first time a symptomatic BQCV infection of diseased drone brood found in two bee yards in Hessen, Germany, in 2001. The drone larvae were seriously damaged and some of them were dead. Samples of the affected brood were tested for BQCV by the PCR detection method. A BQCV-specific nucleic acid fragment was found. The PCR product was sequenced and aligned with the relevant GenBank entry. At the nucleic acid level as well as at the deduced protein level, the isolate showed a high similarity with the South African isolate recorded in GenBank.
Zhang, Yang; Zhu, Zhen; Xu, Qi; Chen, Guohong
2014-01-07
Primers based on the cDNA sequence of the goose growth hormone (GH) gene in GenBank were designed to amplify exon 2 of the GH gene in Huoyan goose. A total of 552 individuals were brooded in one batch and raised in Liaoning and Jiangsu Provinces, China. Single nucleotide polymorphisms (SNPs) of exon 2 in the GH gene were detected by the polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) method. Homozygotes were subsequently cloned, sequenced and analyzed. Two SNP mutations were detected, and 10 genotypes (referred to as AA, BB, CC, DD, AB, AC, AD, BC, BD and CD) were obtained. Allele D was predominant, and the frequencies of the 10 genotypes fit the Hardy-Weinberg equilibrium in the male, female and whole populations according to the chi-square test. Based on SNP types, the 10 genotypes were combined into three main genotypes. Multiple comparisons were carried out between different genotypes and production traits when the geese were 10 weeks old. Some indices of production performance were significantly (p < 0.05) associated with the genotype. In particular, geese with genotype AB or BB were highly productive. Thus, these genotypes may serve as selection markers for production traits in Huoyan geese.
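The Hardy-Weinberg check mentioned above compares observed genotype counts with the counts expected from the allele frequencies. The Python sketch below illustrates a standard multi-allele chi-square calculation; it is not the authors' procedure, and the function name and the genotype counts in the usage lines are hypothetical. Genotypes are keyed by two-letter labels such as "AB", following the labels in the abstract.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotype_counts: dict[str, int]) -> tuple[float, int]:
    """Return (chi-square statistic, degrees of freedom) for a HWE test.

    genotype_counts maps two-letter genotypes (e.g. "AA", "AB") to counts.
    Degrees of freedom = number of possible genotypes - number of alleles.
    """
    n = sum(genotype_counts.values())
    if n == 0:
        raise ValueError("no genotypes supplied")

    # Each individual contributes two alleles.
    allele_counts: Counter[str] = Counter()
    for genotype, count in genotype_counts.items():
        for allele in genotype:
            allele_counts[allele] += count
    freqs = {a: c / (2 * n) for a, c in allele_counts.items() if c > 0}
    alleles = sorted(freqs)

    chi2 = 0.0
    for a, b in combinations_with_replacement(alleles, 2):
        if a == b:
            expected = freqs[a] ** 2 * n              # homozygote: p^2 * n
            observed = genotype_counts.get(a + b, 0)
        else:
            expected = 2 * freqs[a] * freqs[b] * n    # heterozygote: 2pq * n
            observed = genotype_counts.get(a + b, 0) + genotype_counts.get(b + a, 0)
        chi2 += (observed - expected) ** 2 / expected

    k = len(alleles)
    return chi2, k * (k - 1) // 2


# Hypothetical counts for illustration only (not data from the study).
counts = {"AA": 12, "AB": 30, "BB": 20, "AD": 40, "BD": 55, "DD": 90,
          "AC": 8, "BC": 10, "CD": 25, "CC": 4}
statistic, dof = hwe_chi_square(counts)
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```

The statistic is then compared with the chi-square distribution for the returned degrees of freedom; with four alleles and ten genotypes that is six.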
Zhou, Jindan; Rudd, Kenneth E.
2013-01-01
EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySQL relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to GenBank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection. PMID:23197660
Miya, Masaki; Friedman, Matt; Satoh, Takashi P.; Takeshima, Hirohiko; Sado, Tetsuya; Iwasaki, Wataru; Yamanoue, Yusuke; Nakatani, Masanori; Mabuchi, Kohji; Inoue, Jun G.; Poulsen, Jan Yde; Fukunaga, Tsukasa; Sato, Yukuto; Nishida, Mutsumi
2013-01-01
Uncertainties surrounding the evolutionary origin of the epipelagic fish family Scombridae (tunas and mackerels) are symptomatic of the difficulties in resolving suprafamilial relationships within Percomorpha, a hyperdiverse teleost radiation that contains approximately 17,000 species placed in 13 ill-defined orders and 269 families. Here we find that scombrids share a common ancestry with 14 families based on (i) bioinformatic analyses using partial mitochondrial and nuclear gene sequences from all percomorphs deposited in GenBank (10,733 sequences) and (ii) subsequent mitogenomic analysis based on 57 species from those targeted 15 families and 67 outgroup taxa. Morphological heterogeneity among these 15 families is so extraordinary that they have been placed in six different perciform suborders. However, members of the 15 families are either coastal or oceanic pelagic in their ecology with diverse modes of life, suggesting that they represent a previously undetected adaptive radiation in the pelagic realm. Time-calibrated phylogenies imply that scombrids originated from a deep-ocean ancestor and began to radiate after the end-Cretaceous when large predatory epipelagic fishes were selective victims of the Cretaceous-Paleogene mass extinction. We name this clade of open-ocean fishes containing Scombridae “Pelagia” in reference to the common habitat preference that links the 15 families. PMID:24023883
Genetic diversity and vector transmission of phytoplasmas associated with sesame phyllody in Iran.
Salehi, M; Esmailzadeh Hosseini, S A; Salehi, E; Bertaccini, A
2017-03-01
During 2010-14 surveys in the major sesame growing areas of Fars, Yazd and Isfahan provinces (Iran), genetic diversity and vector transmission of phytoplasmas associated with sesame phyllody were studied. Virtual RFLP, phylogenetic, and DNA homology analyses of partial 16S ribosomal sequences of phytoplasma strains associated with symptomatic plants revealed the presence of phytoplasmas referable to three ribosomal subgroups, 16SrII-D, 16SrVI-A, and 16SrIX-C. The same analyses using 16S rDNA sequences from sesame phyllody-associated phytoplasmas retrieved from GenBank database showed the presence of phytoplasmas clustering with strains in the same subgroups in other Iranian provinces including Bushehr and Khorasan Razavi. Circulifer haematoceps and Orosius albicinctus, known vectors of the disease in Iran, were tested for transmission of the strains identified in this study. C. haematoceps transmitted 16SrII-D, 16SrVI-A, and 16SrIX-C phytoplasmas, while O. albicinctus only transmitted 16SrII-D strains. Based on the results of the present study and considering the reported presence of phytoplasmas belonging to the same ribosomal subgroups in other crops, sesame fields probably play an important role in the epidemiology of other diseases associated with these phytoplasmas in Iran.
The Importance of Biological Databases in Biological Discovery.
Baxevanis, Andreas D; Bateman, Alex
2015-06-19
Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
Limited evidence of intercontinental dispersal of avian paramyxovirus serotype 4 by migratory birds
Reeves, Andrew; Poulson, Rebecca L.; Muzyka, Denys; Ogawa, Haruko; Imai, Kunitoshi; Nghia Bui, Vuong; Hall, Jeffrey S.; Pantin-Jackwood, Mary; Stallknecht, David E.; Ramey, Andrew M.
2016-01-01
Avian paramyxovirus serotype 4 (APMV-4) is a single-stranded RNA virus that has most often been isolated from waterfowl. Limited information has been reported regarding the prevalence, pathogenicity, and genetic diversity of APMV-4. To assess the intercontinental dispersal of this viral agent, we sequenced the fusion gene of 58 APMV-4 isolates collected in the United States, Japan and Ukraine and compared them to all available sequences on GenBank. With only a single exception, the phylogenetic clades of APMV-4 sequences were monophyletic with respect to their continents of origin (North America, Asia and Europe). Thus, we detected limited evidence for recent intercontinental dispersal of APMV-4 in this study.
Jill E. Petrisko; Christopher A. Pearl; David S. Pilliod; Peter P. Sheridan; Charles F. Williams; Charles R. Peterson; R. Bruce Bury
2008-01-01
We assessed the diversity and phylogeny of Saprolegniaceae on amphibian eggs from the Pacific Northwest, with particular focus on Saprolegnia ferax, a species implicated in high egg mortality. We identified isolates from eggs of six amphibians with the internal transcribed spacer (ITS) and 5.8S gene regions and BLAST of the GenBank database. We...
Fusarium musae as cause of superficial and deep-seated human infections.
Esposto, M C; Prigitano, A; Tortorano, A M
2016-12-01
BLAST analysis in GenBank of 60 Fusarium verticillioides clinical isolates using the sequence of translation elongation factor 1-alpha allowed the identification of four F. musae confirming that this species is not a rare etiology of superficial and deep infections and that its habitat is not restricted to banana fruits. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Choi, Young Jin; Park, Kwi Sung; Baek, Kyoung Ah; Jung, Eun Hye; Nam, Hae Seon; Kim, Yong Bae; Park, Joon Soo
2010-03-01
Evaluation of the primary etiologic agents that cause aseptic meningitis outbreaks may provide valuable information regarding the prevention and management of aseptic meningitis. In Korea, an outbreak of aseptic meningitis caused by echovirus type 30 (E30) occurred from May to October in 2008. In order to determine the etiologic agent, CSF and/or stool specimens from 140 children hospitalized for aseptic meningitis at Soonchunhyang University Cheonan Hospital between June and October of 2008 were tested for virus isolation and identification. E30 accounted for 61.7% (37 cases) and echovirus 6 accounted for 21.7% (13 cases) of all the human enterovirus (HEV) isolates (60 cases in total). For the molecular characterization of the isolates, the VP1 gene sequences of 18 Korean E30 isolates were compared pairwise, using MegAlign, with 34 reference strains from the GenBank database. The pairwise comparison of the nucleotide sequences of the VP1 genes demonstrated that the sequences of the Korean strains differed from those of lineage groups A, B, C, D, E, F and G. Reconstruction of the phylogenetic tree based on the complete VP1 nucleotide sequences resulted in a monophyletic tree, with eight clustered lineage groups. All Korean isolates were segregated from other lineage groups, thus suggesting that the Korean strains were a distinct lineage of E30, and a probable cause of this outbreak. This manuscript is the first report, to the best of our knowledge, of the molecular characteristics of E30 strains associated with an aseptic meningitis outbreak in Korea, and their respective phylogenetic relationships.
Mirhendi, H; Ghiasian, A; Vismer, Hf; Asgary, Mr; Jalalizand, N; Arendrup, Mc; Makimura, K
2010-01-01
Fusarium species are capable of causing a wide range of crop plant infections as well as uncommon human infections. Many species of the genus produce mycotoxins, which are responsible for acute or chronic diseases in animals and humans. Identification of Fusaria to the species level is necessary for biological, epidemiological, pathological, and toxicological purposes. In this study, we undertook a computer-based analysis of ITS1-5.8SrDNA-ITS2 in 192 GenBank sequences from 36 Fusarium species to obtain data for establishing a molecular method for species-specific identification. Sequence data and 610 restriction enzymes were analyzed to choose RFLP profiles, and we subsequently designed and validated a PCR-restriction enzyme system for identification and typing of species. DNA extracted from 32 reference strains of 16 species was amplified using the ITS1 and ITS4 universal primers, followed by sequencing and restriction enzyme digestion of the PCR products. The three restriction enzymes TasI, ItaI and CfoI provided the best discriminatory power. Using the ITS1 and ITS4 primers, a product of approximately 550 bp was observed for all Fusarium strains, as expected from the sequence analyses. After RFLP of the PCR products, some species were definitely identified by the method, and some strains had different patterns within the same species. Our profile has potential not only for identification of species, but also for genotyping of strains. On the other hand, some Fusarium species were 100% identical in their ITS-5.8SrDNA-ITS2 sequences; differentiation of these species is therefore impossible using this target alone. The ITS-PCR-RFLP method might be useful for preliminary differentiation and typing of the most common Fusarium species.
Genotypic analysis of Mucor from the platypus in Australia.
Connolly, J H; Stodart, B J; Ash, G J
2010-01-01
Mucor amphibiorum is the only pathogen known to cause significant morbidity and mortality in the free-living platypus (Ornithorhynchus anatinus) in Tasmania. Infection has also been reported in free-ranging cane toads (Bufo marinus) and green tree frogs (Litoria caerulea) from mainland Australia but has not been confirmed in platypuses from the mainland. To date, there has been little genotyping specifically conducted on M. amphibiorum. A collection of 21 Mucor isolates representing isolates from the platypus, frogs and toads, and environmental samples were obtained for genotypic analysis. Internal transcribed spacer (ITS) region sequencing and GenBank comparison confirmed the identity of most of the isolates. Representative isolates from infected platypuses formed a clade containing the reference isolates of M. amphibiorum from the Centraal Bureau voor Schimmelcultures repository. The M. amphibiorum isolates showed a close sequence identity with Mucor indicus and consisted of two haplotypes, differentiated by single nucleotide polymorphisms within the ITS1 and ITS2 regions. With the exception of isolate 96-4049, all isolates from platypuses were in one haplotype. Multilocus fingerprinting via the use of intersimple sequence repeats polymerase chain reaction identified 19 genotypes. Two major clusters were evident: 1) M. amphibiorum and Mucor racemosus; and 2) Mucor circinelloides, Mucor ramosissimus, and Mucor fragilis. Seven M. amphibiorum isolates from platypuses were present in two subclusters, with isolate 96-4053 appearing genetically distinct from all other isolates. Isolates classified as M. circinelloides by sequence analysis formed a separate subcluster, distinct from other Mucor spp. The combination of sequencing and multilocus fingerprinting has the potential to provide the tools for rapid identification of M. amphibiorum. 
Data presented on the diversity of the pathogen and further work in linking genetic diversity to functional diversity will provide critical information for its management in Tasmanian river systems.
Ferchichi, M; Valcheva, R; Prévost, H; Onno, B; Dousset, X
2008-06-01
Species-specific primers targeting the 16S-23S ribosomal DNA (rDNA) intergenic spacer region (ISR) were designed to rapidly discriminate between Lactobacillus mindensis, Lactobacillus panis, Lactobacillus paralimentarius, Lactobacillus pontis and Lactobacillus frumenti species recently isolated from French sourdough. The 16S-23S ISRs were amplified using primers 16S/p2 and 23S/p7, which anneal to positions 1388-1406 of the 16S rRNA gene and to positions 207-189 of the 23S rRNA gene respectively, Escherichia coli numbering (GenBank accession number V00331). Clone libraries of the resulting amplicons were constructed using a pCR2.1 TA cloning kit and sequenced. Species-specific primers were designed based on the sequences obtained and were used to amplify the 16S-23S ISR in the Lactobacillus species considered. For all of them, two PCR amplicons, designated as small ISR (S-ISR) and large ISR (L-ISR), were obtained. The L-ISR is composed of the corresponding S-ISR, interrupted by a sequence containing tRNA(Ile) and tRNA(Ala) genes. Based on these sequences, species-specific primers were designed and proved to identify accurately the species considered among 30 reference Lactobacillus species tested. Designed species-specific primers enable a rapid and accurate identification of L. mindensis, L. paralimentarius, L. panis, L. pontis and L. frumenti species among other lactobacilli. The proposed method provides a powerful and convenient means of rapidly identifying some sourdough lactobacilli, which could be of help in large starter culture surveys.
Failla, A J; Vasquez, A A; Hudson, P; Fujimoto, M; Ram, J L
2016-02-01
Establishing reliable methods for the identification of benthic chironomid communities is important due to their significant contribution to biomass, ecology and the aquatic food web. Immature larval specimens are more difficult to identify to species level by traditional morphological methods than their fully developed adult counterparts, and few keys are available to identify the larval species. In order to develop molecular criteria to identify species of chironomid larvae, larval and adult chironomids from Western Lake Erie were subjected to both molecular and morphological taxonomic analysis. Mitochondrial cytochrome c oxidase I (COI) barcode sequences of 33 adults that were identified to species level by morphological methods were grouped with COI sequences of 189 larvae in a neighbor-joining taxon-ID tree. Most of these larvae could be identified only to genus level by morphological taxonomy (only 22 of the 189 sequenced larvae could be identified to species level). The taxon-ID tree of larval sequences had 45 operational taxonomic units (OTUs, defined as clusters with >97% identity or individual sequences differing from nearest neighbors by >3%; supported by analysis of all larval pairwise differences), of which seven could be identified to species or 'species group' level by larval morphology. Reference sequences from the GenBank and BOLD databases assigned six larval OTUs with presumptive species level identifications and confirmed one previously assigned species level identification. Sequences from morphologically identified adults in the present study grouped with and further classified the identity of 13 larval OTUs. The use of morphological identification and subsequent DNA barcoding of adult chironomids proved to be beneficial in revealing possible species level identifications of larval specimens. 
Sequence data from this study also contribute to currently inadequate public databases relevant to the Great Lakes region, while the neighbor-joining analysis reported here describes the application and confirmation of a useful tool that can accelerate identification and bioassessment of chironomid communities.
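The OTU definition quoted in this abstract (clusters with >97% identity) can be illustrated with a small sketch. The study itself grouped sequences in a neighbor-joining taxon-ID tree; the version below swaps in plain single-linkage clustering on pre-aligned, equal-length sequences, which is an assumption made only for illustration. The function names `identity` and `cluster_otus` and the sample labels are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def cluster_otus(seqs, threshold=0.97):
    """Single-linkage clustering: join any two sequences whose identity
    exceeds the threshold (an illustrative stand-in for a tree-based method)."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if identity(seqs[a], seqs[b]) > threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Toy data: larva1 and larva2 differ at 2 of 100 sites, larva3 at 10 of 100.
example = {
    "larva1": "A" * 100,
    "larva2": "A" * 98 + "CC",
    "larva3": "A" * 90 + "C" * 10,
}
otus = cluster_otus(example)
```

With these toy sequences, larva1 and larva2 fall into one OTU and larva3 forms its own. Single linkage can chain distant sequences together through intermediates, which is one reason a tree-based grouping may give different clusters on real data.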
Failla, Andrew Joseph; Vasquez, Adrian Amelio; Hudson, Patrick L.; Fujimoto, Masanori; Ram, Jeffrey L.
2016-01-01
Establishing reliable methods for the identification of benthic chironomid communities is important due to their significant contribution to biomass, ecology and the aquatic food web. Immature larval specimens are more difficult to identify to species level by traditional morphological methods than their fully developed adult counterparts, and few keys are available to identify the larval species. In order to develop molecular criteria to identify species of chironomid larvae, larval and adult chironomids from Western Lake Erie were subjected to both molecular and morphological taxonomic analysis. Mitochondrial cytochrome c oxidase I (COI) barcode sequences of 33 adults that were identified to species level by morphological methods were grouped with COI sequences of 189 larvae in a neighbor-joining taxon-ID tree. Most of these larvae could be identified only to genus level by morphological taxonomy (only 22 of the 189 sequenced larvae could be identified to species level). The taxon-ID tree of larval sequences had 45 operational taxonomic units (OTUs, defined as clusters with >97% identity or individual sequences differing from nearest neighbors by >3%; supported by analysis of all larval pairwise differences), of which seven could be identified to species or ‘species group’ level by larval morphology. Reference sequences from the GenBank and BOLD databases assigned six larval OTUs with presumptive species level identifications and confirmed one previously assigned species level identification. Sequences from morphologically identified adults in the present study grouped with and further classified the identity of 13 larval OTUs. The use of morphological identification and subsequent DNA barcoding of adult chironomids proved to be beneficial in revealing possible species level identifications of larval specimens. 
Sequence data from this study also contribute to currently inadequate public databases relevant to the Great Lakes region, while the neighbor-joining analysis reported here describes the application and confirmation of a useful tool that can accelerate identification and bioassessment of chironomid communities.
Zhang, Yong; Hong, Mei; Sun, Qiang; Zhu, Shuangli; Tsewang; Li, Xiaolei; Yan, Dongmei; Wang, Dongyan; Xu, Wenbo
2014-04-01
Molecular methods, based on sequencing the region encoding the complete VP1 or P1 protein, have enabled the rapid identification of new enterovirus serotypes. In the present study, the complete genome of a newly discovered enterovirus serotype, strain Q0011/XZ/CHN/2000 (hereafter referred to as Q0011), was sequenced and analyzed. The virus, isolated from a stool sample from a patient with acute flaccid paralysis in the Tibet region of China in 2000, was characterized by amplicon sequencing and comparison to a GenBank database of enterovirus nucleotide sequences. The nucleotide sequence encoding the complete VP1 capsid protein is most closely related to the sequences of viruses within the species enterovirus B (EV-B), but is less than 72.1% identical to the homologous sequences of the recognized human enterovirus serotypes, with the greatest homology to EV-B101 and echovirus 32. Moreover, the deduced amino acid sequence of the complete VP1 region is less than 84.7% identical to those of the recognized serotypes, suggesting that the strain is a new serotype of enterovirus within EV-B. The virus was characterized as a new enterovirus type, named EV-B111, by the Picornaviridae Study Group of the International Committee on Taxonomy of Viruses. Low positive rate and titer of neutralizing antibody against EV-B111 were found in the Tibet region of China. Nearly 50% of children ≤5 years had no neutralizing antibody against EV-B111. So the extent of transmission and the exposure of the population to this new EV are very limited. This is the first identification of a new serotype of human enterovirus in China, and strain Q0011 was designated the prototype strain of EV-B111. Copyright © 2014 Elsevier B.V. All rights reserved.
Linear and Nonlinear Statistical Characterization of DNA
NASA Astrophysics Data System (ADS)
Norio Oiwa, Nestor; Goldman, Carla; Glazier, James
2002-03-01
We find spatial order in the distribution of protein-coding (including RNAs) and control segments of GenBank genomic sequences, irrespective of ATCG content. This is achieved using correlations, histograms, fractal dimensions and singularity spectra. Estimates of these quantities in complete nuclear genomes indicate that coding sequences are long-range correlated and that their disposition is self-similar (multifractal) in eukaryotes. These characteristics are absent in prokaryotes, where there are few noncoding sequences, suggesting that 'junk' DNA plays a relevant role in genome structure and function. Concerning the genetic message of ATCG sequences, we build a random walk (Levy flight), using DNA symmetry arguments, in which we associate A, T, C and G with left, right, down and up steps, respectively. Nonlinear analysis of mitochondrial DNA walks reveals a multifractal pattern based on palindromic sequences, which fold into hairpins and loops.
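The base-to-step mapping described in this abstract (A, T, C and G as left, right, down and up) can be sketched as a plain two-dimensional walk. This shows only the mapping; it does not reproduce the Levy-flight construction or the multifractal analysis, and the names `STEPS` and `dna_walk` are hypothetical.

```python
# Step assigned to each base: A = left, T = right, C = down, G = up.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, -1), "G": (0, 1)}


def dna_walk(seq):
    """Return the list of (x, y) positions visited, starting at the origin.

    Characters other than A, T, C, G (for example N) are skipped,
    which is an assumption made for this sketch.
    """
    x, y = 0, 0
    path = [(x, y)]
    for base in seq.upper():
        step = STEPS.get(base)
        if step is None:
            continue
        x += step[0]
        y += step[1]
        path.append((x, y))
    return path


walk = dna_walk("ATCGGA")
end_point = walk[-1]
```

For the toy input "ATCGGA" the walk ends one step left of and one step above the origin, because the two A steps outweigh the single T and the two G steps outweigh the single C.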
Oliveira, L.P.; Cardozo, G.P.; Santos, E.V.; Mansur, M.A.B.; Donini, I.A.N.; Zissou, V.G.; Roberto, P.G.; Marins, M.
2009-01-01
The partial DNA sequences of the 18S rRNA gene of Babesia canis and the 16S rRNA gene of Ehrlichia canis detected in dogs from Ribeirão Preto, Brazil, were compared to sequences from other strains deposited in GenBank. The E. canis strain circulating in Ribeirão Preto is identical to other strains previously detected in the region, whereas the subspecies Babesia canis vogeli is the main Babesia strain circulating in dogs from Ribeirão Preto. PMID:24031351
Ormeño-Orrillo, Ernesto; Rey, Luis; Durán, David; Canchaya, Carlos A; Rogel, Marco A; Zúñiga-Dávila, Doris; Imperial, Juan; Ruiz-Argüeso, Tomás; Martínez-Romero, Esperanza
2017-09-01
Bradyrhizobium paxllaeri is a prevalent species in root nodules of the Lima bean (Phaseolus lunatus) in Peru. LMTR 21 T is the type strain of the species and was isolated from a root nodule collected in an agricultural field on the Peruvian central coast. Its 8.29 Mbp genome encoded 7635 CDS, 71 tRNA genes and 3 rRNA genes. All genes required to establish a nitrogen-fixing symbiosis with its host were present. The draft genome sequence and annotation have been deposited at GenBank under the accession number MAXB00000000.
NASA Astrophysics Data System (ADS)
Kraljić, K.; Strüngmann, L.; Fimmel, E.; Gumbel, M.
2018-01-01
The genetic code is degenerate, and it is assumed that redundancy provides error detection and correction mechanisms in the translation process. However, the biological meaning of the code's structure is still under investigation. This paper presents a Genetic Code Analysis Toolkit (GCAT) which provides workflows and algorithms for the analysis of the structure of nucleotide sequences. In particular, sets or sequences of codons can be transformed and tested for circularity, comma-freeness, dichotomic partitions and others. GCAT comes with a fertile editor custom-built to work with the genetic code and a batch mode for multi-sequence processing. With the ability to read FASTA files or load sequences from GenBank, the tool can be used for the mathematical and statistical analysis of existing sequence data. GCAT is Java-based and provides a plug-in concept for extensibility. Availability: Open source. Homepage: http://www.gcat.bio/
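One of the properties mentioned here, comma-freeness, has a compact standard definition: a set of trinucleotides is comma-free if, for every ordered pair of codons from the set, neither shifted reading frame of their concatenation yields a codon of the set. The sketch below checks that definition directly. It is not GCAT's code (GCAT is Java-based), and the function name `is_comma_free` is hypothetical.

```python
from itertools import product


def is_comma_free(codons):
    """Check comma-freeness of a set of trinucleotides.

    For every ordered pair (x, y), including x == y, the two shifted
    frames of the six-letter string x + y must not contain a codon
    that belongs to the set.
    """
    code = {c.upper() for c in codons}
    if any(len(c) != 3 for c in code):
        raise ValueError("all codons must have length 3")
    for x, y in product(code, repeat=2):
        pair = x + y
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


single = is_comma_free({"ACG"})           # frames of ACGACG: CGA, GAC
repeat = is_comma_free({"AAA"})           # AAAAAA contains AAA in every frame
shifted = is_comma_free({"ACG", "CGA"})   # ACGACG contains CGA in frame 1
```

In these toy cases the single codon ACG is comma-free, while AAA and the pair ACG/CGA are not, because a shifted frame reproduces a member of the set.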
Deng, Ke-Jun; Yang, Zu-Jun; Liu, Cheng; Zhao, Wei; Liu, Chang; Feng, Juan; Ren, Zheng-Long
2007-03-01
The genetic characterization of 9 populations of Rhodiola crenulata, R. fastigiata and R. sachalinensis (Crassulaceae) from Sichuan and Jilin Provinces of China was investigated using the conserved primer of nad7 intron 2. All PCR products were about 800 bp long, shorter than those of other Crassulaceae plants, and were used as molecular markers to identify the Rhodiola species. The sequences of the products indicated that the total exon of 53 bp and intron of 738 bp exhibit only 9 nucleotide variations. BLAST searches of the nad7 sequences against GenBank and the phylogenetic analysis showed that the sequences of the Rhodiola species clustered independently, and that their length was smaller than that of all the registered sequences of higher plants. The result suggests that the Rhodiola species have a unique sequence in this gene region, which might be related to their special growth conditions.
Liao, Jing; Chao, Zhi; Zhang, Liang
2013-11-01
To identify the common snakes in medicated liquor of Guangdong using the COI barcode sequence, and to test the feasibility of the method. The COI barcode sequences of collected medicinal snakes were amplified and sequenced. The sequences, combined with data from GenBank, were analyzed for divergence and used to build a neighbor-joining (NJ) tree with MEGA 5.0. The genetic distances and NJ tree demonstrated that there were 241 variable sites in these species, and the average (A + T) content of 56.2% was higher than the average (G + C) content of 43.7%. The maximum interspecific genetic distance was 0.2568, and the minimum was 0.1519. In the NJ tree, each species formed a monophyletic clade with bootstrap support of 100%. The DNA barcoding identification method based on the COI sequence is accurate and can be applied to identify the common medicinal snakes.
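The two kinds of numbers reported in this abstract, base composition and pairwise genetic distance, can be illustrated with a short sketch. The abstract does not state which distance model was used in MEGA 5.0, so the uncorrected p-distance below is an assumption chosen for simplicity; the function names and toy sequences are hypothetical.

```python
def base_composition(seq):
    """Return (A+T fraction, G+C fraction), ignoring non-ACGT characters."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous bases in sequence")
    at = sum(b in "AT" for b in counted)
    n = len(counted)
    return at / n, (n - at) / n


def p_distance(a, b):
    """Uncorrected p-distance: the proportion of differing sites among
    aligned positions where both sequences have an unambiguous base."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    diffs = sum(x != y for x, y in pairs)
    return diffs / len(pairs)


at_frac, gc_frac = base_composition("AATTGC")
dist = p_distance("ACGT", "ACGA")
```

Model-corrected distances (for example, Kimura two-parameter, common in barcoding work) are larger than the p-distance for divergent sequences, so values from this sketch would not be directly comparable to the distances quoted in the abstract.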
Genome sequence of the oleaginous yeast Rhodotorula toruloides strain CGMCC 2.1609.
Sambles, Christine; Middelhaufe, Sabine; Soanes, Darren; Kolak, Dagmara; Lux, Thomas; Moore, Karen; Matoušková, Petra; Parker, David; Lee, Rob; Love, John; Aves, Stephen J
2017-09-01
Most eukaryotic oleaginous species are yeasts and among them the basidiomycete red yeast, Rhodotorula (Rhodosporidium) toruloides (Pucciniomycotina) is known to produce high quantities of lipids when grown in nitrogen-limiting media, and has potential for biodiesel production. The genome of the CGMCC 2.1609 strain of this oleaginous red yeast was sequenced using a hybrid of Roche 454 and Illumina technology generating 13× coverage. The de novo assembly was carried out using MIRA and scaffolded using MAQ and BAMBUS. The sequencing and assembly resulted in 365 scaffolds with total genome size of 33.4 Mb. The complete genome sequence of this strain was deposited in GenBank and the accession number is LKER00000000. The annotation is available on Figshare (doi:10.6084/m9.figshare.4754251).
Ribeiro, José M. C.; Schwarz, Alexandra; Francischetti, Ivo M. B.
2015-01-01
Saliva of blood-sucking arthropods contains a complex cocktail of pharmacologically active compounds that assists feeding by counteracting their hosts’ hemostatic and inflammatory reactions. Panstrongylus megistus (Burmeister) is an important vector of Chagas disease in South America, but despite its importance there is only one salivary protein sequence publicly deposited in GenBank. In the present work, we used Illumina technology to disclose and publicly deposit 3,703 coding sequences obtained from the assembly of >70 million reads. These sequences should assist proteomic experiments aimed at identifying pharmacologically active proteins and immunological markers of vector exposure. A supplemental file of the transcriptome and deduced protein sequences can be obtained from http://exon.niaid.nih.gov/transcriptome/P_megistus/Pmeg-web.xlsx. PMID:26334808
[Study on correlation between ITS sequence of Arctium lappa and quality of Fructus Arctii].
Xu, Liang; Dou, Deqiang; Wang, Bing; Yang, Yanyun; Kang, Tingguo
2011-07-01
To study the correlation between the ITS sequence of Arctium lappa and the quality of Fructus Arctii of different origins. Samples of Fructus Arctii were collected from 26 different producing areas. Their ITS sequences were determined after polymerase chain reaction (PCR), and quality was evaluated through determination of arctiin content by HPLC. Genetic diversity, genotype and correlation were analyzed with ClustalX (1.81), MEGA 4.0 and SPSS 13.0 statistical software. ITS sequences of A. lappa were obtained from the 26 samples and registered in GenBank. The corresponding arctiin content of Fructus Arctii and 1000-grain weight were determined. Statistical analysis showed that A. lappa genotype correlated with Fructus Arctii quality. The research provides a foundation for revealing the molecular mechanism of Fructus Arctii geoherbs.
Šišić, Adnan; Al-Hatmi, Abdullah M S; Baćanović-Šišić, Jelena; Ahmed, Sarah A; Dennenmoser, Dominic; de Hoog, G Sybren; Finckh, Maria R
2018-03-22
Two new species in the Fusarium solani species complex (FSSC) are described and introduced. The new taxa are represented by German isolates CBS 142481 and CBS 142480 collected from commercial yard waste compost and vascular tissue of a wilting branch of hibiscus, respectively. The phylogenetic relationships of the collected strains to one another and within the FSSC were evaluated based on DNA sequences of 6 gene loci. Due to the limited sequence data available for reference strains in GenBank, however, a multi-gene phylogenetic analysis included partial sequences for the internal transcribed spacer region and intervening 5.8S nrRNA gene (ITS), translation elongation factor 1-alpha (tef1) and the RNA polymerase II second largest subunit (rpb2). Morphological and molecular phylogenetic data independently showed that these strains are distinct populations of the FSSC, nested within Clade 3. Thus, we introduce Fusarium stercicola and Fusarium witzenhausenense as novel species in the complex. In addition, 19 plant species of 7 legume genera were evaluated for their potential to host the newly described taxa. Eighteen plant species were successfully colonized, with 6 and 9 of these being symptomatic hosts for F. stercicola and F. witzenhausenense, respectively. As plants of the family Fabaceae are very distant to the originally sourced material from which the new taxa were recovered, our results suggest that F. stercicola and F. witzenhausenense are not host-specific and are ecologically fit to sustain stable populations in variety of habitats.
Genomic analysis of the Chinese genotype 1F rubella virus that disappeared after 2002 in China.
Zhu, Zhen; Chen, Min-Hsin; Abernathy, Emily; Zhou, Shujie; Wang, Changyin; Icenogle, Joseph; Xu, Wenbo
2014-12-01
Genotype 1F was likely localized geographically to China as it has not been reported elsewhere. In this study, whole genome sequences of two rubella 1F virus isolates were completed. Both viruses contained 9,761 nt with a single nucleotide deletion in the intergenic region, compared to the NCBI rubella reference sequence (NC_001545). No evidence of recombination was found between 1F and other rubella viruses. The genetic distance between 1F viruses and 10 other rubella virus genotypes (1a, 1B, 1C, 1D, 1E, 1G, 1J, 2A, 2B, and 2C) ranged from 3.9% to 8.6% by pairwise comparison. A region known to be hypervariable in other rubella genotypes was also the most variable region in the 1F genomes. Comparisons to all available rubella virus sequences from GenBank identified 22 nucleotide variations exclusively in 1F viruses. Among these unique variations, C9306U, which is located within the recommended molecular window for rubella virus genotype assignment, could be useful to confirm 1F viruses. Using the Bayesian Markov Chain Monte Carlo (MCMC) method, the time of the most recent common ancestor for the genotype 1F was estimated between 1976 and 1995. Recent rubella molecular surveillance suggests that this indigenous strain may have circulated for less than three decades, as it has not been detected since 2002. © 2014 Wiley Periodicals, Inc.
Identification and nomenclature of the genus Penicillium.
Visagie, C M; Houbraken, J; Frisvad, J C; Hong, S-B; Klaassen, C H W; Perrone, G; Seifert, K A; Varga, J; Yaguchi, T; Samson, R A
2014-06-01
Penicillium is a diverse genus occurring worldwide and its species play important roles as decomposers of organic materials and cause destructive rots in the food industry where they produce a wide range of mycotoxins. Other species are considered enzyme factories or are common indoor air allergens. Although DNA sequences are essential for robust identification of Penicillium species, there is currently no comprehensive, verified reference database for the genus. To coincide with the move to one fungus one name in the International Code of Nomenclature for algae, fungi and plants, the generic concept of Penicillium was re-defined to accommodate species from other genera, such as Chromocleista, Eladia, Eupenicillium, Torulomyces and Thysanophora, which together comprise a large monophyletic clade. As a result of this, and the many new species described in recent years, it was necessary to update the list of accepted species in Penicillium. The genus currently contains 354 accepted species, including new combinations for Aspergillus crystallinus, A. malodoratus and A. paradoxus, which belong to Penicillium section Paradoxa. To add to the taxonomic value of the list, we also provide information on each accepted species MycoBank number, living ex-type strains and provide GenBank accession numbers to ITS, β-tubulin, calmodulin and RPB2 sequences, thereby supplying a verified set of sequences for each species of the genus. In addition to the nomenclatural list, we recommend a standard working method for species descriptions and identifications to be adopted by laboratories working on this genus.
Identification and nomenclature of the genus Penicillium
Visagie, C.M.; Houbraken, J.; Frisvad, J.C.; Hong, S.-B.; Klaassen, C.H.W.; Perrone, G.; Seifert, K.A.; Varga, J.; Yaguchi, T.; Samson, R.A.
2014-01-01
Penicillium is a diverse genus occurring worldwide and its species play important roles as decomposers of organic materials and cause destructive rots in the food industry where they produce a wide range of mycotoxins. Other species are considered enzyme factories or are common indoor air allergens. Although DNA sequences are essential for robust identification of Penicillium species, there is currently no comprehensive, verified reference database for the genus. To coincide with the move to one fungus one name in the International Code of Nomenclature for algae, fungi and plants, the generic concept of Penicillium was re-defined to accommodate species from other genera, such as Chromocleista, Eladia, Eupenicillium, Torulomyces and Thysanophora, which together comprise a large monophyletic clade. As a result of this, and the many new species described in recent years, it was necessary to update the list of accepted species in Penicillium. The genus currently contains 354 accepted species, including new combinations for Aspergillus crystallinus, A. malodoratus and A. paradoxus, which belong to Penicillium section Paradoxa. To add to the taxonomic value of the list, we also provide information on each accepted species MycoBank number, living ex-type strains and provide GenBank accession numbers to ITS, β-tubulin, calmodulin and RPB2 sequences, thereby supplying a verified set of sequences for each species of the genus. In addition to the nomenclatural list, we recommend a standard working method for species descriptions and identifications to be adopted by laboratories working on this genus. PMID:25505353
USDA-ARS's Scientific Manuscript database
Plant β-1,3-glucanase is commonly found to be involved in disease resistance. A β-1,3-glucanase gene was isolated from both the genomic DNA and cDNA of peanut variety Huayu20 by PCR and RT-PCR, respectively (GenBank Accession No. JQ801335). The genomic DNA sequence was 1,471 bp including two ext...
[Expression and analysis of the nucleoprotein of paramyxovirus Tianjin strain].
Wang, Qing; Li, Mei; Shi, Li-Ying; Yuan, Li-Jun; Wang, Wen-Xiu
2008-05-01
Paramyxovirus Tianjin strain is a novel virus strain that causes fatal infection in common cotton-eared marmosets. To investigate the relationship between the gene structure and function of the nucleoprotein (NP) of the Tianjin strain, the NP gene was cloned and three domains of NP were expressed. Homology and phylogenetic analyses of NP sequences were performed for the Tianjin strain and eight strains of Sendai virus from GenBank. The recombinant proteins NP1, NP2 and NP3 showed native antigenicity to polyclonal antiserum against the Tianjin strain, ranked NP3>NP1>NP2. The NP nucleotide and deduced amino acid sequences of the Tianjin strain and Sendai virus BB1 strain showed 94.5% and 96.2% homology, respectively, whereas identities between the Tianjin strain and the other 7 Sendai virus strains from GenBank were 85.1%-88.7% and 92.4%-94.7%, respectively. The Tianjin strain NP protein carried 15 unique amino acid substitutions and 11 substitutions shared with the BB1 strain. These results suggest that paramyxovirus Tianjin strain might be a new genotype of Sendai virus and may help in establishing a detection assay that uses recombinant NP as the antigen instead of whole virions.
PipeOnline 2.0: automated EST processing and functional data sorting.
Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A
2002-11-01
Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.
Bouadjenek, Mohamed Reda; Verspoor, Karin; Zobel, Justin
2017-07-01
We investigate and analyse the data quality of nucleotide sequence databases with the objective of automatic detection of data anomalies and suspicious records. Specifically, we demonstrate that the published literature associated with each data record can be used to automatically evaluate its quality, by cross-checking the consistency of the key content of the database record with the referenced publications. Focusing on GenBank, we describe a set of quality indicators based on the relevance paradigm of information retrieval (IR). Then, we use these quality indicators to train an anomaly detection algorithm to classify records as "confident" or "suspicious". Our experiments on the PubMed Central collection show assessing the coherence between the literature and database records, through our algorithms, is an effective mechanism for assisting curators to perform data cleansing. Although fewer than 0.25% of the records in our data set are known to be faulty, we would expect that there are many more in GenBank that have not yet been identified. By automated comparison with literature they can be identified with a precision of up to 10% and a recall of up to 30%, while strongly outperforming several baselines. While these results leave substantial room for improvement, they reflect both the very imbalanced nature of the data, and the limited explicitly labelled data that is available. Overall, the obtained results show promise for the development of a new kind of approach to detecting low-quality and suspicious sequence records based on literature analysis and consistency. From a practical point of view, this will greatly help curators in identifying inconsistent records in large-scale sequence databases by highlighting records that are likely to be inconsistent with the literature. Copyright © 2017 Elsevier Inc. All rights reserved.
Vidigal, Pedro M P; Mafra, Claudio L; Silva, Fernanda M F; Fietto, Juliana L R; Silva Júnior, Abelardo; Almeida, Márcia R
2012-01-01
Porcine circovirus-2 (PCV-2) is an emerging virus associated with a number of different syndromes in pigs known as Porcine Circovirus Associated Diseases (PCVAD). Since its identification and characterization in the early 1990s, PCV-2 has achieved a worldwide distribution, becoming endemic in most pig-producing countries, and is currently considered as the main cause of losses on pig farms. In this study, we analyzed the main routes of the spread of PCV-2 between pig-producing countries using phylogenetic and phylogeographical approaches. A search for PCV-2 genome sequences in GenBank was performed, and the 420 PCV-2 sequences obtained were grouped into haplotypes (group of sequences that showed 100% identity), based on the infinite sites model of genome evolution. A phylogenetic hypothesis was inferred by Bayesian Inference for the classification of viral strains and a haplotype network was constructed by Median Joining to predict the geographical distribution of and genealogical relationships between haplotypes. In order to establish an epidemiological and economic context in these analyses, we considered all information about PCV-2 sequences available in GenBank, including papers published on viral isolation, and live pig trading statistics available on the UN Comtrade database (http://comtrade.un.org/). In these analyses, we identified a strong correlation between the means of PCV-2 dispersal predicted by the haplotype network and the statistics on the international trading of live pigs. This correlation provides a new perspective on the epidemiology of PCV-2, highlighting the importance of the movement of animals around the world in the emergence of new pathogens, and showing the need for effective sanitary barriers when trading live animals. Copyright © 2011 Elsevier B.V. All rights reserved.
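The haplotype definition used in the abstract above (a group of sequences showing 100% identity) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name and the toy sequences are hypothetical, and the sketch assumes the sequences are already aligned to equal length so that identical strings mean identical genomes.

```python
from collections import defaultdict


def group_haplotypes(sequences):
    """Group sequence IDs whose (aligned) sequences are 100% identical.

    `sequences` maps a sequence ID to its nucleotide string.
    Returns a list of ID lists, one list per haplotype.
    """
    groups = defaultdict(list)
    for seq_id, seq in sequences.items():
        # Case is normalized so 'acgt' and 'ACGT' count as the same haplotype.
        groups[seq.upper()].append(seq_id)
    # Deterministic order: largest haplotype first, ties broken by first ID.
    return sorted(groups.values(), key=lambda ids: (-len(ids), ids[0]))


# Toy data (hypothetical, not PCV-2 sequences).
toy = {"seqA": "ACGTAC", "seqB": "acgtac", "seqC": "ACGTAT", "seqD": "ACGTAC"}
haplotypes = group_haplotypes(toy)
print(haplotypes)
```

Collapsing identical sequences first keeps a downstream network or tree small, since each haplotype is represented once and carries its member list along.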
Prevalence of Theileria and Babesia species in Tunisian sheep.
Rjeibi, Mohamed R; Darghouth, Mohamed A; Gharbi, Mohamed
2016-05-24
In this study, the prevalence of Theileria and Babesia species in sheep was assessed with Giemsa-stained blood smear examination and polymerase chain reaction to identify the different piroplasms in 270 sheep from three Tunisian bioclimatic zones (north, centre, and south). The overall infection prevalence by Babesia spp. and Theileria spp. in Giemsa-stained blood smears was 2.9% (8/270) and 4.8% (13/270) respectively. The molecular results showed that sheep were more often infected by Theileria ovis than Babesia ovis with an overall prevalence of 16.3% (44/270) and 7.8% (21/270) respectively (p = 0.01). The molecular prevalence of Babesia ovis was significantly higher in females than in males (p < 0.05). By locality, B. ovis was found exclusively in sheep from the centre of Tunisia (Kairouan) whereas Theileria ovis was found in all regions. Infections with T. ovis and B. ovis were confirmed by sequencing. The sequence of T. ovis in this study (accession number KM924442) falls into the same clade as T. ovis deposited in GenBank. The T. ovis amplicons (KM924442) showed 99%-100% identities with GenBank sequences. Moreover, comparison of the partial sequences of the 18S rRNA gene of B. ovis described in this study (KP670199) revealed 99.4% similarity with B. ovis recently reported in northern Tunisia from sheep and goats. Three nucleotides were different at positions 73 (A/T), 417 (A/T), and 420 (G/T). It also had 99% identity with B. ovis from Spain, Turkey and Iraq. The results suggest a high T. ovis prevalence in Tunisia with a decreasing north-south gradient. This could be correlated to the vector tick distribution.
Riojas, Marco A; McGough, Katya J; Rider-Riojas, Cristin J; Rastogi, Nalin; Hazbón, Manzour Hernando
2018-01-01
The species within the Mycobacterium tuberculosis Complex (MTBC) have undergone numerous taxonomic and nomenclatural changes, leaving the true structure of the MTBC in doubt. We used next-generation sequencing (NGS), digital DNA-DNA hybridization (dDDH), and average nucleotide identity (ANI) to investigate the relationship between these species. The type strains of Mycobacterium africanum, Mycobacterium bovis, Mycobacterium caprae, Mycobacterium microti and Mycobacterium pinnipedii were sequenced via NGS. Pairwise dDDH and ANI comparisons between these, previously sequenced MTBC type strain genomes (including 'Mycobacterium canettii', 'Mycobacterium mungi' and 'Mycobacterium orygis') and M. tuberculosis H37RvT were performed. Further, all available genome sequences in GenBank for species in or putatively in the MTBC were compared to H37RvT. Pairwise results indicated that all of the type strains of the species are extremely closely related to each other (dDDH: 91.2-99.2 %, ANI: 99.21-99.92 %), greatly exceeding the respective species delineation thresholds, thus indicating that they belong to the same species. Results from the GenBank genomes indicate that all the strains examined are within the circumscription of H37RvT (dDDH: 83.5-100 %). We, therefore, formally propose a union of the species of the MTBC as M. tuberculosis. M. africanum, M. bovis, M. caprae, M. microti and M. pinnipedii are reclassified as later heterotypic synonyms of M. tuberculosis. 'M. canettii', 'M. mungi', and 'M. orygis' are classified as strains of the species M. tuberculosis. We further recommend use of the infrasubspecific term 'variant' ('var.') and infrasubspecific designations that generally retain the historical nomenclature associated with the groups or otherwise convey such characteristics, e.g. M. tuberculosis var. bovis.
Chung, H Y; Choi, Y C; Park, H N
2015-05-18
We investigated the phylogenetic relationships between pig breeds, compared the genetic similarity between humans and pigs, and provided basic genetic information on Korean native pigs (KNPs), using genetic variants of the swine leukocyte antigen 3 (SLA-3) gene. Primers were based on sequences from GenBank (accession Nos. AF464010 and AF464009). Polymerase chain reaction amplified segments of approximately 1727 bp, which contained 1086 bp of coding regions and 641 bp of the 3'- and 5'-untranslated regions. Bacterial artificial chromosome clones of miniature pigs were used for sequencing the SLA-3 genomic region, which was 3114 bp in total length, including the coding (1086 bp) and non-coding (2028 bp) regions. Sequence analysis detected 53 single nucleotide polymorphisms (SNPs), based on a minor allele frequency greater than 0.01, which is low compared with other pig breeds, and the results suggest that there is low genetic variability in KNPs. Comparative analysis revealed that humans possess approximately three times more genetic variation than do pigs. Approximately 71% of SNPs in exons 2 and 3 were detected in KNPs, and exon 5 in humans is a highly polymorphic region. Newly identified SLA-3 sequences from KNPs were submitted to GenBank (accession Nos. DQ992512-DQ992518). Cluster analysis revealed that KNPs were grouped according to three major alleles: SLA-3*0502 (DQ992518), SLA-3*0302 (DQ992513 and DQ992516), and SLA-3*0303 (DQ992512, DQ992514, DQ992515, and DQ992517). Alignments revealed that humans have a relatively close genetic relationship with pigs and chimpanzees. The information provided by this study may be useful in KNP management.
Limited evidence of intercontinental dispersal of avian paramyxovirus serotype 4 by migratory birds.
Reeves, Andrew B; Poulson, Rebecca L; Muzyka, Denys; Ogawa, Haruko; Imai, Kunitoshi; Bui, Vuong Nghia; Hall, Jeffrey S; Pantin-Jackwood, Mary; Stallknecht, David E; Ramey, Andrew M
2016-06-01
Avian paramyxovirus serotype 4 (APMV-4) is a single-stranded RNA virus that has most often been isolated from waterfowl. Limited information has been reported regarding the prevalence, pathogenicity, and genetic diversity of APMV-4. To assess the intercontinental dispersal of this viral agent, we sequenced the fusion gene of 58 APMV-4 isolates collected in the United States, Japan and Ukraine and compared them to all available sequences on GenBank. With only a single exception, the phylogenetic clades of APMV-4 sequences were monophyletic with respect to their continents of origin (North America, Asia and Europe). Thus, we detected limited evidence for recent intercontinental dispersal of APMV-4 in this study. Published by Elsevier B.V. PMID:26925702
Montoya, Leticia; Bandala, Victor M; Garay-Serrano, Edith
2015-08-01
Two pure Alnus acuminata stands established in a montane forest in central Mexico (Puebla State) were monitored between 2010 and 2013 to confirm and recognize the ectomycorrhizal (EcM) systems of A. acuminata with Lactarius cuspidoaurantiacus and Lactarius herrerae, two recently described species. Through comparison of internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA from basidiomes and ectomycorrhizas sampled in the forest stands, we confirmed their ectomycorrhizal association. The phytobiont was corroborated by comparing ITS sequences obtained from EcM root tips and leaves collected in the study site with other sequences of A. acuminata available in GenBank. Detailed morphological and anatomical descriptions of the ectomycorrhizal systems are presented and complemented with photographs.
Molecular cloning and nucleotide sequence of CYP6BF1 from the diamondback moth, Plutella xylostella
Li, Hongshan; Dai, Huaguo; Wei, Hui
2005-01-01
A novel cDNA clone encoding a cytochrome P450 was screened from the insecticide-susceptible strain of Plutella xylostella (L.) (Lepidoptera: Yponomeutidae). The nucleotide sequence of the clone, designated CYP6BF1, was determined. This is the first full-length sequence of the CYP6 family from Plutella xylostella (L.). The cDNA is 1661 bp in length and contains an open reading frame from base pairs 26 to 1570, encoding a protein of 514 amino acid residues. It is similar to the other insect P450s in gene family 6, including CYP6AE1 from Depressaria pastinacella (46%). The GenBank accession number is AY971374. PMID:17119627
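The coordinates reported in the abstract above can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only; it assumes the reported open reading frame (positions 26 to 1570, 1-based and inclusive) includes the terminal stop codon, which the abstract does not state explicitly.

```python
# Values taken from the abstract.
cdna_length = 1661
orf_start, orf_end = 26, 1570      # 1-based, inclusive coordinates
reported_residues = 514

orf_nt = orf_end - orf_start + 1   # nucleotides in the open reading frame
codons = orf_nt // 3               # whole codons, valid only if orf_nt % 3 == 0

# Assumption: the last codon of the reported ORF is the stop codon,
# so the protein has one residue fewer than the number of codons.
amino_acids = codons - 1

five_prime_utr = orf_start - 1             # bases before the ORF
three_prime_utr = cdna_length - orf_end    # bases after the ORF

print(orf_nt, codons, amino_acids, five_prime_utr, three_prime_utr)
```

Under that assumption the ORF length is a multiple of three and the residue count agrees with the 514 amino acids reported, which is a quick sanity check worth running on any annotated record before reuse.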
Designing oligo libraries taking alternative splicing into account
NASA Astrophysics Data System (ADS)
Shoshan, Avi; Grebinskiy, Vladimir; Magen, Avner; Scolnicov, Ariel; Fink, Eyal; Lehavi, David; Wasserman, Alon
2001-06-01
We have designed sequences for DNA microarrays and oligo libraries, taking alternative splicing into account. Alternative splicing is a common phenomenon, occurring in more than 25% of the human genes. In many cases, different splice variants have different functions, are expressed in different tissues or may indicate different stages of disease. When designing sequences for DNA microarrays or oligo libraries, it is very important to take into account the sequence information of all the mRNA transcripts. Therefore, when a gene has more than one transcript (as a result of alternative splicing, alternative promoter sites or alternative poly-adenylation sites), it is very important to take all of them into account in the design. We have used the LEADS transcriptome prediction system to cluster and assemble the human sequences in GenBank and design optimal oligonucleotides for all the human genes with a known mRNA sequence based on the LEADS predictions.
Comparative analysis of the prion protein gene sequences in African lion.
Wu, Chang-De; Pang, Wan-Yong; Zhao, De-Ming
2006-10-01
The prion protein gene of the African lion (Panthera leo) was first cloned and screened for polymorphisms. The results suggest that the prion protein gene of the eight African lions examined is highly homogeneous. The amino acid sequences of the prion protein (PrP) of all samples tested were identical. Four single nucleotide polymorphisms (C42T, C81A, C420T, T600C) were found in the prion protein gene (Prnp) of the African lion, but none caused an amino acid substitution. Sequence analysis showed that the highest homology was observed with Felis catus (GenBank accession AF003087; 96.7%) and with sheep (GenBank accession M31313.1; 96.2%). Compared with all the mammalian prion protein sequences examined, the African lion prion protein sequence has three amino acid substitutions. The homology might in turn affect the potential intermolecular interactions critical for cross-species transmission of prion disease.
Krzeminska, Urszula; Wilson, Robyn; Rahman, Sadequr; Song, Beng Kah; Seneviratne, Sampath; Gan, Han Ming; Austin, Christopher M
2016-07-01
The complete mitochondrial genomes of two jungle crows (Corvus macrorhynchos) were sequenced. DNA was extracted from tissue samples obtained from shed feathers collected in the field in Sri Lanka and sequenced using the Illumina MiSeq Personal Sequencer. Jungle crow mitogenomes have a structural organization typical of the genus Corvus and are 16,927 bp and 17,066 bp in length, both comprising 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal subunit genes, and a non-coding control region. In addition, we complement already available house crow (Corvus splendens) mitogenome resources by sequencing an individual from Singapore. A phylogenetic tree constructed from Corvidae family mitogenome sequences available on GenBank is presented. We confirm the monophyly of the genus Corvus and propose to use complete mitogenome resources for further intra- and interspecies genetic studies.
[Molecular identification of astragali radix and its adulterants by ITS sequences].
Cui, Zhan-Hu; Li, Yue; Yuan, Qing-Jun; Zhou, Li-She; Li, Min-Hui
2012-12-01
The aim was to explore a new method for distinguishing Astragali Radix from its adulterants using ITS sequences. Thirteen samples of different Astragali Radix materials and 6 samples of adulterants (roots of Hedysarum polybotrys, Medicago sativa and Althaea rosea) were collected. The ITS region was amplified by PCR and sequenced unidirectionally. Interspecific K-2-P distances between Astragali Radix and its adulterants were calculated, and NJ and UPGMA trees were constructed with MEGA 4. ITS sequences were obtained from all 19 samples and registered in GenBank; their lengths were 646-650 bp for Astragali Radix, 664 bp for H. polybotrys, 659 bp for Medicago sativa and 728 bp for Althaea rosea. Phylogenetic trees reconstructed by NJ and UPGMA analysis of the ITS nucleotide sequences effectively distinguished Astragali Radix from its adulterants. The ITS sequence can therefore be used to distinguish Astragali Radix from its adulterants and is an efficient molecular marker for authenticating Astragali Radix.
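The K-2-P (Kimura two-parameter) distance mentioned in the abstract above has a closed form, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion. The Python sketch below computes it for one pair of aligned sequences. It is not MEGA's implementation; the function name is hypothetical, and skipping any site with a gap or ambiguous base (pairwise deletion) is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(d, 4))
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same site.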
Purkayastha, Anjan; Su, Jing; McGraw, John; Ditty, Susan E; Hadfield, Ted L; Seto, Jason; Russell, Kevin L; Tibbetts, Clark; Seto, Donald
2005-07-01
Vaccine strains of human adenovirus serotypes 4 and 7 (HAdV-4vac and HAdV-7vac) have been used successfully to prevent adenovirus-related acute respiratory disease outbreaks. The genomes of these two vaccine strains have been sequenced, annotated, and compared with their prototype equivalents with the goals of understanding their genomes for molecular diagnostics applications, vaccine redevelopment, and HAdV pathoepidemiology. These reference genomes are archived in GenBank as HAdV-4vac (35,994 bp; AY594254) and HAdV-7vac (35,240 bp; AY594256). Bioinformatics and comparative whole-genome analyses with their recently reported and archived prototype genomes reveal six mismatches and four insertions-deletions (indels) between the HAdV-4 prototype and vaccine strains, in contrast to the 611 mismatches and 130 indels between the HAdV-7 prototype and vaccine strains. Annotation reveals that the HAdV-4vac and HAdV-7vac genomes contain 51 and 50 coding units, respectively. Neither vaccine strain appears to be attenuated for virulence based on bioinformatics analyses. There is evidence of genome recombination, as the inverted terminal repeat of HAdV-4vac is initially identical to that of species C whereas the prototype is identical to species B1. These vaccine reference sequences yield unique genome signatures for molecular diagnostics. As a molecular forensics application, these references identify the circulating and problematic 1950s era field strains as the original HAdV-4 prototype and the Greider prototype, from which the vaccines are derived. Thus, they are useful for genomic comparisons to current epidemic and reemerging field strains, as well as leading to an understanding of pathoepidemiology among the human adenoviruses. PMID:16000418
Pfeiler, E; Markow, T A
2008-10-01
Mitochondrial DNA sequence data from the control region and 12S rRNA in leopard frogs from the Sierra El Aguaje of southern Sonora, Mexico, together with GenBank sequences, were used to infer taxonomic identity and provide phylogenetic hypotheses for relationships with other members of the Rana pipiens complex. We show that frogs from the Sierra El Aguaje belong to the Rana berlandieri subgroup, or Scurrilirana clade, of the R. pipiens group, and are most closely related to Rana magnaocularis from Nayarit, Mexico. We also provide further evidence that Rana magnaocularis and R. yavapaiensis are close relatives.
[Identification and polymorphism of pectinase genes PGU in the Saccharomyces bayanus complex].
Shalamitskiy, M Yu; Naumov, G I
2016-05-01
Pectinase (endo-polygalacturonase) is the key enzyme splitting plant pectin. The corresponding single gene PGU1 is documented for the yeast S. cerevisiae. On the basis of phylogenetic analysis of the PGU nucleotide sequences available in GenBank, a family of divergent PGU genes was found in the species complex S. bayanus: S. bayanus var. uvarum, S. eubayanus, and the hybrid taxon S. pastorianus. The PGU genes have different chromosomal localizations.
Fourment, Mathieu; Gibbs, Mark J
2008-02-05
Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.
Extension of the COG and arCOG databases by amino acid and nucleotide sequences
Meereis, Florian; Kaufmann, Michael
2008-01-01
Background The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion NUCOCOG offers the possibility to analyze any sequence related property in the context of the COG and arCOG framework simply by using script languages such as PERL applied to a large but single XML document. PMID:19014535
MannDB: A microbial annotation database for protein characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, C; Lam, M; Smith, J
2006-05-19
MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from GenBank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the GenBank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.
MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports.
[A surveillance study on CRISPR/Cas molecular biomarker in Escherichia coli].
Liang, W J; Zhang, R G; Duan, G C; Hong, L J; Zhang, B; Xi, Y L; Yang, H Y; Chen, S Y; Lou, T Y; Zhao, Y X
2016-08-10
A new method using CRISPR/Cas (clustered regularly interspaced short palindromic repeats-cas) as a molecular biomarker in Escherichia (E.) coli was developed for use in surveillance programs. CRISPR/Cas sequences were identified by BLAST in 135 E. coli strains with complete sequences and 203 strains with whole genome shotgun sequences in GenBank, and by PCR in 361 laboratory strains of E. coli (including 38 strains of E. coli O157:H7), and were analyzed with CRISPR Finder. Spacers were compared with DNAMAN, and phylogenetic trees of the cas genes were constructed with Clustal X and MEGA 5.1. A descriptive method based on the position of CRISPR/cas in E. coli was developed. CRISPR1 was detected in 77.04%, 100.00% and 75.62%, CRISPR2 in 74.81%, 100.00% and 92.24%, and CRISPR3 and CRISPR4 in 11.85%, 0 and 1.39% of the 135 strains with complete sequences, the 203 strains with whole genome shotgun sequences and the 361 laboratory strains, respectively. One GenBank strain with a whole genome sequence and 2 strains from our laboratory contained four CRISPR loci. Another E. coli strain had an insertion sequence downstream of the non-cas CRISPR1. Among all 699 E. coli strains, unique CRISPRs were found in 8 strains of O55:H7, 180 strains of O157:H7, 8 strains of O157:HNM, 40 strains of O104:H4 and 4 strains of O145:H28. The phylogenetic tree of cas could be divided into two groups, type I-E and type I-F. CRISPR/Cas might serve as a valuable molecular biomarker in epidemiological surveillance studies to identify highly virulent or new strains of E. coli. Phages might be related to the loss or acquisition of spacers.
Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S
2017-02-01
Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily.
The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences.
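The self-identification criterion described in this abstract can be illustrated with a small sketch. This is not the MISST implementation: the function names, the toy database and the label-based search are all hypothetical stand-ins, assuming only that a group passes when a search seeded with the group returns exactly its own members.

```python
from typing import Callable, Iterable, Set

def is_self_identifying(group: Set[str],
                        search: Callable[[Set[str]], Iterable[str]]) -> bool:
    """Return True when a search seeded with the group recovers exactly its members."""
    return set(search(group)) == group

# Toy stand-in for a sequence database (hypothetical; not GenBank data).
TOY_DB = {"seqA": "Prx1", "seqB": "Prx1", "seqC": "Prx6"}

def toy_search(query: Set[str]) -> Set[str]:
    """Return every database entry sharing a family label with any query member."""
    labels = {TOY_DB[s] for s in query}
    return {s for s, label in TOY_DB.items() if label in labels}

whole_family = is_self_identifying({"seqA", "seqB"}, toy_search)        # finds itself only
partial_family = is_self_identifying({"seqA"}, toy_search)              # also finds seqB
merged_families = is_self_identifying({"seqA", "seqB", "seqC"}, toy_search)
```

In this toy setting the merged Prx1 and Prx6 group also passes, which shows that such a check alone cannot decide when to split a group; the abstract attributes that divisive step to functional site features.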
Babbitt, Patricia C.; Ferrin, Thomas E.
2017-01-01
Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially—MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method’s novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. 
The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences. PMID:28187133
Suwannasai, Nuttika; Martín, María P; Phosri, Cherdchai; Sihanonth, Prakitsin; Whalley, Anthony J S; Spouge, John L
2013-01-01
Thailand, a part of the Indo-Burma biodiversity hotspot, has many endemic animals and plants. Some of its fungal species are difficult to recognize and separate, complicating assessments of biodiversity. We assessed species diversity within the fungal genera Annulohypoxylon and Hypoxylon, which produce biologically active and potentially therapeutic compounds, by applying classical taxonomic methods to 552 teleomorphs collected from across Thailand. Using probability of correct identification (PCI), we also assessed the efficacy of automated species identification with a fungal barcode marker, ITS, in the model system of Annulohypoxylon and Hypoxylon. The 552 teleomorphs yielded 137 ITS sequences; in addition, we examined 128 GenBank ITS sequences, to assess biases in evaluating a DNA barcode with GenBank data. The use of multiple sequence alignment in a barcode database like BOLD raises some concerns about non-protein barcode markers like ITS, so we also compared species identification using different alignment methods. Our results suggest the following. (1) Multiple sequence alignment of ITS sequences is competitive with pairwise alignment when identifying species, so BOLD should be able to preserve its present bioinformatics workflow for species identification for ITS, and possibly therefore with at least some other non-protein barcode markers. (2) Automated species identification is insensitive to a specific choice of evolutionary distance, contributing to resolution of a current debate in DNA barcoding. (3) Statistical methods are available to address, at least partially, the possibility of expert misidentification of species. Phylogenetic trees discovered a cryptic species and strongly supported monophyletic clades for many Annulohypoxylon and Hypoxylon species, suggesting that ITS can contribute usefully to a barcode for these fungi. 
The PCIs here, derived solely from ITS, suggest that a fungal barcode will require secondary markers in Annulohypoxylon and Hypoxylon, however. The URL http://tinyurl.com/spouge-barcode contains computer programs and other supplementary material relevant to this article.
Hoffmann, B.; Freuling, C. M.; Wakeley, P. R.; Rasmussen, T. B.; Leech, S.; Fooks, A. R.; Beer, M.; Müller, T.
2010-01-01
To improve the diagnosis of classical rabies virus with molecular methods, a validated, ready-to-use, real-time reverse transcription-PCR (RT-PCR) assay was developed. In a first step, primers and 6-carboxyfluorescein-labeled TaqMan probes specific for rabies virus were selected from the consensus sequence of the nucleoprotein gene of 203 different rabies virus sequences derived from GenBank. The selected primer-probe combination was highly specific and sensitive. During validation using a sample set of rabies virus strains from the virus archives of the Friedrich-Loeffler-Institut (FLI; Germany), the Veterinary Laboratories Agency (VLA; United Kingdom), and the DTU National Veterinary Institute (Lindholm, Denmark), covering the global diversity of rabies virus lineages, it was shown that both the newly developed assay and a previously described one had some detection failures. This was overcome by a combined assay that detected all samples as positive. In addition, the introduction of labeled positive controls (LPC) increased the diagnostic safety of the single as well as the combined assay. Based on the newly developed, alternative assay for the detection of rabies virus and the application of LPCs, an improved diagnostic sensitivity and reliability can be ascertained for postmortem and intra vitam real-time RT-PCR analyses in rabies reference laboratories. PMID:20739489
Hoffmann, B; Freuling, C M; Wakeley, P R; Rasmussen, T B; Leech, S; Fooks, A R; Beer, M; Müller, T
2010-11-01
To improve the diagnosis of classical rabies virus with molecular methods, a validated, ready-to-use, real-time reverse transcription-PCR (RT-PCR) assay was developed. In a first step, primers and 6-carboxyfluorescein-labeled TaqMan probes specific for rabies virus were selected from the consensus sequence of the nucleoprotein gene of 203 different rabies virus sequences derived from GenBank. The selected primer-probe combination was highly specific and sensitive. During validation using a sample set of rabies virus strains from the virus archives of the Friedrich-Loeffler-Institut (FLI; Germany), the Veterinary Laboratories Agency (VLA; United Kingdom), and the DTU National Veterinary Institute (Lindholm, Denmark), covering the global diversity of rabies virus lineages, it was shown that both the newly developed assay and a previously described one had some detection failures. This was overcome by a combined assay that detected all samples as positive. In addition, the introduction of labeled positive controls (LPC) increased the diagnostic safety of the single as well as the combined assay. Based on the newly developed, alternative assay for the detection of rabies virus and the application of LPCs, an improved diagnostic sensitivity and reliability can be ascertained for postmortem and intra vitam real-time RT-PCR analyses in rabies reference laboratories.
Li, Yinsheng; Zhao, Chun; Lu, Xiaoxu; Ai, Xiaojie; Qiu, Jiangping
2018-04-15
Cytochrome P450 (CYP450) enzymes are a family of hemoproteins primarily responsible for detoxification functions. Earthworms have been used as a bioindicator of soil pollution in numerous studies, but no CYP450 gene has so far been cloned. RT-PCR and RACE-PCR were employed to construct and sequence the CYP450 gene DNA from the extracted mRNA in the earthworm Eisenia fetida. The cloned gene (EW1) has an open reading frame of 477 bp. The 3'-terminal region contained both the consensus and the signature sequences characteristic of CYP450. It was closely related to the CYP450 gene from the flatworm Opisthorchis felineus, with 87% homology. The predicted structure of the putative protein was 97% homologous to human CYP450 family 27. This gene has been deposited in GenBank (accession no. KM881474). Earthworms (E. fetida) were then exposed to 1, 10, 100, and 500 mg kg⁻¹ enrofloxacin in soils to explore the mRNA expression by real-time qPCR. The effect of enrofloxacin on mRNA expression levels of EW1 exhibited a marked hormesis pattern across the enrofloxacin dose range tested. This is believed to be the first reported CYP450 gene in earthworms, with reference value for molecular studies on detoxification processes in earthworms. Copyright © 2017 Elsevier Inc. All rights reserved.
NCBI Bookshelf: books and documents in life sciences and health care
Hoeppner, Marilu A.
2013-01-01
Bookshelf (http://www.ncbi.nlm.nih.gov/books/) is a full-text electronic literature resource of books and documents in life sciences and health care at the National Center for Biotechnology Information (NCBI). Created in 1999 with a single book as an encyclopedic reference for resources such as PubMed and GenBank, it has grown to its current size of >1300 titles. Unlike other NCBI databases, such as GenBank and Gene, which have a strict data structure, books come in all forms; they are diverse in publication types, formats, sizes and authoring models. The Bookshelf data format is XML tagged in the NCBI Book DTD (Document Type Definition), modeled after the National Library of Medicine journal article DTDs. The book DTD has been used for systematically tagging the diverse data formats of books, a move that has set the foundation for the growth of this resource. Books at NCBI followed the route of journal articles in the PubMed Central project, using the PubMed Central architectural framework, workflows and processes. Through integration with other NCBI molecular databases, books at NCBI can be used to provide reference information for biological data and facilitate its discovery. This article describes Bookshelf at NCBI: its growth, data handling and retrieval and integration with molecular databases. PMID:23203889
NCBI Bookshelf: books and documents in life sciences and health care.
Hoeppner, Marilu A
2013-01-01
Bookshelf (http://www.ncbi.nlm.nih.gov/books/) is a full-text electronic literature resource of books and documents in life sciences and health care at the National Center for Biotechnology Information (NCBI). Created in 1999 with a single book as an encyclopedic reference for resources such as PubMed and GenBank, it has grown to its current size of >1300 titles. Unlike other NCBI databases, such as GenBank and Gene, which have a strict data structure, books come in all forms; they are diverse in publication types, formats, sizes and authoring models. The Bookshelf data format is XML tagged in the NCBI Book DTD (Document Type Definition), modeled after the National Library of Medicine journal article DTDs. The book DTD has been used for systematically tagging the diverse data formats of books, a move that has set the foundation for the growth of this resource. Books at NCBI followed the route of journal articles in the PubMed Central project, using the PubMed Central architectural framework, workflows and processes. Through integration with other NCBI molecular databases, books at NCBI can be used to provide reference information for biological data and facilitate its discovery. This article describes Bookshelf at NCBI: its growth, data handling and retrieval and integration with molecular databases.
Boehm; Gibson; Lubzens
2000-01-01
This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.
Creager, Hannah M; Becker, Ericka A; Sandman, Kelly K; Karl, Julie A; Lank, Simon M; Bimber, Benjamin N; Wiseman, Roger W; Hughes, Austin L; O'Connor, Shelby L; O'Connor, David H
2011-09-01
In recent years, the use of cynomolgus macaques in biomedical research has increased greatly. However, with the exception of the Mauritian population, knowledge of the MHC class II genetics of the species remains limited. Here, using cDNA cloning and Sanger sequencing, we identified 127 full-length MHC class II alleles in a group of 12 Indonesian and 12 Vietnamese cynomolgus macaques. Forty-two of these were completely novel to cynomolgus macaques, while 61 extended the sequence of previously identified alleles from partial to full length. This more than doubles the number of full-length cynomolgus macaque MHC class II alleles available in GenBank, significantly expanding the allele library for the species and laying the groundwork for future evolutionary and functional studies.
Information resources at the National Center for Biotechnology Information.
Woodsmall, R M; Benson, D A
1993-01-01
The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequence databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583
Co-circulation of Peste-des-Petits-Ruminants Virus Asian lineage IV with Lineage II in Nigeria.
Woma, T Y; Adombi, C M; Yu, D; Qasim, A M M; Sabi, A A; Maurice, N A; Olaiya, O D; Loitsch, A; Bailey, D; Shamaki, D; Dundon, W G; Quan, M
2016-06-01
Peste-des-petits-ruminants (PPR), a major small ruminant transboundary animal disease, is endemic in Nigeria. Strains of the causal agent, peste-des-petits-ruminants virus (PPRV), have been differentiated into four genetically distinct lineages based on the partial sequence of the virus nucleoprotein (N) or fusion (F) genes. Peste-des-petits-ruminants virus strains that were identified initially in Africa were grouped into lineages I, II and III and viruses from Asia were classified as lineage IV and referred to as the Asian lineage. Many recent reports indicate that the Asian lineage is now also present in Africa. With this in mind, this study was conducted to reassess the epidemiology of PPRV in Nigeria. A total of 140 clinical samples from 16 sheep and 63 goats with symptoms suggestive of PPR were collected from different states of Nigeria during a four-year period (2010-2013). They were analysed by the amplification of fragments of the N gene. Results for 33 (42%) animals were positive. The phylogenetic analysis of the N gene sequences with those available in GenBank showed that viruses that were detected belong to both lineage II and IV. Based on an analysis of the N gene sequences, the lineage IV isolates grouped into two clades, one being predominant in the north-eastern part of the country and the other found primarily in the southern regions of the country. This study reports the presence of PPRV Asian lineage IV in Nigeria for the first time. © 2016 Blackwell Verlag GmbH.
Garcia, Daniel Fantozzi; Oliveira, Ticiano G.; Molfetta, Greice A.; Garcia, Luiz V.; Ferreira, Cristiane A.; Marques, Adriana A.; Silva, Wilson Araujo
2011-01-01
Butyrylcholinesterase (BChE) is a plasma enzyme that catalyzes the hydrolysis of choline esters, including the muscle relaxants succinylcholine and mivacurium. Patients who present sustained neuromuscular blockade after using succinylcholine usually carry BChE variants with reduced enzyme activity or an acquired BChE deficiency. We report here the molecular basis of the BCHE gene underlying the slow catabolism of succinylcholine in a patient who underwent endoscopic nasal surgery. We measured the enzyme activity of BChE and extracted genomic DNA in order to study the promoter region and all exons of the BCHE gene of the patient, her parents and siblings. PCR products were sequenced and compared with reference sequences from GenBank. We detected that the patient and one of her brothers have two homozygous mutations: nt1615 GCA > ACA (Ala539Thr), responsible for the K variant, and nt209 GAT > GGT (Asp70Gly), which produces the atypical variant A. Her parents and two of her brothers were found to be heterozygous for the AK allele, and another brother is homozygous for the normal allele. Sequence analysis of exon 1 including 5′UTR showed that the proband and her brother are homozygous for –116GG. The AK/AK genotype is considered the most frequent in hereditary hypocholinesterasemia (44%). This work demonstrates the importance of defining the phenotype and genotype of the BCHE gene in patients who are subjected to neuromuscular block by succinylcholine, because of the risk of prolonged neuromuscular paralysis. PMID:21637541
Khalil, Farghama; Yueyu, Xu; Naiyan, Xiao; Di, Liu; Tayyab, Muhammad; Hengbo, Wang; Islam, Waqar; Rauf, Saeed; Pinghua, Chen
2018-05-04
Sugarcane is an essential crop for sugar and biofuel. Globally, its production is severely affected by sugarcane yellow leaf disease (SCYLD), caused by Sugarcane Yellow Leaf Virus (SCYLV). Many aphid vectors are involved in the spread of the disease, which reduces the effectiveness of cultural and chemical management. Empirical plant breeding methods, such as introgression from wild and cultivated germplasm, have been impossible or at least challenging because resistance is absent in cultivated and wild sugarcane germplasm. RNA interference (RNAi) transformation is an effective method to create virus-resistant varieties. Nevertheless, limited progress has been made because of the lack of a comprehensive RNAi-based research program on SCYLV. To assess progress and to propose future strategies for using the RNAi technique against SCYLV, genome-wide consensus sequences of SCYLV were analyzed using GenBank. The coverage rate of every consensus sequence across SCYLV isolates was calculated to evaluate its practicability. Our analysis showed that a single consensus sequence from SCYLV would not work well for RNAi-based sugarcane breeding programs. This may be due to the high mutation rate and continuous recombination within and between various viral strains. An alternative multi-target RNAi strategy is suggested to combat several strains of the virus and to reduce silencing escape. Multi-target small interfering RNAs (siRNAs) can be used together to construct an RNAi plant expression plasmid and to transform sugarcane tissues to develop new sugarcane varieties resistant to SCYLV. Copyright © 2018 Elsevier Ltd. All rights reserved.
Li, Lin-Feng; Häkkinen, Markku; Yuan, Yong-Ming; Hao, Gang; Ge, Xue-Jun
2010-10-01
Musaceae is a small paleotropical family. Three genera have been recognised within this family although the generic delimitations remain controversial. Most species of the family (around 65 species) have been placed under the genus Musa and its infrageneric classification has long been disputed. In this study, we obtained nuclear ribosomal ITS and chloroplast (atpB-rbcL, rps16, and trnL-F) DNA sequences of 36 species (42 accessions of ingroups representing three genera) together with 10 accessions of ingroups retrieved from the GenBank database and 4 accessions of outgroups, to construct the phylogeny of the family, with special reference to the infrageneric classification of the genus Musa. Our phylogenetic analyses elaborated previous results in supporting the monophyly of the family and suggested that Musella and Ensete may be congeneric or at least closely related, but refuted the previous infrageneric classification of Musa. None of the five sections of Musa previously defined based on morphology was recovered as a monophyletic group in the molecular phylogeny. Two infrageneric clades were identified, which corresponded well to the basic chromosome numbers of x=11 and 10/9/7, respectively: the former clade comprises species from the sections Musa and Rhodochlamys while the latter contains the sections Callimusa, Australimusa, and Ingentimusa. Copyright 2010 Elsevier Inc. All rights reserved.
Karaca, Gürsel; Jonathan, Rinita; Paul, Bernard
2009-06-01
Pythium stipitatum is a slow-growing oomycete and has been isolated from soil samples and plant materials from France, Tunisia, Turkey and India. Its morphological characteristics are reminiscent of those of Pythium ramificatum, discovered in Algeria by the corresponding author. Unfortunately, the Algerian isolate was not deposited in any culture collection and ultimately got lost. Those were the days when molecular description of fungi was not in fashion; hence, no molecular characteristics of the Algerian isolates were deposited in GenBank. Moreover, its coralloid antheridial branches made it an easy prey to be considered synonymous with Pythium minus. Because there are no living strains of P. ramificatum, and no sequence in GenBank, it is being treated as 'nomen invalidum' here. However, we have now isolated the same type of oomycete from four different countries and we have sufficient evidence, both molecular and morphological, to describe it as a new species, quite different from P. minus. In this article, we give the morphological and molecular evidence to separate it as a distinct species, P. stipitatum, belonging to 'Clade E' of the genus Pythium. The taxonomic description of this oomycete, its comparison with related species, and the sequence of the internal transcribed spacer region of its rRNA gene are discussed here.
High fungal diversity and abundance recovered in the deep-sea sediments of the Pacific Ocean.
Xu, Wei; Pang, Ka-Lai; Luo, Zhu-Hua
2014-11-01
The presence and ecological significance of bacteria and archaea in deep-sea environments are well recognized, but eukaryotic microorganisms, such as fungi, have rarely been reported. The present study investigated the composition and abundance of the fungal community in the deep-sea sediments of the Pacific Ocean. A total of 1,947 internal transcribed spacer (ITS) regions of fungal rRNA gene clones were recovered from five sediment samples from the Pacific Ocean (water depths ranging from 5,017 to 6,986 m) using three different PCR primer sets. There were 16, 17, and 15 different operational taxonomic units (OTUs) identified from fungal-universal, Ascomycota-, and Basidiomycota-specific clone libraries, respectively. The majority of the recovered sequences belonged to diverse phylotypes of Ascomycota (25 phylotypes) and Basidiomycota (18 phylotypes). In total, the multiple-primer approach recovered 27 phylotypes that showed low similarities (≤97%) with available fungal sequences in GenBank, suggesting possible new fungal taxa occurring in deep-sea environments or taxa not represented in GenBank. Our results also recovered high fungal LSU rRNA gene copy numbers (3.52 × 10^6 to 5.23 × 10^7 copies/g wet sediment) from the Pacific Ocean sediment samples, suggesting that fungi might be involved in important ecological functions in deep-sea environments.
Al-Qurainy, F; Khan, S; Nadeem, M; Tarroum, M; Alaklabi, A
2013-03-11
The rare and endangered plants of any country are important genetic resources that often require urgent conservation measures. Assessment of phylogenetic relationships and evaluation of genetic diversity are very important prior to implementation of conservation strategies for saving rare and endangered plant species. We used internal transcribed spacer sequences of nuclear ribosomal DNA to evaluate sequence identity against the available taxa in the GenBank database using the Basic Local Alignment Search Tool (BLAST). Two rare plant species clustered clearly with congeners: Heliotropium strigosum with H. pilosum (98% branch support) and Pancratium tortuosum with P. tenuifolium (61% branch support). However, some species, viz. Scadoxus multiflorus, Commiphora myrrha and Senecio hadiensis, showed close relationships with more than one species. We conclude that nuclear ribosomal internal transcribed spacer sequences are useful markers for phylogenetic study of these rare plant species in Saudi Arabia.
de Miranda, R L; O'Dwyer, L H; de Castro, J R; Metzger, B; Rubini, A S; Mundim, A V; Eyal, O; Talmi-Frank, D; Cury, M C; Baneth, G
2014-10-01
The objective of this survey was to investigate the prevalence of Hepatozoon infection in dogs in the rural and urban areas of Uberlândia, Brazil by PCR and molecular characterization. DNA was obtained from blood samples collected from 346 local dogs from both genders and various ages. Seventeen PCR products from positive blood samples of urban dogs and 13 from the rural dogs were sequenced. Partial sequences of the 18S rRNA gene indicated that all 30 dogs were infected with Hepatozoon canis similar in sequence to H. canis from southern Europe. Four local dog sequences were submitted to GenBank (accessions JN835188; KF692038; KF692039; KF692040). This study indicates that H. canis is the cause of canine hepatozoonosis in Uberlândia and that infection is similarly widespread in rural and urban dogs. Copyright © 2014. Published by Elsevier Ltd.
Costa, Maria Eduarda S M; Oliveira, Claudio Bruno S; Andrade, Joelma Maria de A; Medeiros, Thatiany A; Neto, Valter F Andrade; Lanza, Daniel C F
2016-07-01
Toxoplasma gondii is a widespread parasite able to infect virtually any nucleated cell of warm-blooded hosts. In some cases, T. gondii detection using already developed PCR primers can be inefficient in routine laboratory tests, especially for detecting atypical strains. Here we report a new nested-PCR protocol able to detect virtually all T. gondii isolates. Analyzing 685 sequences available in GenBank, we determined that GRA7 is one of the most conserved genes of the T. gondii genome. Based on an alignment of 85 GRA7 sequences, new primer sets that anneal in the highly conserved regions of this gene were designed. The new GRA7 nested-PCR assay provides sensitivity and specificity equal to or greater than the gold standard PCR assays for T. gondii detection, which amplify the B1 sequence or the repetitive 529 bp element. Copyright © 2016 Elsevier B.V. All rights reserved.
Identification of tissue-embedded ascarid larvae by ribosomal DNA sequencing.
Ishiwata, Kenji; Shinohara, Akio; Yagi, Kinpei; Horii, Yoichiro; Tsuchiya, Kimiyuki; Nawa, Yukifumi
2004-01-01
Polymerase chain reaction (PCR) was applied to identify tissue-embedded ascarid nematode larvae. Two sequences of the internal transcribed spacer (ITS) regions of ribosomal DNA (rDNA), ITS1 and ITS2, of the ascarid parasites were amplified and compared with those of ascarid-nematodes registered in a DNA database (GenBank). The ITS sequences of the PCR products obtained from the ascarid parasite specimen in our laboratory were compatible with those of registered adult Ascaris and Toxocara parasites. PCR amplification of the ITS regions was sensitive enough to detect a single larva of Ascaris suum mixed with porcine liver tissue. Using this method, ascarid larvae embedded in the liver of a naturally infected turkey were identified as Toxocara canis. These results suggest that even a single larva embedded in tissues from patients with larva migrans could be identified by sequencing the ITS regions.
Plant DNA sequences from feces: potential means for assessing diets of wild primates.
Bradley, Brenda J; Stiller, Mathias; Doran-Sheehy, Diane M; Harris, Tara; Chapman, Colin A; Vigilant, Linda; Poinar, Hendrik
2007-06-01
Analysis of plant DNA in feces provides a promising, yet largely unexplored, means of documenting the diets of elusive primates. Here we demonstrate the promise and pitfalls of this approach using DNA extracted from fecal samples of wild western gorillas (Gorilla gorilla) and black and white colobus monkeys (Colobus guereza). From these DNA extracts we amplified, cloned, and sequenced small segments of chloroplast DNA (part of the rbcL gene) and plant nuclear DNA (ITS-2). The obtained sequences were compared to sequences generated from known plant samples and to those in GenBank to identify plant taxa in the feces. With further optimization, this method could provide a basic evaluation of minimum primate dietary diversity even when knowledge of local flora is limited. This approach may find application in studies characterizing the diets of poorly-known, unhabituated primate species or assaying consumer-resource relationships in an ecosystem. (c) 2007 Wiley-Liss, Inc.
Compilation of DNA sequences of Escherichia coli (update 1991)
Kröger, Manfred; Wahl, Ralf; Rice, Peter
1991-01-01
We have compiled the DNA sequence data for E. coli available from the GenBank and EMBL data libraries and, over a period of several years, independently from the literature. This is the third listing, replacing the former listing and enlarging it by roughly one fifth. However, to save space, this printed version contains DNA sequence information only. The complete compilation is now available in machine-readable form from the EMBL data library (ECD release 6). After deletion of all detected overlaps, a total of 1,492,282 individual bp had been determined by the beginning of 1991. This corresponds to 31.62% of the entire E. coli chromosome, which consists of about 4,720 kbp. This figure may actually be higher by an extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for statistical purposes only. PMID:2041799
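The coverage figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two numbers given in the abstract; the function and variable names are illustrative and do not come from the original paper.

```python
# Figures quoted in the abstract above.
sequenced_bp = 1_492_282        # non-overlapping bp determined by early 1991
chromosome_bp = 4_720 * 1_000   # "about 4,720 kbp", converted to bp


def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * covered / total


pct = coverage_percent(sequenced_bp, chromosome_bp)
print(f"{pct:.2f}% of the chromosome sequenced")
```

Because the chromosome size is itself approximate ("about 4,720 kbp"), the result is best read as an estimate and not as an exact proportion.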
The EMBL nucleotide sequence database
Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann
2001-01-01
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039
Lotte, Romain; Lotte, Laurène; Degand, Nicolas; Gaudart, Alice; Gabriel, Sylvie; Ben H'dech, Mouna; Blois, Mathilde; Rinaldi, Jean-Paul; Ruimy, Raymond
2015-06-23
Helcococcus kunzii is a facultative anaerobic bacterium that was first described by Collins et al. in 1993 and was initially considered a commensal of the human skin, in particular of the lower extremities. Human infections caused by H. kunzii remain rare, with only a few cases published in the PubMed database. Nevertheless, recent reports indicate that this microorganism should be considered an opportunistic pathogen that can be involved in severe infections in humans. To the best of our knowledge, we describe here the first known case of infectious endocarditis caused by H. kunzii. A 79-year-old man with a severe polyvascular medical history attended the emergency ward for rapid deterioration of his general state of health. After physical examination and paraclinical investigations, the diagnosis of infectious endocarditis of the native mitral valve caused by Helcococcus kunzii was established based on the Duke criteria. MALDI-TOF mass spectrometry and 16S rDNA sequencing allowed accurate identification of Helcococcus kunzii to the species level. The patient was successfully treated by a medico-surgical approach. The treatment consisted of intravenous amoxicillin for four weeks and mitral valve replacement with a bioprosthetic valve. After an in-depth review of the patient's medical file, the origin of infection remained unknown. However, a cutaneous portal of entry cannot be excluded, as the patient and his general practitioner reported chronic ulcerations of both feet. We describe here the first case of endocarditis caused by H. kunzii in an elderly patient with polyvascular disease. This report, along with previous data found in the literature, emphasizes the invasive potential of this bacterial species as an opportunistic pathogen, in particular for patients with polyvascular diseases. MALDI-TOF mass spectrometry and 16S rDNA sequencing are reliable tools for H. kunzii identification. 
In this work we also sequenced the H. kunzii type strain 103932T CIP and deposited the sequence in GenBank under accession number KM403387. We noticed a 14-base difference between our sequence and the original sequence deposited by Collins et al. under GenBank accession number X69837. Hopefully, the spread of next-generation sequencing tools will lead to a more accurate classification of clinical strains.
DNA microarray-based PCR ribotyping of Clostridium difficile.
Schneeberg, Alexander; Ehricht, Ralf; Slickers, Peter; Baier, Vico; Neubauer, Heinrich; Zimmermann, Stefan; Rabold, Denise; Lübke-Becker, Antina; Seyboldt, Christian
2015-02-01
This study presents a DNA microarray-based assay for fast and simple PCR ribotyping of Clostridium difficile strains. Hybridization probes were designed to query the modularly structured intergenic spacer region (ISR), which is also the template for conventional and PCR ribotyping with subsequent capillary gel electrophoresis (seq-PCR) ribotyping. The probes were derived from sequences available in GenBank as well as from theoretical ISR module combinations. A database of reference hybridization patterns was set up from a collection of 142 well-characterized C. difficile isolates representing 48 seq-PCR ribotypes. The reference hybridization patterns calculated by the arithmetic mean were compared using a similarity matrix analysis. The 48 investigated seq-PCR ribotypes revealed 27 array profiles that were clearly distinguishable. The most frequent human-pathogenic ribotypes 001, 014/020, 027, and 078/126 were discriminated by the microarray. C. difficile strains related to 078/126 (033, 045/FLI01, 078, 126, 126/FLI01, 413, 413/FLI01, 598, 620, 652, and 660) and 014/020 (014, 020, and 449) showed similar hybridization patterns, confirming their genetic relatedness, which was previously reported. A panel of 50 C. difficile field isolates was tested by seq-PCR ribotyping and the DNA microarray-based assay in parallel. Taking into account that the current version of the microarray does not discriminate some closely related seq-PCR ribotypes, all isolates were typed correctly. Moreover, seq-PCR ribotypes without reference profiles available in the database (ribotype 009 and 5 new types) were correctly recognized as new ribotypes, confirming the performance and expansion potential of the microarray. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
The NCBI BioCollections Database
Sharma, Shobha; Ciufo, Stacy; Starchenko, Elena; Darji, Dakshesh; Chlumsky, Larry; Karsch-Mizrachi, Ilene
2018-01-01
The rapidly growing set of GenBank submissions includes sequences that are derived from vouchered specimens. These are associated with culture collections, museums, herbaria and other natural history collections, both living and preserved. Correct identification of the specimens studied, along with a method to associate the sample with its institution, is critical to the outcome of related studies and analyses. The National Center for Biotechnology Information BioCollections Database was established to allow the association of specimen vouchers and related sequence records to their home institutions. This process also allows cross-linking from the home institution for quick identification of all records originating from each collection. Database URL: https://www.ncbi.nlm.nih.gov/biocollections PMID:29688360
NASA Technical Reports Server (NTRS)
La Duc, Myron Thomas (Inventor); Venkateswaran, Kasthuri (Inventor)
2007-01-01
The present invention relates to the discovery and isolation of a biologically pure culture of a Bacillus odysseyi isolate with high adherence and sterilization-resistant properties. B. odysseyi is a round-spore-forming Bacillus species that produces an exosporium. This novel species has been characterized on the basis of phenotypic traits, 16S rDNA sequence analysis and DNA-DNA hybridization. According to the results of these analyses, this strain belongs to the genus Bacillus and the type strain is 34hs-1T (= ATCC PTA-4993T = NRRL B-30641T = NBRC 100172T). The GenBank accession number for the 16S rDNA sequence of strain 34hs-1T is AF526913.
Lin, Shu-Xiang; Wang, Wei; Guo, Wei; Yang, Hong-Jiang; Ma, Bai-Cheng; Fang, Yu-Lian; Xu, Yong-Sheng
2017-07-01
To investigate the relationship of KI polyomavirus (KIPyV) and WU polyomavirus (WUPyV) with acute respiratory infection in children in Tianjin, China. A total of 3 730 nasopharyngeal secretions were collected from hospitalized children with acute respiratory infection in Tianjin Children's Hospital from January 2011 to December 2013. Viral nucleic acid was extracted, and virus infection (KIPyV and WUPyV) was determined by PCR. Some KIPyV-positive and WUPyV-positive PCR products were subjected to sequencing. Sequencing results were aligned with the known gene sequences of KIPyV and WUPyV to construct a phylogenetic tree. Amplified VP1 fragments of KIPyV were inserted into the cloning vector (PUCm-T) transformed into E. coli competent cells. Positive clones were identified by PCR and sequencing. The nucleotide sequences were submitted to GenBank. In addition, another seven common respiratory viruses in all samples were detected by direct immunofluorescence assay. In the 3 730 specimens, the KIPyV-positive rate was 12.14% (453/3 730) and the WUPyV-positive rate was 1.69% (63/3 730). The mean infection rate of KIPyV was significantly higher in June and July, while the mean infection rate of WUPyV peaked in February and March. Most of the KIPyV-positive or WUPyV-positive children were <3 years. The co-infections with KIPyV, WUPyV, and other respiratory viruses were observed in the children. The co-infection rate was 2.31% (86/3 730) and there were nine cases of co-infections with WUPyV and KIPyV. Thirty-five KIPyV-positive and twelve WUPyV-positive PCR products were sequenced and the alignment analysis showed that they had high homology with the known sequences (94%-100% vs 95%-100%). The VP1 gene sequences obtained from two KIPyV strains in this study were recorded in GenBank with the accession numbers of KY465925 and KY465926. 
For some children with acute respiratory infection in Tianjin, China, the acute respiratory infection may be associated with KIPyV and WUPyV infections. KIPyV infection is common in summer, and WUPyV infection in spring. The epidemic strains in Tianjin have a high homology with those in other regions.
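The positivity rates reported in this abstract follow directly from the stated counts. As a minimal sketch (the dictionary keys and variable names are illustrative, not taken from the study), they can be recomputed as follows:

```python
# Counts quoted in the abstract: positives out of 3,730 specimens.
total_specimens = 3730
positives = {
    "KIPyV": 453,
    "WUPyV": 63,
    "co-infection": 86,
}

# Percentage of all specimens, rounded to two decimals as in the abstract.
rates = {
    name: round(100 * count / total_specimens, 2)
    for name, count in positives.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
```

The recomputed values agree with the 12.14%, 1.69% and 2.31% given in the text.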
Analysis of the Genome and Chromium Metabolism-Related Genes of Serratia sp. S2.
Dong, Lanlan; Zhou, Simin; He, Yuan; Jia, Yan; Bai, Qunhua; Deng, Peng; Gao, Jieying; Li, Yingli; Xiao, Hong
2018-05-01
This study investigated the genome sequence of Serratia sp. S2. The genomic DNA of Serratia sp. S2 was extracted and a sequencing library was constructed. Sequencing was carried out by Illumina 2000, and the complete genomic sequence was obtained. Gene function annotation and bioinformatics analysis were performed by comparison with known databases. The genome size of Serratia sp. S2 was 5,604,115 bp and the G+C content was 57.61%. There were 5373 protein-coding genes, of which 3732, 3614, and 3942 were annotated in the GO, KEGG, and COG databases, respectively. There were 12 genes related to chromium metabolism in the Serratia sp. S2 genome. The whole genome sequence of Serratia sp. S2 has been submitted to the GenBank database under accession number LNRP00000000. Our findings may provide a theoretical basis for the subsequent development of new biotechnology to remediate environmental chromium pollution.
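The G+C content reported above is a simple base-composition statistic. The sketch below shows one common way to compute it; the function name and the short toy sequence are hypothetical and are not data from the Serratia sp. S2 genome.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator,
    so N and other ambiguity codes are ignored.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Hypothetical 10-base toy sequence, for illustration only.
toy = "ATGCGCGTTA"
print(f"G+C content: {gc_content(toy):.1f}%")
```

Whether ambiguous bases are excluded from the denominator is a design choice; the abstract does not say which convention the authors used.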
Genetic analysis of duck circovirus in Pekin ducks from South Korea.
Cha, S-Y; Kang, M; Cho, J-G; Jang, H-K
2013-11-01
The genetic organization of the 24 duck circovirus (DuCV) strains detected in commercial Pekin ducks from South Korea between 2011 and 2012 is described in this study. Multiple sequence alignment and phylogenetic analyses were performed on the 24 viral genome sequences as well as on 45 genome sequences available from the GenBank database. Phylogenetic analyses based on the genomic and open reading frame 2/cap sequences demonstrated that all DuCV strains belonged to genotype 1 and were designated in a subcluster under genotype 1. Analysis of the capsid protein amino acid sequences of the 24 Korean DuCV strains showed 10 substitutions compared with that of other genotype 1 strains. Our analysis showed that genotype 1 is predominant and circulating in South Korea. These present results serve as incentive to add more data to the DuCV database and provide insight to conduct further intensive study on the geographic relationships among these virus strains.
Kamarudin, Kamarul Rahim; Rehan, Maryam Mohamed
2015-01-01
This preliminary study aimed to identify a commercial gamat species, Stichopus horrens Selenka, 1867, and a timun laut species, Holothuria (Mertensiothuria) leucospilota (Brandt, 1835), from Pangkor Island, Perak, Malaysia, employing morphological techniques based on the shape of the ossicles and molecular techniques based on the cytochrome c oxidase I (COI) mitochondrial DNA (mtDNA) gene. In Malaysia, a gamat is defined as a sea cucumber species of the family Stichopodidae with medicinal value, and timun laut refers to non-gamat species. S. horrens is very popular on Pangkor Island as a main ingredient in the traditional production of air gamat and minyak gamat, while H. leucospilota is the most abundant species in Malaysia. In contrast to previous studies, internal body parts (the respiratory tree and gastrointestine) were examined in this study to obtain better inferences based on morphology. The results showed that there were no ossicles present in the gastrointestine of H. leucospilota, and this characteristic is suggested as a unique diagnostic marker for the timun laut species. In addition, the presence of Y-shaped rods in the respiratory tree of S. horrens subsequently supported the potential to use internal body parts to identify the gamat species. Phylogenetic analysis of the COI mtDNA gene of the sea cucumber specimens using the neighbour-joining method and maximum likelihood methods further confirmed the species status of H. leucospilota and S. horrens from Pangkor Island, Perak, Malaysia. The COI mtDNA gene sequences were registered with GenBank, National Center for Biotechnology Information (NCBI), US National Library of Medicine (GenBank accession no.: KC405565-KC405568). 
Although additional specimens from various localities will be required to produce more conclusive results, the current findings provide better insight into the importance of complementary approaches involving morphological and molecular techniques in the identification of the two Malaysian sea cucumber species. PMID:26868593
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
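The accession range and the clone tallies in this abstract are internally consistent, which a short script can confirm. This is a sketch only: the helper name is hypothetical, and it assumes accessions of the simple "letters followed by digits" form used in the abstract.

```python
import re


def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as U79240 to U79304."""
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first, m_last = pattern.fullmatch(first), pattern.fullmatch(last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same letter prefix")
    prefix, width = m_first.group(1), len(m_first.group(2))
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError("range end precedes range start")
    # Zero-pad to the original digit width so the format is preserved.
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


accessions = expand_accession_range("U79240", "U79304")

# Clone tallies quoted in the abstract.
no_match, exact, nonexact = 37, 12, 16

print(len(accessions), accessions[0], accessions[-1])
print(no_match + exact + nonexact, exact + nonexact)
```

The range expands to 65 accessions, matching the 65 analysed clones, and the exact plus nonexact matches give the 28 matched clones mentioned in the text.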
[Experimental and calculated spectra of the amplicons UBC-85 and UBC-126 (RAPD-PCR)].
Glazko, G V; Rogozin, I B; Glazko, V I; Zelenaia, L B; Sozinov, A A
1997-01-01
A comparative analysis was carried out of the experimental amplification spectra in 13 Ungulata species and the calculated spectra in GenBank DNA sequences of different taxa (mammals, other vertebrates, invertebrates, viruses, prokaryotes), using the RAPD-PCR primers UBC-85 and UBC-126. Particular features of the distribution of amplicon frequencies in the experimental and calculated spectra were revealed; for some amplicons, similarly increased frequencies were observed in mammalian and prokaryotic species.
Fourment, Mathieu; Gibbs, Mark J
2008-01-01
Background Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. Results The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. Conclusion VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically. PMID:18251994
Cooling towers--a potential environmental source of slow-growing mycobacterial species.
Black, Walter C; Berk, Sharon G
2003-01-01
Over the last decade a rise in the frequency of disease caused by nontuberculous mycobacteria (NTM) has occurred, especially among AIDS patients. The lack of evidence for person-to-person transmission indicates the environment is a source of infection. The ecology and environmental sources of NTMs are poorly understood, and many pathogenic strains have not been observed outside of clinical cases. Several species of NTMs have been reported from treated water distribution systems; however, one type of manmade environment that has not been examined for mycobacteria is that of cooling towers of air-conditioning systems. Such environments not only harbor a variety of microbial species, they also disseminate them in aerosols. The present investigation examined nine cooling towers from various locations in the United States. Cooling tower water was concentrated, treated with cetylpyridinium chloride, and plated onto Middlebrook 7H10 agar supplemented with OADC and cycloheximide. Colonies presumed to be mycobacterial species were isolated and acid-fast stained. Identification was made by amplifying and sequencing 1450 bp fragments of the 16S rRNA gene in both directions, and comparing resulting sequences with those in GenBank. Results showed that at least 75% of tower samples contained NTMs, and most of the isolates closely matched known mycobacterial pathogens. Isolates most closely matched the following GenBank sequences: Mycobacterium intracellulare, M. szulgai, M. bohemicum, M. gordonae, M. nonchromogenicum, and M. n. sp. "Fuerth 1999." This is the first report of specific NTMs in cooling tower water, and the first report of M. n. sp. "Fuerth 1999" from any environmental sample. Although cooling towers have a relatively high pH, they may favor the growth and dissemination of such potential pathogens, and future epidemiologic investigations should consider cooling towers as possible environmental sources of mycobacteria.
Purification and Characterization of Four β-Expansins (Zea m 1 Isoforms) from Maize Pollen
Li, Lian-Chao; Bedinger, Patricia A.; Volk, Carol; Jones, A. Daniel; Cosgrove, Daniel J.
2003-01-01
Four proteins with wall extension activity on grass cell walls were purified from maize (Zea mays) pollen by conventional column chromatography and high-performance liquid chromatography. Each is a basic glycoprotein (isoelectric point = 9.1–9.5) of approximately 28 kD and was identified by immunoblot analysis as an isoform of Zea m 1, the major group 1 allergen of maize pollen and member of the β-expansin family. Four distinctive cDNAs for Zea m 1 were identified by cDNA library screening and by GenBank analysis. One pair (GenBank accession nos. AY104999 and AY104125) was much closer in sequence to well-characterized allergens such as Lol p 1 and Phl p 1 from ryegrass (Lolium perenne) and Phleum pratense, whereas a second pair was much more divergent. The N-terminal sequence and mass spectrometry fingerprint of the most abundant isoform (Zea m 1d) matched that predicted for AY197353, whereas N-terminal sequences of the other isoforms matched or nearly matched AY104999 and AY104125. Highly purified Zea m 1d induced extension of a variety of grass walls but not dicot walls. Wall extension activity of Zea m 1d was biphasic with respect to protein concentration, had a broad pH optimum between 5 and 6, required more than 50 μg mL⁻¹ for high activity, and led to cell wall breakage after only approximately 10% extension. These characteristics differ from those of α-expansins. Some of the distinctive properties of Zea m 1 may not be typical of β-expansins as a class but may relate to the specialized function of this β-expansin in pollen function. PMID:12913162
New report of additional enterobacterial species causing wilt in West Bengal, India.
Sarkar, Shamayeeta; Chaudhuri, Sujata
2015-07-01
Ralstonia solanacearum is known to be the most prominent causal agent of bacterial wilt worldwide. It has a wide host range comprising solanaceous and nonsolanaceous plants. Typical symptoms of the disease are leaf wilt, browning of vascular tissues, and collapse of the plant. With the objective of studying the diversity of pathogens causing bacterial wilt in West Bengal, we collected samples of diseased symptomatic crops and adjacent symptomatic and asymptomatic weeds from widespread locations in West Bengal. By means of a routine molecular identification test specific to the "R. solanacearum species complex", the majority of these strains (68 out of 71) were found not to be R. solanacearum. These isolates were presumptively identified with conventional biochemical tests, the pathogenicity of a subset was tested extensively in greenhouse trials fulfilling Koch's postulates, and diseased plants were examined by scanning electron microscopy for the presence of the pathogen. 16S rDNA sequencing of a subset of these strains (GenBank accession Nos. JX880249-JX880251) and analysis of sequences with the nBLAST programme showed a high similarity (97%-99%) to sequences of the Enterobacteriaceae group available in GenBank. Molecular phylogeny further established the taxonomic position of the strains. The 3 bacterial strain cultures have been submitted to MTCC, Institute of Microbial Technology, Chandigarh, India, and were identified as Klebsiella oxytoca, Enterobacter cowanii, and Klebsiella oxytoca, respectively. Although Enterobacter sp. has previously been reported to cause wilt in many plants, susceptibility of most of the dedicated hosts of R. solanacearum to wilt caused by Enterobacter and other bacteria from Enterobacteriaceae is being reported for the first time in this work.
Molecular epidemiology and genetic diversity of Entamoeba species in a chelonian collection.
García, Gabriela; Ramos, Fernando; Pérez, Rodrigo Gutiérrez; Yañez, Jorge; Estrada, Mónica Salmerón; Mendoza, Lilian Hernández; Martinez-Hernandez, Fernando; Gaytán, Paul
2014-02-01
Veterinary medicine has recently focused on reptiles, owing to the existence of captive collections in zoos and an increase in the acquisition of reptiles as pets. The protozoan parasite Entamoeba can cause amoebiasis in various animal species and humans. Although amoebiasis is remarkably rare in most species of chelonians and crocodiles, these species may serve as Entamoeba carriers that transmit parasites to susceptible reptile species, such as snakes and lizards, which can become sick and die. In this study, we identified the Entamoeba species in a population of healthy (disease-free) chelonians, and evaluated their diversity through the amplification and sequencing of a small subunit rDNA region. Using this procedure, three Entamoeba species were identified: Entamoeba invadens in 4.76% of chelonians, Entamoeba moshkovskii in 3.96% and Entamoeba terrapinae in 50%. We did not detect mixed Entamoeba infections. Comparative analysis of the amplified region allowed us to determine the intra-species variations. The E. invadens and E. moshkovskii strains isolated in this study did not exhibit marked differences with respect to the sequences reported in GenBank. The analysis of the E. terrapinae isolates revealed three different subgroups (A, B and C). Although subgroups A and C were very similar, subgroup B showed a relatively marked difference with respect to subgroups A and C (Fst = 0.984 and Fst = 1.000, respectively; 10-14% nucleotide variation, as determined by BLAST) and with respect to the sequences reported in GenBank. These results suggested that E. terrapinae subgroup B may be either in a process of speciation or belong to a different lineage. However, additional research is necessary to support this statement conclusively.
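The "nucleotide variation" between subgroups mentioned above is the kind of quantity that can be expressed as a percent difference between aligned sequences. The sketch below is a simple pairwise calculation, not the BLAST procedure used in the study; the function name and the toy sequences are hypothetical.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of mismatching sites between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, which is
    an assumption of this sketch and not a convention from the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * mismatches / compared


# Hypothetical aligned fragments differing at 2 of 10 sites.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))
```

Real comparisons of rDNA regions would start from a proper alignment; this function only counts differences once the alignment exists.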
Networking Biology: The Origins of Sequence-Sharing Practices in Genomics.
Stevens, Hallam
2015-10-01
The wide sharing of biological data, especially nucleotide sequences, is now considered to be a key feature of genomics. Historians and sociologists have attempted to account for the rise of this sharing by pointing to precedents in model organism communities and in natural history. This article supplements these approaches by examining the role that electronic networking technologies played in generating the specific forms of sharing that emerged in genomics. The links between early computer users at the Stanford Artificial Intelligence Laboratory in the 1960s, biologists using local computer networks in the 1970s, and GenBank in the 1980s, show how networking technologies carried particular practices of communication, circulation, and data distribution from computing into biology. In particular, networking practices helped to transform sequences themselves into objects that had value as a community resource.
Wang, Qian; Hu, Chunjin; Ke, Fanggang; Huang, Siliang; Li, Qiqin
2010-09-01
Anthracnose caused by Colletotrichum gloeosporioides (Penz.) Sacc. is a major disease in citrus production. To develop an effective biocontrol measure against citrus postharvest anthracnose, we screened antagonistic microbes and obtained bacterial strain 1404 from the rhizospheric soil of chili plants in Nanning city, Guangxi, China. The objectives of the present study were to (1) identify and characterize the antagonistic bacterium and (2) evaluate the efficacy of the antagonistic strain in controlling citrus postharvest anthracnose. Strain 1404 was identified by comparing its 16S rDNA sequence with those of related bacteria from the GenBank database, as well as by analyzing its morphological, physiological and biochemical characters. The antagonistic stability of strain 1404 was determined by continuously transferring it on artificial media. The effect of the strain in suppressing citrus anthracnose at the postharvest stage was tested by the stab inoculation method. The 16S rDNA of strain 1404 was amplified with primers PF1 (5'-AGAGTTTGATCATGGCTCAG-3') and PR1 (5'-TACGGTTACCTTGTTACGACTT-3') and its sequence was submitted to GenBank (accession number: GU361113). Strain 1404 clustered with the GenBank-derived Brevibacillus brevis strains in the 16S-rDNA-sequence-based phylogenetic tree at the 100% bootstrap level. The morphological traits and the physiological and biochemical characters of strain 1404 agreed with those of Brevibacillus brevis. Little change in the ability of the antagonist to suppress the growth of Colletotrichum gloeosporioides was observed during four continuous transfers on artificial media. The average control efficacy of the strain against the disease was 64.9% 20 days after application of the antagonist. Strain 1404 was identified as Brevibacillus brevis based on its morphological traits, physiological and biochemical characters, and 16S rDNA sequence analysis. The antagonist proved to be a promising biocontrol agent. 
This is the first report of Brevibacillus brevis as an effective antagonist against citrus postharvest anthracnose disease.
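The two primer sequences quoted in this abstract can be inspected with a few basic calculations. The sketch below computes length, G+C percentage and reverse complement; the helper names are illustrative, and nothing here reproduces the authors' PCR conditions, which the abstract does not give.

```python
# Primer sequences as quoted in the abstract, written 5' to 3'.
PF1 = "AGAGTTTGATCATGGCTCAG"
PR1 = "TACGGTTACCTTGTTACGACTT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Return the G+C percentage of a primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


for name, primer in (("PF1", PF1), ("PR1", PR1)):
    print(name, len(primer), f"{gc_percent(primer):.1f}%")

# The reverse primer anneals to the opposite strand, so its reverse
# complement is what appears in the forward-strand 16S rDNA sequence.
print(reverse_complement(PR1))
```

Looking for PF1 and the reverse complement of PR1 in a forward-strand 16S rDNA sequence is one way to locate the expected amplicon boundaries.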
[Complete genome sequencing and sequence analysis of BCG Tice].
Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli
2012-10-04
The objective of this study was to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and to design more reasonable vaccines to prevent tuberculosis. We assembled the high-throughput sequencing data with SOAPdenovo software, obtaining many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification in order to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and constructing clones for sequencing, all the gaps were closed. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank at the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with a GC content of 65.65%. The problems encountered and the strategies used during the finishing step of BCG Tice sequencing are described here, in the hope of offering some experience to those involved in the finishing step of genome sequencing. The microarray data were verified by our results.
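The genome length and GC content reported above are simple summary statistics of the finished sequence. A minimal sketch of how such figures are typically computed follows; the function name and the toy fragment are hypothetical, and ambiguous symbols such as N are excluded from the denominator, which is one common convention rather than the authors' stated procedure.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted, so gaps or N
    symbols do not distort the ratio.
    """
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Hypothetical toy fragment, not taken from the BCG Tice genome.
toy_fragment = "GGCCGCGATCGGCCATGC"
print(len(toy_fragment), round(gc_content(toy_fragment), 2))
```

For a genome of several million base pairs the same function applies unchanged, since `str.count` scans the sequence in linear time.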
Integrated databanks access and sequence/structure analysis services at the PBIL.
Perrière, Guy; Combet, Christophe; Penel, Simon; Blanchet, Christophe; Thioulouse, Jean; Geourjon, Christophe; Grassot, Julien; Charavay, Céline; Gouy, Manolo; Duret, Laurent; Deléage, Gilbert
2003-07-01
The World Wide Web server of the PBIL (Pôle Bioinformatique Lyonnais) provides on-line access to sequence databanks and to many tools for nucleic acid and protein sequence analysis. This server allows users to query nucleotide sequence banks in the EMBL and GenBank formats and protein sequence banks in the SWISS-PROT and PIR formats. The query engine on which our data bank access is based is the ACNUC system. It makes it possible to build complex queries to access functional zones of biological interest and to retrieve large sequence sets. Of special interest are the unique features provided by this system to query the data banks of gene families developed at the PBIL. The server also provides access to a wide range of sequence analysis methods: similarity search programs, multiple alignments, protein structure prediction and multivariate statistics. An original feature of this server is the integration of these two aspects: sequence retrieval and sequence analysis. Indeed, thanks to the introduction of re-usable lists, it is possible to process large sets of data. The PBIL server can be reached at: http://pbil.univ-lyon1.fr.
Wide distribution range of rhizobial symbionts associated with pantropical sea-dispersed legumes.
Bamba, Masaru; Nakata, Sayuri; Aoki, Seishiro; Takayama, Koji; Núñez-Farfán, Juan; Ito, Motomi; Miya, Masaki; Kajita, Tadashi
2016-12-01
To understand the geographic distributions of rhizobia that associate with widely distributed wild legumes, 66 nodules obtained from 41 individuals including three sea-dispersed legumes (Vigna marina, Vigna luteola, and Canavalia rosea) distributed across the tropical and subtropical coastal regions of the world were studied. Partial sequences of 16S rRNA and nodC genes extracted from the nodules showed that only Bradyrhizobium and Sinorhizobium were associated with the pantropical legumes, and some of the symbiont strains were widely distributed over the Pacific. Horizontal gene transfer of nodulation genes was observed within the Bradyrhizobium and Sinorhizobium lineages. BLAST searches in GenBank also identified records of these strains from various legumes across the world, including crop species. However, one of the rhizobial strains was not found in GenBank, which implies the strain may have adapted to the littoral environment. Our results suggested that some rhizobia that associate with the widespread sea-dispersed legumes are distributed across a broad geographic range. By establishing symbiotic relationships with widely distributed rhizobia, the pantropical legumes may also be able to extend their range much further than other legume species.
Woma, Timothy Y; van Vuuren, Moritz; Bosman, Ana-Mari; Quan, Melvyn; Oosthuizen, Marinda
2010-07-14
There are no reports of CDV isolations in southern Africa, and although CDV is said to have geographically distinct lineages, molecular information of African strains has not yet been documented. Viruses isolated in cell cultures were subjected to reverse transcription-polymerase chain reaction (RT-PCR), and the complete H gene was sequenced and phylogenetically analysed with other strains from GenBank. Phylogenetic comparisons of the complete H gene of CDV isolates from different parts of the world (available in GenBank) with wild-type South African isolates revealed nine clades. All South African isolates form a separate African clade of their own and thus are clearly separated from the American, European, Asian, Arctic and vaccine virus clades. It is likely that only the 'African lineage' of CDV may be circulating in South Africa currently, and the viruses isolated from dogs vaccinated against CDV are not the result of reversion to virulence of vaccine strains, but infection with wild-type strains. (c) 2009 Elsevier B.V. All rights reserved.
Seligmann, Hervé
2013-05-07
GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
Dehydrin expression as a potential diagnostic tool for cold stress in white clover.
Vaseva, Irina Ivanova; Anders, Iwona; Yuperlieva-Mateeva, Bistra; Nenkova, Rosa; Kostadinova, Anelia; Feller, Urs
2014-05-01
Cold acclimation is important for crop survival in environments undergoing seasonal low temperatures. It involves the induction of defensive mechanisms including the accumulation of different cryoprotective molecules, among which are dehydrins (DHN). Recently several sequences coding for dehydrins were identified in white clover (Trifolium repens). This work aimed to select the DHN analogues most responsive to cold stress in search of cold stress diagnostic markers. The assessment of dehydrin transcript accumulation via RT-PCR and immunodetection performed with three antibodies against the conserved K-, Y-, and S-segments made it possible to outline the different dehydrin types present in the tested samples. Both analyses confirmed that YnKn dehydrins were underrepresented in the controls but exposure to low temperature specifically induced their accumulation. Strong immunosignals corresponding to 37-40 kDa with antibodies against the Y- and K-segments were revealed in cold-stressed leaves. Another 'cold-specific' band at position 52-55 kDa was documented on membranes probed with antibodies against the K-segment. Real time RT-qPCR confirmed that low temperatures induced the accumulation of SKn and YnSKn transcripts in leaves and reduced their expression in roots. Results suggest that a YnKn dehydrin transcript with GenBank ID: KC247805 and the immunosignal at 37-40 kDa, obtained with antibodies against the Y- and K-segments, are reliable markers for cold stress in white clover. The assessment of SKn (GenBank ID: EU846208) and YnSKn (GenBank ID: KC247804) transcript levels in leaves could serve as additional diagnostic tools. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
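To make the detrended fluctuation analysis mentioned in this abstract more concrete, the following is a minimal pure-Python sketch of first-order DFA as it is commonly described: build a "DNA walk", split it into boxes, remove a linear trend from each box, and fit the slope of log F(n) against log n. Everything here is an assumption made for illustration: the purine/pyrimidine step rule, the box sizes, and the function names are not taken from the abstract.

```python
import math


def dna_walk(sequence):
    """Cumulative sum of +1 (pyrimidine C/T) and -1 (purine A/G) steps,
    with the mean step removed. The step rule is an assumed convention."""
    steps = [1.0 if base in "CT" else -1.0
             for base in sequence.upper() if base in "ACGT"]
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:
        total += step - mean
        profile.append(total)
    return profile


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluctuation(profile, box):
    """Root-mean-square residual after removing a linear trend per box."""
    n_boxes = len(profile) // box
    if n_boxes == 0:
        raise ValueError("box size exceeds sequence length")
    xs = list(range(box))
    squared = 0.0
    for k in range(n_boxes):
        segment = profile[k * box:(k + 1) * box]
        slope, intercept = linear_fit(xs, segment)
        squared += sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, segment))
    return math.sqrt(squared / (n_boxes * box))


def dfa_exponent(sequence, boxes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) versus log n over the chosen box sizes."""
    profile = dna_walk(sequence)
    log_n = [math.log(b) for b in boxes]
    log_f = [math.log(fluctuation(profile, b)) for b in boxes]
    alpha, _ = linear_fit(log_n, log_f)
    return alpha
```

In this framework an exponent near 0.5 indicates an uncorrelated sequence, while values clearly above 0.5 indicate long-range correlation, which is the distinction the abstract draws between coding and non-coding regions.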
Wang, R F; Cao, W W; Cerniglia, C E
1996-01-01
In order to develop a PCR method to detect Fusobacterium prausnitzii in human feces and to clarify the phylogenetic position of this species, its 16S rRNA gene sequence was determined. The sequence described in this paper is different from the original GenBank sequence. A PCR assay based on the new 16S rRNA gene sequence is specific for F. prausnitzii, and the results of this assay confirmed that F. prausnitzii is the most common species in human feces. However, a PCR assay based on the original GenBank sequence was negative when it was performed with two strains of F. prausnitzii obtained from the American Type Culture Collection. A phylogenetic tree based on the new 16S rRNA gene sequence was constructed. On this tree F. prausnitzii was not a member of the Fusobacterium group but was closer to some Eubacterium spp. and located between Clostridium "clusters III and IV" (M.D. Collins, P.A. Lawson, A. Willems, J.J. Cordoba, J. Fernandez-Garayzabal, P. Garcia, J. Cai, H. Hippe, and J.A.E. Farrow, Int. J. Syst. Bacteriol. 44:812-826, 1994).
Detection of Plasmodium sp. in capybara.
dos Santos, Leonilda Correia; Curotto, Sandra Mara Rotter; de Moraes, Wanderlei; Cubas, Zalmir Silvino; Costa-Nascimento, Maria de Jesus; de Barros Filho, Ivan Roque; Biondo, Alexander Welker; Kirchgatter, Karin
2009-07-07
In the present study, we microscopically and molecularly surveyed blood samples from 11 captive capybaras (Hydrochaeris hydrochaeris) from the Sanctuary Zoo for Plasmodium sp. infection. One animal tested positive on blood smear by light microscopy. Polymerase chain reaction was carried out using a nested genus-specific protocol, which uses oligonucleotides from conserved sequences flanking a variable sequence region in the small subunit ribosomal RNA (ssrRNA) of all Plasmodium organisms. This revealed three positive animals. Products from two samples were purified and sequenced. The results showed less than 1% divergence between the two capybara sequences. When compared with GenBank sequences, a 55% similarity was obtained to Toxoplasma gondii and a higher similarity (73-77.2%) was found to ssrRNAs from Plasmodium species that infect reptiles, birds, rodents, and human beings. The most similar Plasmodium sequence was from Plasmodium mexicanum, which infects lizards of North America, where around 78% identity was found. This work is the first report of Plasmodium in capybaras, and due to the low similarity with other Plasmodium species, we suggest it is a new species, which in the future could be named "Plasmodium hydrochaeri".
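The similarity and divergence percentages quoted in this abstract are pairwise measures on aligned sequences. A minimal sketch of one common definition follows, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped; the function name and that gap convention are illustrative choices, not the authors' stated method.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical positions among gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def percent_divergence(aligned_a: str, aligned_b: str) -> float:
    """Complement of percent identity over the same columns."""
    return 100.0 - percent_identity(aligned_a, aligned_b)
```

Tools such as BLAST report identity over the locally aligned region only, so values from this whole-alignment definition can differ from database search output.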
Habitat-Lite: A GSC case study based on free text terms for environmental metadata
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kyrpides, Nikos; Hirschman, Lynette; Clark, Cheryl
2008-04-01
There is an urgent need to capture metadata on the rapidly growing number of genomic, metagenomic and related sequences, such as 16S ribosomal genes. This need is a major focus within the Genomic Standards Consortium (GSC), and Habitat is a key metadata descriptor in the proposed 'Minimum Information about a Genome Sequence' (MIGS) specification. The goal of the work described here is to provide a light-weight, easy-to-use (small) set of terms ('Habitat-Lite') that captures high-level information about habitat while preserving a mapping to the recently launched Environment Ontology (EnvO). Our motivation for building Habitat-Lite is to meet the needs of multiple users, such as annotators curating these data, database providers hosting the data, and biologists and bioinformaticians alike who need to search and employ such data in comparative analyses. Here, we report a case study based on semi-automated identification of terms from GenBank and GOLD. We estimate that the terms in the initial version of Habitat-Lite would provide useful labels for over 60% of the kinds of information found in the GenBank isolation-source field, and around 85% of the terms in the GOLD habitat field. We present a revised version of Habitat-Lite and invite the community's feedback on its further development in order to provide a minimum list of terms to capture high-level habitat information and to provide classification bins needed for future studies.
Meena, Ram Prasnna; Baranwal, V K
2016-09-01
Citrus trees harbor a large number of viral and bacterial pathogens. Citrus yellow vein clearing virus (CYVCV), Indian citrus ringspot virus (ICRSV), Citrus yellow mosaic virus (CYMV), Citrus tristeza virus (CTV) and a bacterium, Candidatus Liberibacter asiaticus (CLa), associated with huanglongbing (HLB) disease, are the most prevalent pathogens in citrus orchards of different regions in India and are responsible for debilitating citriculture. For detection of these viral and bacterial pathogens, a quick, sensitive and cost-effective detection method is required. With this objective, a multiplex polymerase chain reaction (mPCR) assay was developed for simultaneous detection of four viruses and a bacterium in citrus. Several sets of primers were designed for each virus based on reference sequences retrieved from GenBank. A previously published primer pair was used for the greening bacterium. Each pair of primers was evaluated for its sensitivity and differentiation by simplex and mPCR. The constant amplified products were identified on the basis of molecular size in mPCR and were compared with standard PCR. The amplicons were cloned and the results were confirmed by sequencing analysis. The mPCR assay was validated using field samples naturally infected with one or more citrus viruses and the huanglongbing bacterium. The mPCR assay developed here will aid in the production of virus-free planting materials and rapid indexing for the citrus budwood certification programme. Copyright © 2016 Elsevier B.V. All rights reserved.
Lupini, Caterina; Giovanardi, Davide; Pesente, Patrizia; Bonci, Michela; Felice, Viviana; Rossi, Giulia; Morandini, Emilio; Cecchinato, Mattia; Catelli, Elena
2016-08-01
A distinctive infectious bursal disease (IBD) virus genotype (ITA) was detected in IBD-live vaccinated broilers in Italy without clinical signs of IBD. It was isolated in specific-pathogen-free eggs and molecularly characterized in the hypervariable region of the virus protein (VP) 2. Phylogenetic analysis showed that ITA strains clustered separately from other homologous reference sequences of IBDVs, either classical or very virulent, retrieved from GenBank or previously reported in Italy, and from vaccine strains. The new genotype shows peculiar molecular characteristics in key positions of the VP2 hypervariable region, which affect charged or potentially glycosylated amino acids virtually associated with important changes in virus properties. Characterization of 41 IBDV strains detected in Italy between 2013 and 2014 showed that ITA is emergent in densely populated poultry areas of Italy, being 68% of the IBDV detections made during routine diagnostic activity over a two-year period, in spite of the immunity induced by large-scale vaccination. Four very virulent strains (DV86) and one classical strain (HPR2), together with eight vaccine strains, were also detected. The currently available epidemiological and clinical data do not allow the degree of pathogenicity of the ITA genotype to be defined. Only in vivo experimental pathogenicity studies conducted in secure isolation conditions, through the evaluation of clinical signs and macro/microscopic lesions, will clarify conclusively the virulence of the new Italian genotype.
Emergence of 2.1. subgenotype of classical swine fever virus in pig population of India in 2011.
Rajkhowa, T K; Hauhnar, Lalthapui; Lalrohlua, Isaac; Mohanarao G, Jagan
2014-01-01
Limited studies are available on molecular epidemiology of classical swine fever virus (CSFV) in India and are restricted to domestic pigs. These studies show the presence of 1.1. genotype. The aim of the present study was to subgenotype four CSFV isolates, two each from the outbreaks of CSF in wild (Sus scrofa) and domestic pigs of Mizoram state, India, in 2011. CSFV isolates were subjected to nucleotide sequencing in E2 and NS5B genomic regions. Phylogenetic analysis of the isolates in both genomic regions was carried out with 39 Indian isolates (4 isolates from the present study of Mizoram state and 35 isolates from the other states of India) and 57 reference sequences retrieved from the GenBank database. Two of the 39 isolates from India were collected from wild boar and were subgenotyped as 2.1. Out of 37 isolates from domestic pigs, only two were subgenotyped as 2.1. The analysis revealed the emergence of 2.1. subgenotype of CSFV in both wild and domestic pigs in India. The isolates from domestic pigs of Mizoram state (CSF/MZ/KOL/73 and CSF/MZ/AIZ/115) were grouped in genotype 1 and subgenotype 1.1., thus confirming that the source of CSF outbreaks in domesticated pigs in Mizoram was not from wild pigs. The current study forms an essential step for better understanding of the epidemiology of 2.1 subgroup as well as the movement and spread of the disease in India.
[New isolation methods and phylogenetic diversity of actinobacteria from hypersaline beach in Aksu].
Zhang, Yao; Xia, Zhanfeng; Cao, Xinbo; Li, Jun; Zhang, Lili
2013-08-04
We explored 4 new methods to improve the isolation of actinobacterial resources from high salt areas. Optimized media based on 4 new strategies were used for isolating actinobacteria from hypersaline beaches. Glycerin-arginine, trehalose-creatine, glycerol-aspartic acid, mannitol-casein, casein-mannitol, mannitol-alanine, chitosan-asparagine and Gauze's No. 1 were used as basic media. The new isolation strategy includes 4 methods: ten-fold dilution culture, simulation of the original environment, actinobacterial culture guided by the results of culture-independent molecular detection, and reference to actinobacterial media for brackish marine environments. The 16S rRNA genes of the isolates were amplified with bacterial universal primers. The 16S rRNA gene sequences were compared with sequences obtained from GenBank databases. We constructed a phylogenetic tree with the neighbor-joining method. No actinobacterial strains were isolated on the 8 media of the control group, while 403 strains were isolated by the new strategies. The isolates obtained by the new methods were members of 14 genera (Streptomyces, Streptomonospora, Saccharomonospora, Plantactinospora, Nocardia, Amycolatopsis, Glycomyces, Micromonospora, Nocardiopsis, Isoptericola, Nonomuraea, Thermobifida, Actinopolyspora, Actinomadura) of 10 families in 8 suborders. The most abundant and diverse isolates belonged to the two suborders Streptomycineae (69.96%) and Streptosporangineae (9.68%) within the phylum Actinobacteria, and included 9 potential novel species. The new isolation methods significantly improved the culturability of actinobacteria from hypersaline areas and yielded many potential novel species, providing a new and more effective way to isolate actinobacterial resources in hypersaline environments.
Wu, Haibo; Peng, Xiuming; Peng, Xiaorong; Cheng, Linfang; Lu, Xiangyun; Jin, Changzhong; Xie, Tiansheng; Yao, Hangping; Wu, Nanping
2015-12-01
The H4 subtype of the influenza virus was first isolated in 1999 from pigs with pneumonia in Canada. H4 avian influenza viruses (AIVs) are able to cross the species barrier to infect humans. In order to better understand the genetic relationships between H4 AIV strains circulating in Eastern China and other AIV strains from Asia, a survey of domestic ducks in live poultry markets was undertaken in Zhejiang province from 2013 to 2014. In this study, 23 H4N2 (n = 14) and H4N6 (n = 9) strains were isolated from domestic ducks, and all eight gene segments of these strains were sequenced and compared to reference AIV strains available in GenBank. The isolated strains clustered primarily within the Eurasian lineage. No mutations associated with adaptation to mammalian hosts or with drug resistance were observed. The H4 reassortant strains were found to be of low pathogenicity in mice and able to replicate in the lungs of mice without prior adaptation. Continued surveillance is required, given the important role of domestic ducks in reassortment events leading to new AIVs.
Ballados-González, G G; Sánchez-Montes, S; Romero-Salas, D; Colunga Salas, P; Gutiérrez-Molina, R; León-Paniagua, L; Becker, I; Méndez-Ojeda, M L; Barrientos-Salcedo, C; Serna-Lagunes, R; Cruz-Romero, A
2018-06-01
The genus Leptospira encompasses 22 species of spirochaetes, with ten pathogenic species that have been recorded in more than 160 mammals worldwide. In the last two decades, the number of records of these agents associated with bats has increased exponentially, particularly in America. Although the order Chiroptera represents the second most diverse order of mammals in Mexico, and leptospirosis represents a human and veterinary problem in the country, few studies have been conducted to identify potential wildlife reservoirs. The aim of this study was to detect the presence and diversity of Leptospira sp. in communities of bats in a state of Mexico where leptospirosis is endemic. From January to September 2016, 81 bats of ten species from three localities of Veracruz, Mexico, were collected with mist nets. Kidney samples were obtained from all specimens. For the detection of Leptospira sp., we amplified several genes using specific primers. Amplicons of the expected size were submitted to sequencing, and the sequences recovered were compared with reference sequences deposited in GenBank using the BLAST tool. To identify their phylogenetic position, we performed a reconstruction using the maximum-likelihood (ML) method. Twenty-five samples from three bat species (Artibeus lituratus, Choeroniscus godmani and Desmodus rotundus) showed the presence of Leptospira DNA. The sequences recovered were close to Leptospira noguchii, Leptospira weilii and Leptospira interrogans. Our results include the first record of Leptospira in bats from Mexico and reveal a high diversity of these pathogens circulating in the state. Due to the finding of a large number of positive wild animals, it is necessary to implement a surveillance system in populations of the positive bats as well as in related species, in order to understand their role as carriers of this bacterial genus. © 2018 Blackwell Verlag GmbH.
Shariffah-Muzaimah, S A; Idris, A S; Madihah, A Z; Dzolkhifli, O; Kamaruzzaman, S; Maizatul-Suriza, M
2017-12-18
Ganoderma boninense, the main causal agent of oil palm (Elaeis guineensis) basal stem rot (BSR), severely reduces oil palm yields around the world. To reduce reliance on fungicide applications to control BSR, we are investigating the efficacy of alternative control methods, such as the application of biological control agents. In this study, we used four Streptomyces-like actinomycetes (isolates AGA43, AGA48, AGA347 and AGA506) that had been isolated from the oil palm rhizosphere and screened for antagonism towards G. boninense in a previous study. The aim of this study was to characterize these four isolates and then to assess their ability to suppress BSR in oil palm seedlings when applied individually to the soil in a vermiculite powder formulation. Analysis of partial 16S rRNA gene sequences (512 bp) revealed that the isolates exhibited a very high level of sequence similarity (> 98%) with GenBank reference sequences. Isolates AGA347 and AGA506 showed 99% similarity with Streptomyces hygroscopicus subsp. hygroscopicus and Streptomyces ahygroscopicus, respectively. Isolates AGA43 and AGA48 also belonged to the Streptomyces genus. The most effective formulation, AGA347, reduced BSR in seedlings by 73.1%. Formulations using the known antifungal producer Streptomyces noursei, AGA043, AGA048 or AGA506 reduced BSR by 47.4, 30.1, 54.8 and 44.1%, respectively. This glasshouse trial indicates that these Streptomyces spp. show promise as potential biological control agents against Ganoderma in oil palm. Further investigations are needed to determine the mechanism of antagonism and to increase the shelf life of Streptomyces formulations.
Sugarcane Giant Borer Transcriptome Analysis and Identification of Genes Related to Digestion
de Assis Fonseca, Fernando Campos; Firmino, Alexandre Augusto Pereira; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; de Sousa Júnior, José Dijair Antonino; Silva-Junior, Orzenil Bonfim; Togawa, Roberto Coiti; Pappas, Georgios Joannis; de Góis, Luiz Avelar Brandão; da Silva, Maria Cristina Mattar; Grossi-de-Sá, Maria Fátima
2015-01-01
Sugarcane is a widely cultivated plant that serves primarily as a source of sugar and ethanol. Its annual yield can be significantly reduced by the action of several insect pests including the sugarcane giant borer (Telchin licus licus), a lepidopteran that has a long life cycle and against which pesticide-based control efforts have been ineffective. Despite its economic relevance, only a few DNA sequences are available for this species in GenBank. Pyrosequencing technology was used to investigate the transcriptome of several developmental stages of the insect. To maximize transcript diversity, a pool of total RNA was extracted from whole body insects and used to construct a normalized cDNA database. Sequencing produced over 650,000 reads, which were de novo assembled to generate a reference library of 23,824 contigs. After quality scoring and annotation, 43% of the contigs had at least one BLAST hit against the NCBI non-redundant database, and 40% showed similarities with the lepidopteran Bombyx mori. In a further analysis, we conducted a comparison with Manduca sexta midgut sequences to identify transcripts of genes involved in digestion. Of these transcripts, many presented an expansion or depletion in gene number compared to the B. mori genome. From the sugarcane giant borer (SGB) transcriptome, a number of aminopeptidase N (APN) cDNAs were characterized based on homology to those reported as Cry toxin receptors. This is the first report that provides a large-scale EST database for the species. Transcriptome analysis will certainly be useful to identify novel developmental genes, to better understand the insect's biology and to guide the development of new strategies for insect-pest control. PMID:25706301
Serrao, Natasha R; Steinke, Dirk; Hanner, Robert H
2014-01-01
Detecting and documenting the occurrence of invasive species outside their native range requires tools to support their identification. This can be challenging for taxa with diverse life stages and/or problematic or unresolved morphological taxonomies. DNA barcoding provides a potent method for identifying invasive species, as it allows for species identification at all life stages, including fragmentary remains. It also provides an efficient interim taxonomic framework for quantifying cryptic genetic diversity by parsing barcode sequences into discontinuous haplogroup clusters (typical of reproductively isolated species) and labelling them with unique alphanumeric identifiers. Snakehead fishes are a diverse group of opportunistic predators endemic to Asia and Africa that may potentially pose significant threats as aquatic invasive species. At least three snakehead species (Channa argus, C. maculata, and C. marulius) are thought to have entered North America through the aquarium and live-food fish markets, and have established populations, yet their origins remain unclear. The objectives of this study were to assemble a library of DNA barcode sequences derived from expert identified reference specimens in order to determine the identity and aid invasion pathway analysis of the non-indigenous species found in North America using DNA barcodes. Sequences were obtained from 121 tissue samples representing 25 species and combined with public records from GenBank for a total of 36 putative species, which then partitioned into 49 discrete haplogroups. Multiple divergent clusters were observed within C. gachua, C. marulius, C. punctata and C. striata suggesting the potential presence of cryptic species diversity within these lineages. Our findings demonstrate that DNA barcoding is a valuable tool for species identification in challenging and under-studied taxonomic groups such as snakeheads, and provides a useful framework for inferring invasion pathway analysis.
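The idea of parsing barcode sequences into discrete clusters can be illustrated with a deliberately simple stand-in: single-linkage grouping of aligned sequences whose uncorrected p-distance falls at or below a fixed threshold. This is not the clustering procedure used in the study, which the abstract does not specify; the 2% threshold, the function names and the toy sequences are all assumptions made for illustration.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_barcodes(sequences: dict, threshold: float = 0.02) -> list:
    """Single-linkage clustering: two sequences share a cluster if a chain of
    pairwise distances at or below the threshold connects them."""
    names = list(sequences)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if p_distance(sequences[first], sequences[second]) <= threshold:
                parent[find(first)] = find(second)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy barcodes, far shorter than real COI fragments.
toy = {
    "specimen_1": "ACGTACGTAC",
    "specimen_2": "ACGTACGTAT",
    "specimen_3": "TTTTTTTTTT",
}
print(cluster_barcodes(toy, threshold=0.1))
```

Single linkage is sensitive to intermediate sequences that chain distinct groups together, which is one reason dedicated barcode-clustering methods exist.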
Ennahar, Saïd; Cai, Yimin; Fujita, Yasuhito
2003-01-01
A total of 161 low-G+C-content gram-positive bacteria isolated from whole-crop paddy rice silage were classified and subjected to phenotypic and genetic analyses. Based on morphological and biochemical characters, these presumptive lactic acid bacterium (LAB) isolates were divided into 10 groups that included members of the genera Enterococcus, Lactobacillus, Lactococcus, Leuconostoc, Pediococcus, and Weissella. Analysis of the 16S ribosomal DNA (rDNA) was used to confirm the presence of the predominant groups indicated by phenotypic analysis and to determine the phylogenetic affiliation of representative strains. The virtually complete 16S rRNA gene was PCR amplified and sequenced. The sequences from the various LAB isolates showed high degrees of similarity to those of the GenBank reference strains (between 98.7 and 99.8%). Phylogenetic trees based on the 16S rDNA sequence displayed high consistency, with nodes supported by high bootstrap values. With the exception of one species, the genetic data was in agreement with the phenotypic identification. The prevalent LAB, predominantly homofermentative (66%), consisted of Lactobacillus plantarum (24%), Lactococcus lactis (22%), Leuconostoc pseudomesenteroides (20%), Pediococcus acidilactici (11%), Lactobacillus brevis (11%), Enterococcus faecalis (7%), Weissella kimchii (3%), and Pediococcus pentosaceus (2%). The present study, the first to fully document rice-associated LAB, showed a very diverse community of LAB with a relatively high number of species involved in the fermentation process of paddy rice silage. The comprehensive 16S rDNA-based approach to describing LAB community structure was valuable in revealing the large diversity of bacteria inhabiting paddy rice silage and enabling the future design of appropriate inoculants aimed at improving its fermentation quality. PMID:12514026
Ennahar, Saïd; Cai, Yimin; Fujita, Yasuhito
2003-01-01
A total of 161 low-G+C-content gram-positive bacteria isolated from whole-crop paddy rice silage were classified and subjected to phenotypic and genetic analyses. Based on morphological and biochemical characters, these presumptive lactic acid bacterium (LAB) isolates were divided into 10 groups that included members of the genera Enterococcus, Lactobacillus, Lactococcus, Leuconostoc, Pediococcus, and Weissella. Analysis of the 16S ribosomal DNA (rDNA) was used to confirm the presence of the predominant groups indicated by phenotypic analysis and to determine the phylogenetic affiliation of representative strains. The virtually complete 16S rRNA gene was PCR amplified and sequenced. The sequences from the various LAB isolates showed high degrees of similarity to those of the GenBank reference strains (between 98.7 and 99.8%). Phylogenetic trees based on the 16S rDNA sequence displayed high consistency, with nodes supported by high bootstrap values. With the exception of one species, the genetic data was in agreement with the phenotypic identification. The prevalent LAB, predominantly homofermentative (66%), consisted of Lactobacillus plantarum (24%), Lactococcus lactis (22%), Leuconostoc pseudomesenteroides (20%), Pediococcus acidilactici (11%), Lactobacillus brevis (11%), Enterococcus faecalis (7%), Weissella kimchii (3%), and Pediococcus pentosaceus (2%). The present study, the first to fully document rice-associated LAB, showed a very diverse community of LAB with a relatively high number of species involved in the fermentation process of paddy rice silage. The comprehensive 16S rDNA-based approach to describing LAB community structure was valuable in revealing the large diversity of bacteria inhabiting paddy rice silage and enabling the future design of appropriate inoculants aimed at improving its fermentation quality.
2013-01-01
Background APOAI, a member of the APOAI/CIII/IV/V gene cluster on chromosome 11q23-24, encodes a major protein component of HDL that has been associated with serum lipid levels. The aim of this study was to determine the genetic association of polymorphisms in the APOAI promoter region with plasma lipid levels in a cohort of healthy Kuwaiti volunteers. Methods A 435 bp region of the APOAI promoter was analyzed by re-sequencing in 549 Kuwaiti samples. DNA was extracted from blood taken from 549 healthy Kuwaiti volunteers who had fasted for the previous 12 h. Univariate and multivariate analysis was used to determine allele association with serum lipid levels. Results The target sequence included a partial segment of the promoter region, 5’UTR and exon 1 located between nucleotides −141 to +294 upstream of the APOAI gene on chromosome 11. No novel single nucleotide polymorphisms (SNPs) were observed. The sequences obtained were deposited with the NCBI GenBank with accession number [GenBank: JX438706]. The allelic frequencies for the three SNPs were as follows: APOAI rs670G = 0.807; rs5069C = 0.964; rs1799837G = 0.997 and found to be in HWE. A significant association (p < 0.05) was observed for the APOAI rs670 polymorphism with increased serum LDL-C. Multivariate analysis showed that APOAI rs670 was an independent predictive factor when controlling for age, sex and BMI for both LDL-C (OR: 1.66, p = 0.014) and TC (OR: 1.77, p = 0.006) levels. Conclusion This study is the first to report sequence analysis of the APOAI promoter in an Arab population. The unexpected positive association found between the APOAI rs670 polymorphism and increased levels of LDL-C and TC may be due to linkage disequilibrium with other polymorphisms in candidate and neighboring genes known to be associated with lipid metabolism and transport. PMID:24028463
Al-Bustan, Suzanne A; Al-Serri, Ahmad E; Annice, Babitha G; Alnaqeeb, Majed A; Ebrahim, Ghada A
2013-09-12
APOAI, a member of the APOAI/CIII/IV/V gene cluster on chromosome 11q23-24, encodes a major protein component of HDL that has been associated with serum lipid levels. The aim of this study was to determine the genetic association of polymorphisms in the APOAI promoter region with plasma lipid levels in a cohort of healthy Kuwaiti volunteers. A 435 bp region of the APOAI promoter was analyzed by re-sequencing in 549 Kuwaiti samples. DNA was extracted from blood taken from 549 healthy Kuwaiti volunteers who had fasted for the previous 12 h. Univariate and multivariate analysis was used to determine allele association with serum lipid levels. The target sequence included a partial segment of the promoter region, 5'UTR and exon 1 located between nucleotides -141 to +294 upstream of the APOAI gene on chromosome 11. No novel single nucleotide polymorphisms (SNPs) were observed. The sequences obtained were deposited with the NCBI GenBank with accession number [GenBank: JX438706]. The allelic frequencies for the three SNPs were as follows: APOAI rs670G = 0.807; rs5069C = 0.964; rs1799837G = 0.997 and found to be in HWE. A significant association (p < 0.05) was observed for the APOAI rs670 polymorphism with increased serum LDL-C. Multivariate analysis showed that APOAI rs670 was an independent predictive factor when controlling for age, sex and BMI for both LDL-C (OR: 1.66, p = 0.014) and TC (OR: 1.77, p = 0.006) levels. This study is the first to report sequence analysis of the APOAI promoter in an Arab population. The unexpected positive association found between the APOAI rs670 polymorphism and increased levels of LDL-C and TC may be due to linkage disequilibrium with other polymorphisms in candidate and neighboring genes known to be associated with lipid metabolism and transport.
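The statement that the SNPs were "found to be in HWE" refers to a Hardy-Weinberg equilibrium check. A minimal sketch of that arithmetic follows, assuming a biallelic locus and a one-degree-of-freedom chi-square goodness-of-fit test. The abstract reports allele frequencies but not genotype counts or the test the authors used, so the function names here are illustrative and no result for the study data is implied.

```python
def hwe_expected(p: float, n: int) -> tuple[float, float, float]:
    """Expected genotype counts (AA, Aa, aa) for allele frequency p in n individuals."""
    q = 1.0 - p
    return (p * p * n, 2 * p * q * n, q * q * n)


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic comparing observed genotype counts with HWE expectations.

    The allele frequency is estimated from the observed counts, which leaves
    one degree of freedom for a biallelic locus.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    expected = hwe_expected(p, n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Expected genotype counts for the reported rs670 G allele frequency in 549 samples.
expected_rs670 = hwe_expected(0.807, 549)
```

A statistic below 3.84 (the 0.05 critical value for one degree of freedom) would be consistent with equilibrium. Rare alleles such as rs1799837 give very small expected counts for one homozygote, in which case an exact test is usually preferred over chi-square.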
Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures
Pride, David T; Schoenfeld, Thomas
2008-01-01
Background Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. Results From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. Conclusion That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis. PMID:18798991
Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures.
Pride, David T; Schoenfeld, Thomas
2008-09-17
Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis.
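A genome signature of the kind described above is, in essence, a vector of oligonucleotide frequencies. The sketch below assumes tetranucleotides as the default word size, a Manhattan distance between frequency vectors, and nearest-reference assignment. The abstract does not give the word size, distance measure, or classification rule used in GSPC, so all of these are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def signature(seq: str, k: int = 4) -> list[float]:
    """Relative frequencies of all 4**k oligonucleotides of length k in seq."""
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)  # ignores words containing N etc.
    if total == 0:
        raise ValueError("sequence too short or contains no unambiguous k-mers")
    return [counts[w] / total for w in words]


def signature_distance(a: str, b: str, k: int = 4) -> float:
    """Manhattan distance between the k-mer frequency vectors of two sequences."""
    return sum(abs(x - y) for x, y in zip(signature(a, k), signature(b, k)))


def classify(fragment: str, references: dict[str, str], k: int = 4) -> str:
    """Name of the reference whose signature is closest to the fragment's."""
    return min(references, key=lambda name: signature_distance(fragment, references[name], k))


# Toy references (hypothetical); real use would compare against sequenced genomes.
refs = {"AT_rich": "ATATATATATATATATATAT", "GC_rich": "GCGCGCGCGCGCGCGCGCGC"}
best = classify("ATATATATAT", refs, k=2)
```

Because the comparison uses composition rather than alignment, a fragment can be assigned to a reference even when a BLAST search finds no significant homolog.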
Wik, Lotta; Mikko, Sofia; Klingeborn, Mikael; Stéen, Margareta; Simonsson, Magnus; Linné, Tommy
2012-01-01
The prion protein (PrP) sequence of European moose, reindeer, roe deer and fallow deer in Scandinavia has high homology to the PrP sequence of North American cervids. Variants in the European moose PrP sequence were found at amino acid position 109 as K or Q. The 109Q variant is unique in the PrP sequence of vertebrates. During the 1980s a wasting syndrome in Swedish moose, Moose Wasting Syndrome (MWS), was described. SNP analysis demonstrated a difference in the observed genotype proportions of the heterozygous Q/K and homozygous Q/Q variants in the MWS animals compared with the healthy animals. In MWS moose the allele frequencies for 109K and 109Q were 0.73 and 0.27, respectively, and for healthy animals 0.69 and 0.31. Both alleles were seen as heterozygotes and homozygotes. In reindeer, PrP sequence variation was demonstrated at codon 176 as D or N and codon 225 as S or Y. The PrP sequences in roe deer and fallow deer were identical with published GenBank sequences. PMID:22441661
Sequence Polishing Library (SPL) v10.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oberortner, Ernst
The Sequence Polishing Library (SPL) is a suite of software tools that automates "Design for Synthesis and Assembly" workflows. Specifically: The SPL "Converter" tool converts files among the following sequence data exchange formats: CSV, FASTA, GenBank, and Synthetic Biology Open Language (SBOL). The SPL "Juggler" tool optimizes the codon usage of DNA coding sequences according to an optimization strategy, a user-specific codon usage table and a genetic code. In addition, the SPL "Juggler" can translate amino acid sequences into DNA sequences. The SPL "Polisher" verifies DNA sequences against DNA synthesis constraints, such as GC content, repeating k-mers, and restriction sites. In case of violations, the "Polisher" reports the violations in a comprehensive manner. The "Polisher" tool can also modify the violating regions according to an optimization strategy, a user-specific codon usage table and a genetic code. The SPL "Partitioner" decomposes large DNA sequences into smaller building blocks with partial overlaps that enable an efficient assembly. The "Partitioner" enables the user to configure the characteristics of the overlaps, such as length, GC content, or melting temperature, which are mostly determined by the utilized assembly protocol.
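The kind of constraint checking attributed to the "Polisher" can be sketched in a few lines. This is not SPL's API: the function names, the default GC window of 0.3 to 0.7, and the k-mer length are hypothetical choices made for illustration. The EcoRI (GAATTC) and BsaI (GGTCTC) recognition sequences are standard.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def repeated_kmers(seq: str, k: int) -> set[str]:
    """All k-mers that occur more than once in the sequence."""
    seq = seq.upper()
    seen: set[str] = set()
    repeated: set[str] = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            repeated.add(kmer)
        seen.add(kmer)
    return repeated


def find_sites(seq: str, sites: dict[str, str]) -> dict[str, list[int]]:
    """0-based positions of each recognition site, checking both strands."""
    seq = seq.upper()
    hits: dict[str, list[int]] = {}
    for name, site in sites.items():
        patterns = {site.upper(), reverse_complement(site)}
        positions = sorted(
            i for i in range(len(seq)) for p in patterns if seq.startswith(p, i)
        )
        if positions:
            hits[name] = positions
    return hits


def check_sequence(seq, gc_range=(0.3, 0.7), k=8, sites=None) -> list[str]:
    """Human-readable list of constraint violations (empty if none)."""
    violations = []
    gc = gc_content(seq)
    if not gc_range[0] <= gc <= gc_range[1]:
        violations.append(f"GC content {gc:.2f} outside {gc_range}")
    repeats = repeated_kmers(seq, k)
    if repeats:
        violations.append(f"repeated {k}-mers: {sorted(repeats)}")
    for name, positions in find_sites(seq, sites or {}).items():
        violations.append(f"{name} site at {positions}")
    return violations


report = check_sequence("AAGAATTCAA", k=4, sites={"EcoRI": "GAATTC"})
```

The sketch only reports violations. Repairing them, as the description says the "Polisher" can, would additionally require synonymous codon substitution against a codon usage table.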
Suthar, Jaydipbhai
2016-01-01
Pseudoterranovosis is a well-known human disease caused by anisakid larvae belonging to the genus Pseudoterranova. Human infection occurs after consuming infected fish. Hence the presence of Pseudoterranova larvae in the flesh of the fish can cause serious losses and problems for the seafood, fishing and fisheries industries. The accurate identification of Pseudoterranova larvae in fish is important, but challenging because the larval stages of a number of different genera, including Pseudoterranova, Terranova and Pulchrascaris, look similar and cannot be differentiated from each other using morphological criteria, hence they are all referred to as Terranova larval type. Given that Terranova larval types in seafood are not necessarily Pseudoterranova and may not be dangerous, the aim of the present study was to investigate the occurrence of Terranova larval types in Australian marine fish and to determine their specific identity. A total of 137 fish belonging to 45 species were examined. Terranova larval types were found in 13 species, some of which were popular edible fish in Australia. The sequences of the first and second internal transcribed spacers (ITS-1 and ITS-2 respectively) of the Terranova larvae in the present study showed a high degree of similarity suggesting that they all belong to the same species. Due to the lack of a comparable sequence data of a well identified adult in the GenBank database the specific identity of Terranova larval type in the present study remains unknown. The sequence of the ITS regions of the Terranova larval type in the present study and those of Pseudoterranova spp. available in GenBank are significantly different, suggesting that larvae found in the present study do not belong to the genus Pseudoterranova, which is zoonotic. 
This study does not rule out the presence of Pseudoterranova larvae in Australian fish as Pseudoterranova decipiens E has been reported in adult form from seals in Antarctica and it is known that they have seasonal presence in Australian southern coasts. The genetic distinction of Terranova larval type in the present study from Pseudoterranova spp. along with the presence of more species of elasmobranchs in Australian waters (definitive hosts of Terranova spp. and Pulchrascaris spp.) than seals (definitive hosts of Pseudoterranova spp.) suggest that Terranova larval type in the present study belong to either genus Terranova or Pulchrascaris, which are not known to cause disease in humans. The present study provides essential information that could be helpful to identify Australian Terranova larval types in future studies. Examination and characterisation of further specimens, especially adults of Terranova and Pulchrascaris, is necessary to fully elucidate the identity of these larvae. PMID:27014510
Wang, Yan; Ma, Yan; Xu, Xiaoting; Hao, Shuang; Han, Yue; Yao, Wenqing; Zhao, Zhuo
2015-07-01
We sought to characterize the hemagglutinin (H) gene of the measles virus epidemic strain H1a in Liaoning Province (China) from 1997-2014 to provide a basis for the control and elimination of measles. All 63 measles virus strains were the H1a genotype. Fragments of the H gene (1854 nucleotides) and nucleoprotein (N) gene (450 nucleotides) were amplified by reverse transcription-polymerase chain reaction (RT-PCR) and the PCR products sequenced and analyzed. Phylogenetic trees were constructed with reference strains of the genotype-H measles virus downloaded from GenBank, including Chinese measles virus strains isolated in 1993-1994 and the vaccine reference strains S-191 and C-47. Sixty-three strains of the measles virus in 1997-2014 belonged to genotype H1a. The mean evolutionary rate for gene N-450 was higher than that for the H gene. All 63 strains of the measles virus were mutated from: serine (Ser, S) to asparagine (Asn, N) at the 240th amino acid; arginine (Arg, R) to glycine (Gly, G) at the 243rd; and tyrosine (Tyr, Y) to Asn (N) at the 481st amino acid. All measles virus strains in cluster 2 were mutated from proline (Pro, P) to leucine (Leu, L) at the 397th amino acid. The other neutralization sites showed no apparent difference when compared with the nucleotides/amino acids of the H gene of the S-191 vaccine strain.
Quan, Phenix-Lan; Junglen, Sandra; Tashmukhamedova, Alla; Conlan, Sean; Hutchison, Stephen K.; Kurth, Andreas; Ellerbrok, Heinz; Egholm, Michael; Briese, Thomas; Leendertz, Fabian H.; Ian Lipkin, W
2009-01-01
Characterization of arboviruses at the interface of pristine habitats and anthropogenic landscapes is crucial to comprehensive emergent disease surveillance and forecasting efforts. In the context of a surveillance campaign in and around a West African rainforest, particles morphologically consistent with rhabdoviruses were identified in cell cultures infected with homogenates of trapped mosquitoes. RNA recovered from these cultures was used to derive the first complete genome sequence of a rhabdovirus isolated from Culex decens mosquitoes in Côte d’Ivoire, tentatively named Moussa virus (MOUV). MOUV shows the classical genome organization of rhabdoviruses, with five open reading frames (ORF) in a linear order. However, sequences show only limited conservation (12–33% identity at amino acid level), and ORF2 and ORF3 have no significant similarity to sequences deposited in GenBank. Phylogenetic analysis indicates a potential new species with distant relationship to Tupaia and Tibrogargan virus. PMID:19804801
[Integrated DNA barcoding database for identifying Chinese animal medicine].
Shi, Lin-Chun; Yao, Hui; Xie, Li-Fang; Zhu, Ying-Jie; Song, Jing-Yuan; Zhang, Hui; Chen, Shi-Lin
2014-06-01
To construct an integrated DNA barcoding database for identifying Chinese animal medicine, the authors and their collaborators carried out extensive research on identifying Chinese animal medicines using DNA barcoding technology. Sequences from GenBank were analyzed at the same time. Three different methods, BLAST, barcoding gap and tree building, were used to confirm the reliability of the barcode records in the database. The integrated DNA barcoding database for identifying Chinese animal medicine was constructed from three parts: specimen, sequence and literature information. The database contains about 800 animal medicines together with their adulterants and closely related species. Unknown specimens can be identified by pasting their sequence record into the window on the ID page of the species identification system for traditional Chinese medicine (www.tcmbarcode.cn). The integrated DNA barcoding database for identifying Chinese animal medicine is of considerable importance for animal species identification, conservation of rare and endangered species, and sustainable utilization of animal resources.
[Molecular identification of medicinal plant genus Uncaria in Guizhou].
Gang, Tao; Liu, Tao; Zhu, Ying; Liu, Zuo-Yi
2008-06-01
To analyze the rDNA ITS regions of medicinal plants of the genus Uncaria in Guizhou and construct their phylogenetic tree, in order to provide molecular evidence for the taxonomy and identification of these medicinal plants at the genetic level. The ITS gene fragments of the 4 medicinal plants were PCR amplified and sequenced. The rDNA ITS regions were analyzed with the software ClustalX, BioEdit and PAUP* 4.0 beta 10. The entire sequences of rDNA ITS1, ITS2, and 5.8S rDNA were obtained. A maximum-parsimony tree was constructed from the four ITS regions together with similar sequences from GenBank, with Mitragyna rubrostipulata (AJ492621) and Mitragyna rubrostipulata (AJ605988) designated as the outgroup. The 4 medicinal plants are 4 species of the genus Uncaria, and are most similar to Uncaria rhynchophylla.
Quan, Phenix-Lan; Junglen, Sandra; Tashmukhamedova, Alla; Conlan, Sean; Hutchison, Stephen K; Kurth, Andreas; Ellerbrok, Heinz; Egholm, Michael; Briese, Thomas; Leendertz, Fabian H; Lipkin, W Ian
2010-01-01
Characterization of arboviruses at the interface of pristine habitats and anthropogenic landscapes is crucial to comprehensive emergent disease surveillance and forecasting efforts. In the context of a surveillance campaign in and around a West African rainforest, particles morphologically consistent with rhabdoviruses were identified in cell cultures infected with homogenates of trapped mosquitoes. RNA recovered from these cultures was used to derive the first complete genome sequence of a rhabdovirus isolated from Culex decens mosquitoes in Côte d'Ivoire, tentatively named Moussa virus (MOUV). MOUV shows the classical genome organization of rhabdoviruses, with five open reading frames (ORF) in a linear order. However, sequences show only limited conservation (12-33% identity at amino acid level), and ORF2 and ORF3 have no significant similarity to sequences deposited in GenBank. Phylogenetic analysis indicates a potential new species with distant relationship to Tupaia and Tibrogargan virus.
Locating and Activating Molecular ‘Time Bombs’: Induction of Mycolata Prophages
Dyson, Zoe A.; Brown, Teagan L.; Farrar, Ben; Doyle, Stephen R.; Tucci, Joseph; Seviour, Robert J.; Petrovski, Steve
2016-01-01
Little is known about the prevalence, functionality and ecological roles of temperate phages for members of the mycolic acid producing bacteria, the Mycolata. While many lytic phages infective for these organisms have been isolated, and assessed for their suitability for use as biological control agents of activated sludge foaming, no studies have investigated how temperate phages might be induced for this purpose. Bioinformatic analysis using the PHAge Search Tool (PHAST) on Mycolata whole genome sequence data in GenBank for members of the genera Gordonia, Mycobacterium, Nocardia, Rhodococcus, and Tsukamurella revealed 83% contained putative prophage DNA sequences. Subsequent prophage inductions using mitomycin C were conducted on 17 Mycolata strains. This led to the isolation and genome characterization of three novel Caudovirales temperate phages, namely GAL1, GMA1, and TPA4, induced from Gordonia alkanivorans, Gordonia malaquae, and Tsukamurella paurometabola, respectively. All possessed highly distinctive dsDNA genome sequences. PMID:27487243
Gautum, K K; Raj, R; Kumar, S; Raj, S K; Roy, R K; Katiyar, R
2014-01-01
The complete RNA3 genome of Cucumber mosaic virus (CMV) was amplified by RT-PCR from three infected gerbera (Gerbera jamesonii) leaf samples exhibiting severe chlorotic mosaic and flower deformation symptoms. The amplicons obtained were cloned, sequenced and deposited in GenBank under the accessions JN692495, JX913531 (from cv. Zingaro) and JX888093 (from cv. Silvester). These sequences shared 98-99 % identity with each other and with a strain of CMV-Banana reported from India, and 90-95 % identity with various strains of CMV reported worldwide. Phylogenetic analysis revealed their closest affinity with the CMV-Banana strain, and close relationships with several other strains of CMV of subgroup IB. This study provides evidence of subgroup IB CMV causing severe chlorosis and flower deformation in two cultivars (Zingaro and Silvester) of G. jamesonii in India.
Amino acid sequence of the Amur tiger prion protein.
Wu, Changde; Pang, Wanyong; Zhao, Deming
2006-10-01
Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that polymorphisms within the open reading frame (ORF) of PrP are associated with susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to that of the house cat (Felis catus) and sheep, but differs significantly from two other cat Prnp sequences that were previously deposited in GenBank.
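The link between the nucleotide positions and the codon numbers quoted above is simple integer arithmetic. The sketch below assumes positions are numbered from the first base of the ORF start codon, an assumption that is consistent with the reported codons 171 and 204. The function name is illustrative.

```python
def codon_of(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based ORF nucleotide position to (codon number, position within codon),
    both 1-based."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


for snp, pos in [("T423C", 423), ("A501G", 501), ("C511A", 511), ("A610G", 610)]:
    codon, offset = codon_of(pos)
    print(f"{snp}: codon {codon}, base {offset} of the codon")
# T423C: codon 141, base 3 of the codon
# A501G: codon 167, base 3 of the codon
# C511A: codon 171, base 1 of the codon
# A610G: codon 204, base 1 of the codon
```

Under this numbering the two substitutions reported to change an amino acid fall on the first base of their codons, while the other two fall on third bases, where changes are often synonymous.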
Fayer, R.; Santin, M.; Trout, J.M.; DeStefano, S.; Koenen, K.; Kaur, T.
2006-01-01
Feces from 62 beavers (Castor canadensis) in Massachusetts were examined by fluorescence microscopy (IFA) and polymerase chain reaction (PCR) for Microsporidia species, Cryptosporidium spp., and Giardia spp. between January 2002 and December 2004. PCR-positive specimens were further examined by gene sequencing. Protist parasites were detected in 6.4% of the beavers. All were subadults and kits. Microsporidia species were not detected. Giardia spp. was detected by IFA from four beavers; Cryptosporidium spp. was also detected by IFA from two of these beavers. However, gene sequence data for the ssrRNA gene from these two Cryptosporidium spp.-positive beavers were inconclusive in identifying the species. Nucleotide sequences of the TPI, ssrRNA, and β-giardin genes for Giardia spp. (deposited in GenBank) indicated that the four beavers were excreting Giardia duodenalis Assemblage B, the zoonotic genotype representing a potential source of waterborne Giardia spp. cysts. Copyright 2006 by American Association of Zoo Veterinarians.
Evidence for a vast peptide overlap between West Nile virus and human proteomes.
Capone, Giovanni; Pagoni, Maria; Delfino, Antonella Pesce; Kanduc, Darja
2013-10-01
The primary amino acid sequence of West Nile virus (WNV) polyprotein, GenBank accession number M12294, was analyzed by computational biology. WNV is a mosquito-borne neurotropic flavivirus that has emerged globally as a significant cause of viral encephalitis in humans. Using pentapeptides as scanning units and the perfect peptide match program from PIR International Protein Sequence Database, we compared the WNV polyprotein and the human proteome. WNV polyprotein showed significant sequence similarities to a number of human proteins. Several of these proteins are involved in embryogenesis, neurite outgrowth, cortical neuron branching, formation of mature synapses, semaphorin interactions, and voltage dependent L-type calcium channel subunits. The biocomputational study suggests that common amino acid segments might represent a potential platform for further studies on the neurological pathophysiology of WNV infections. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
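Scanning with pentapeptides amounts to sliding a five-residue window along the query and looking for exact matches in each target protein. The sketch below is a stand-in for that idea using Python sets. It is not the PIR perfect peptide match program, and the function names and toy sequences are hypothetical.

```python
def peptides(seq: str, k: int = 5) -> set[str]:
    """All overlapping peptides of length k in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_peptides(query: str, proteome: dict[str, str], k: int = 5) -> dict[str, set[str]]:
    """For each protein sharing at least one exact k-mer with the query,
    return the set of shared k-mers."""
    query_peptides = peptides(query, k)
    hits: dict[str, set[str]] = {}
    for name, sequence in proteome.items():
        common = query_peptides & peptides(sequence, k)
        if common:
            hits[name] = common
    return hits


# Toy sequences (hypothetical), standing in for a viral polyprotein and a proteome.
toy_proteome = {"protein_1": "GGTAYIAGG", "protein_2": "WWWWWWW"}
matches = shared_peptides("MKTAYIAKQR", toy_proteome)
```

Here only protein_1 is reported, through the shared pentapeptide TAYIA. With 20 amino acids there are 3.2 million possible pentapeptides, so some overlap between a long polyprotein and a whole proteome is expected by chance, which matters when interpreting the number of matches.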
Wang, Kaicheng; Lu, Chengping
2007-01-01
A total of 36 streptococcal strains, including seven S. equi ssp. zooepidemicus, two S. suis type 1 (SS1), 24 SS2, two SS9, and one SS7, were tested for the glyceraldehyde-3-phosphate dehydrogenase gene (gapdh). Except for the non-virulent SS2 strain T15, all strains harboured gapdh. The gapdh of Chinese Sichuan SS2 isolate ZY05719 and Jiangsu SS2 isolate HA9801 were sequenced and then compared with published sequences in GenBank. The comparison revealed a 99.9 % and 99.8 % similarity of ZY05719 and HA9801, respectively, with the published sequence. Adherence assay data demonstrated a significant (p<0.05) reduction in adhesion of SS2 in HEp-2 cells pre-incubated with purified GAPDH compared to non pre-incubated controls, suggesting that GAPDH mediates SS2 bacterial adhesion to host cells.
Liao, Ai-Jun; Su, Qi; Wang, Xun; Zeng, Bin; Shi, Wei
2008-01-01
AIM: To isolate and analyze the DNA sequences which are methylated differentially between gastric cancer and normal gastric mucosa. METHODS: The differentially methylated DNA sequences between gastric cancer and normal gastric mucosa were isolated by methylation-sensitive representational difference analysis (MS-RDA). Similarities between the separated fragments and the human genomic DNA were analyzed with Basic Local Alignment Search Tool (BLAST). RESULTS: Three differentially methylated DNA sequences were obtained, two of which have been accepted by GenBank. The accession numbers are AY887106 and AY887107. AY887107 was highly similar to the 11th exon of LOC440683 (98%), 3’ end of LOC440887 (99%), and promoter and exon regions of DRD5 (94%). AY887106 was consistent (98%) with a CpG island in ribosomal RNA isolated from colorectal cancer by Minoru Toyota in 1999. CONCLUSION: The methylation degree is different between gastric cancer and normal gastric mucosa. The differentially methylated DNA sequences can be isolated effectively by MS-RDA. PMID:18322944
Kudlai, Olena; Kostadinova, Aneta; Pulis, Eric E; Tkach, Vasyl V
2015-03-01
Drepanocephalus auritus n. sp. is described based on specimens from the double-crested cormorant Phalacrocorax auritus (Lesson) in North America. The new species differs from its congeners in its very narrow, elongate body, long uterine field and widely separated testes. Sequences of the nuclear rRNA gene cluster, spanning the 3' end of the nuclear ribosomal 18S rRNA gene, internal transcribed spacer region (ITS1+5.8S gene+ITS2) and partial 28S gene (2,345 bp), were identical in specimens collected from North Dakota, Minnesota and Mississippi, USA. Sequences of the 651 bp long fragment of the mitochondrial cox1 gene exhibited very low intraspecific variability (< 1%). Comparisons of the newly-generated sequences with those available in the GenBank indicate that the sequences from North America published under the name D. spathans Dietz, 1909 in fact represent D. auritus n. sp.
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.
Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine
2007-07-01
Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
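Step (ii) above, extracting spacers once a direct repeat (DR) is known, can be illustrated with a minimal sketch. This is not CRISPRFinder's code: the function name, the toy locus, and the assumption that every DR copy is an exact match are hypothetical simplifications.

```python
def extract_spacers(locus: str, direct_repeat: str) -> list:
    """Return the sequences lying between consecutive copies of the direct repeat."""
    parts = locus.split(direct_repeat)
    # parts[0] and parts[-1] are the flanking sequences outside the repeat
    # array; every piece in between is a spacer.
    return parts[1:-1]


# Hypothetical toy locus: flank + (DR + spacer) x 2 + DR + flank.
DR = "GTTCCATCGGCAAACCGATGGAAC"  # 24 bp, inside the 23-47 bp range quoted above
SPACERS = ["ACGTACGTTGCA", "TTGACCATGGCA"]
LOCUS = "ATATTTAATA" + DR + SPACERS[0] + DR + SPACERS[1] + DR + "CCGG"

print(extract_spacers(LOCUS, DR))
```

Real repeat copies often carry a few mismatches, so a practical tool would need approximate matching rather than an exact string split.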
SSRPrimer and SSR Taxonomy Tree: Biome SSR discovery
Jewell, Erica; Robinson, Andrew; Savage, David; Erwin, Tim; Love, Christopher G.; Lim, Geraldine A. C.; Li, Xi; Batley, Jacqueline; Spangenberg, German C.; Edwards, David
2006-01-01
Simple sequence repeat (SSR) molecular genetic markers have become important tools for a broad range of applications such as genome mapping and genetic diversity studies. SSRs are readily identified within DNA sequence data and PCR primers can be designed for their amplification. These PCR primers frequently cross amplify within related species. We report a web-based tool, SSR Primer, that integrates SPUTNIK, an SSR repeat finder, with Primer3, a primer design program, within one pipeline. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. Results are then parsed to Primer3 for locus specific primer design. We have applied this tool for the discovery of SSRs within the complete GenBank database, and have designed PCR amplification primers for over 13 million SSRs. The SSR Taxonomy Tree server provides web-based searching and browsing of species and taxa for the visualisation and download of these SSR amplification primers. These tools are available at http://bioinformatics.pbcbasc.latrobe.edu.au/ssrdiscovery.html. PMID:16845092
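The SSR screening step can be sketched with a regular expression. This is only an illustration of the idea, not SPUTNIK's algorithm: the function name, the 2-6 bp motif range, and the minimum of four tandem copies are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    # A lazily matched motif followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical toy sequence with an (AC)6 and a (GAT)4 repeat.
TOY = "TTGGT" + "AC" * 6 + "TGCC" + "GAT" * 4 + "CCA"
print(find_ssrs(TOY))
```

In a pipeline like the one described, the coordinates of each hit would then be handed to a primer design program so that primers flank the repeat.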
Beyhan, Yunus E; Karakus, Mehmet; Karagoz, Alper; Mungan, Mesut; Ozkan, Aysegul T; Hokelek, Murat
2017-09-01
Objectives: To characterize cutaneous leishmaniasis (CL) isolates from Syrian and Central Anatolian patients at the species level. Methods: Skin scrapings from 3 patients (2 Syrian, 1 Turkish) were examined by direct microscopy, culture in Novy-MacNeal-Nicolle (NNN) medium, and internal transcribed spacer polymerase chain reaction (PCR) followed by sequence analysis. Results: All 3 samples were positive by microscopic examination, culture, and PCR. Sequencing identified all isolates as Leishmania tropica. The same genotype was detected in all 3 isolates, and the nucleotide sequence was submitted to GenBank under accession number KP689599. Conclusion: This finding may provide information about the transmission of CL between Turkey and Syria. Because of the Syrian civil war, many Syrian citizens are circulating in Turkey and different parts of Europe, which can increase the risk of spreading the disease. Preventive measures must therefore be taken urgently.
Amareshwari, P; Bhatia, Mayuri; Venkatesh, K; Roja Rani, A; Ravi, G V; Bhakt, Priyanka; Bandaru, Srinivas; Yadav, Mukesh; Nayarisseri, Anuraj; Nair, Achuthsankar S
2015-03-01
Indiscriminate application of pesticides such as chlorpyrifos, diazinon, or malathion contaminates the soil and, being unsafe, has often raised severe health concerns. Conversely, microorganisms such as Trichoderma and Aspergillus, and bacteria such as Rhizobium, Bacillus, Azotobacter, and Flavobacterium, have evolved the ability to degrade these pesticides to non-toxic products. The current study identifies a novel species of Flavobacterium capable of degrading organophosphorus pesticides. The bacterium was isolated from agricultural soil collected in Guntur District, Andhra Pradesh, India. The samples were serially diluted and the aliquots were incubated for a suitable time, after which the suspected colony was subjected to 16S rDNA sequencing. The sequence thus obtained was aligned pairwise against Flavobacterium species, which resulted in the identification of a novel species of Flavobacterium, later named EMBS0145, the sequence of which was deposited in GenBank under accession number JN794045.
CHOgenome.org 2.0: Genome resources and website updates.
Kremkow, Benjamin G; Baik, Jong Youn; MacDonald, Madolyn L; Lee, Kelvin H
2015-07-01
Chinese hamster ovary (CHO) cells are a major host cell line for the production of therapeutic proteins, and CHO cell and Chinese hamster (CH) genomes have recently been sequenced using next-generation sequencing methods. CHOgenome.org was launched in 2011 (version 1.0) to serve as a database repository and to provide bioinformatics tools for the CHO community. CHOgenome.org (version 1.0) maintained GenBank CHO-K1 genome data, identified CHO-omics literature, and provided a CHO-specific BLAST service. Recent major updates to CHOgenome.org (version 2.0) include new sequence and annotation databases for both CHO and CH genomes, a more user-friendly website, and new research tools, including a proteome browser and a genome viewer. CHO cell-line specific sequences and annotations facilitate cell line development opportunities, several of which are discussed. Moving forward, CHOgenome.org will host the increasing amount of CHO-omics data and continue to make useful bioinformatics tools available to the CHO community. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Cytochrome c oxidase subunit I barcoding of the green bee-eater (Merops orientalis).
Arif, I A; Khan, H A; Shobrak, M; Williams, J
2011-10-21
DNA barcoding using mitochondrial cytochrome c oxidase subunit I (COI) is regarded as a standard method for species identification. Recent reports have also shown extended applications of COI gene analysis in phylogeny and molecular diversity studies. The bee-eaters are a group of near passerine birds in the family Meropidae. There are 26 species worldwide; five of them are found in Saudi Arabia. Until now, GenBank included a COI barcode for only one species of bee-eater, the European bee-eater (Merops apiaster). We sequenced the 694-bp segment of the COI gene of the green bee-eater M. orientalis and compared the sequences with those of M. apiaster. Pairwise sequence comparison showed 66 variable sites across all the eight sequences from both species, with an interspecific genetic distance of 0.0362. Two and one within-species variable sites were found, with genetic distances of 0.0005 and 0.0003 for M. apiaster and M. orientalis, respectively. This is the first study reporting barcodes for M. orientalis.
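The genetic distances quoted above are proportions of differing sites between aligned sequences. As a minimal sketch, the uncorrected p-distance and its mean between two groups can be computed as below; the abstract does not state which distance model was used, so the choice of p-distance, the function names, and the toy sequences are all assumptions.

```python
from itertools import product


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def mean_between_groups(group_a, group_b):
    """Mean p-distance over all pairs drawn from two groups of sequences."""
    pairs = list(product(group_a, group_b))
    return sum(p_distance(x, y) for x, y in pairs) / len(pairs)


# Hypothetical 10-bp toy alignments standing in for two species.
SPECIES_A = ["ACGTACGTAC", "ACGTACGTAC"]
SPECIES_B = ["ACGTTCGTAA"]
print(mean_between_groups(SPECIES_A, SPECIES_B))
```

On real barcodes the same calculation would run over the full aligned segment for every pair of sequences from the two species.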
Liew, Pauline Woanying; Jong, Bor Chyan
2008-05-01
Two culture-independent methods, namely ribosomal DNA libraries and denaturing gradient gel electrophoresis (DGGE), were adopted to examine the microbial community of a Malaysian light crude oil. In this study, both 16S and 18S rDNAs were PCR-amplified from bulk DNA of crude oil samples, cloned, and sequenced. Analyses of restriction fragment length polymorphism (RFLP) and phylogenetics clustered the 16S and 18S rDNA sequences into seven and six groups, respectively. The ribosomal DNA sequences obtained showed 90 to 100% sequence similarity to those available in the GenBank database. The closest relatives documented for the 16S rDNAs include member species of Thermoincola and Rhodopseudomonas, whereas the closest fungal relatives include Acremonium, Ceriporiopsis, Xeromyces, Lecythophora, and Candida. Others were affiliated with uncultured bacteria and an uncultured ascomycete. The 16S rDNA library was dominated by a single uncultured bacterial type at >80% relative abundance. This predominance was confirmed by DGGE analysis.
Bào, Yīmíng; Amarasinghe, Gaya K; Basler, Christopher F; Bavari, Sina; Bukreyev, Alexander; Chandran, Kartik; Dolnik, Olga; Dye, John M; Ebihara, Hideki; Formenty, Pierre; Hewson, Roger; Kobinger, Gary P; Leroy, Eric M; Mühlberger, Elke; Netesov, Sergey V; Patterson, Jean L; Paweska, Janusz T; Smither, Sophie J; Takada, Ayato; Towner, Jonathan S; Volchkov, Viktor E; Wahl-Jensen, Victoria; Kuhn, Jens H
2017-05-11
The mononegaviral family Filoviridae has eight members assigned to three genera and seven species. Until now, genus and species demarcation were based on arbitrarily chosen filovirus genome sequence divergence values (≈50% for genera, ≈30% for species) and arbitrarily chosen phenotypic virus or virion characteristics. Here we report filovirus genome sequence-based taxon demarcation criteria using the publicly accessible PAirwise Sequence Comparison (PASC) tool of the US National Center for Biotechnology Information (Bethesda, MD, USA). Comparison of all available filovirus genomes in GenBank using PASC revealed optimal demarcation at the 55-58% sequence diversity threshold range for genera and at the 23-36% sequence diversity threshold range for species. Because these thresholds do not change the current official filovirus classification, these values are now implemented as filovirus taxon demarcation criteria that may solely be used for filovirus classification in case additional data are absent. A near-complete, coding-complete, or complete filovirus genome sequence will now be required to allow official classification of any novel "filovirus." Classification of filoviruses into existing taxa or determining the need for novel taxa is now straightforward and could even become automated using a presented algorithm/flowchart rooted in RefSeq (type) sequences.
Lionfish, Pterois volitans Linnaeus 1758, the complete mitochondrial DNA of an invasive species.
Del Río-Portilla, Miguel A; Vargas-Peralta, Carmen E; Machkour-M'Rabet, Salima; Hénaut, Yann; García-De-León, Francisco J
2016-01-01
The lionfish, Pterois volitans, native to the Indo-Pacific, has been found in Atlantic and Caribbean waters and is considered an invasive species. Here we sequence its mitogenome (GenBank accession number KJ739816), which has a total length of 16,500 bp; its arrangement consists of 13 protein-coding genes, 2 ribosomal RNA (rRNA) genes and 22 transfer RNA genes, similar to other members of the subfamily Pteroinae (family Scorpaenidae). This mitogenome will be useful for phylogenetic and population genetic studies of this invasive species.
Type material in the NCBI Taxonomy Database
Federhen, Scott
2015-01-01
Type material is the taxonomic device that ties formal names to the physical specimens that serve as exemplars for the species. For the prokaryotes these are strains submitted to the culture collections; for the eukaryotes they are specimens submitted to museums or herbaria. The NCBI Taxonomy Database (http://www.ncbi.nlm.nih.gov/taxonomy) now includes annotation of type material that we use to flag sequences from type in GenBank and in Genomes. This has important implications for many NCBI resources, some of which are outlined below. PMID:25398905
Chiou, Jiachi; Li, Ruichao
2015-01-01
Vibrio parahaemolyticus is commonly resistant to ampicillin, yet the mechanisms underlying this phenomenon are not clear. In this study, a novel class A carbenicillin-hydrolyzing β-lactamase (CARB) family of β-lactamases, blaCARB-17, was identified and found to be responsible for the intrinsic penicillin resistance in V. parahaemolyticus. Importantly, blaCARB-17-like genes were present in all 293 V. parahaemolyticus genome sequences available in GenBank and detectable in all 91 V. parahaemolyticus food isolates, further confirming the intrinsic nature of this gene. PMID:25801555
Prophiro, Josiane Somariva; Pereira, Thiago Nunes; Oliveira, Joice Guilherme de; Dandolini, Guilherme Werner; Silva, Mario Antonio Navarro da; Silva, Onilda Santos da
2017-01-01
This study registers Ascogregarina spp. infection in field populations of Aedes aegypti and Aedes albopictus in a subtropical region of Brazil. Mosquito larvae collected in tires placed in four municipalities of Santa Catarina were identified morphologically and assessed for Ascogregarina sp. infection using morphological and molecular methods. Both mosquito species harbored Ascogregarina taiwanensis, whose genomic DNA was confirmed in both Aedes species by PCR. DNA sequences were deposited in GenBank. Conclusion: Both Ae. albopictus and Ae. aegypti harbor Ascogregarina sp.
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae).
Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren
2016-04-01
Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not yet been characterized. The present study determined the complete mt genome sequence of E. hortense and assessed the phylogenetic relationships with other digenean species for which complete mt genome sequences are available in GenBank, using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using the maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum clustered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequence of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans. PMID:27180575
Niu, Fang-Fang; Zhu, Liang; Wang, Su; Wei, Shu-Jun
2016-07-01
Here, we report the mitochondrial genome sequence of the multicolored Asian lady beetle Harmonia axyridis (Pallas, 1773) (Coleoptera: Coccinellidae) (GenBank accession No. KR108208). This is the first species from the genus Harmonia with a sequenced mitochondrial genome. The current length of this mitochondrial genome, with a partial A + T-rich region, is 16,387 bp. All the typical genes were sequenced except trnI and trnQ. As in most other sequenced mitochondrial genomes of Coleoptera, there is no rearrangement in the sequenced region compared with the putative ancestral arrangement of insects. All protein-coding genes start with ATN codons. Five, five and three protein-coding genes stop with the termination codons TAA, TA and T, respectively. Phylogenetic analysis using the Bayesian method based on the first and second codon positions of the protein-coding genes supported Scirtidae as a basal lineage of Polyphaga. Harmonia and Coccinella form a sister lineage. The monophyly of Staphyliniformia, Scarabaeiformia and Cucujiformia was supported. Buprestidae was found to be a sister group to Bostrichiformia.
MetaBar - a tool for consistent contextual data acquisition and standards compliant submission.
Hankeln, Wolfgang; Buttigieg, Pier Luigi; Fink, Dennis; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver
2010-06-30
Environmental sequence datasets are increasing at an exponential rate; however, the vast majority of them lack appropriate descriptors like sampling location, time and depth/altitude: generally referred to as metadata or contextual data. The consistent capture and structured submission of these data is crucial for integrated data analysis and ecosystems modeling. The application MetaBar has been developed to support consistent contextual data acquisition. MetaBar is a spreadsheet and web-based software tool designed to assist users in the consistent acquisition, electronic storage, and submission of contextual data associated with their samples. A preconfigured Microsoft Excel spreadsheet is used to initiate structured contextual data storage in the field or laboratory. Each sample is given a unique identifier and at any stage the sheets can be uploaded to the MetaBar database server. To label samples, identifiers can be printed as barcodes. An intuitive web interface provides quick access to the contextual data in the MetaBar database as well as user and project management capabilities. Export functions facilitate contextual and sequence data submission to the International Nucleotide Sequence Database Collaboration (INSDC), comprising the DNA DataBase of Japan (DDBJ), the European Molecular Biology Laboratory database (EMBL) and GenBank. MetaBar requests and stores contextual data in compliance with the Genomic Standards Consortium specifications. The MetaBar open source code base for local installation is available under the GNU General Public License version 3 (GNU GPL3). The MetaBar software supports the typical workflow from data acquisition and field-sampling to contextual data enriched sequence submission to an INSDC database. The integration with the megx.net marine Ecological Genomics database and portal facilitates georeferenced data integration and metadata-based comparisons of sampling sites as well as interactive data visualization. The ample export functionalities and the INSDC submission support enable exchange of data across disciplines and safeguarding contextual data. PMID:20591175
Gardner, Shea N.; Hall, Barry G.
2013-01-01
Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from GenBank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four "raw read" genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths. PMID:24349125
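The alignment-free idea behind k-mer based SNP discovery can be sketched as follows: k-mers of odd length that share both flanks but differ at the central base across genomes mark a candidate SNP. This is a simplified illustration and not the kSNP implementation; the function name and toy genomes are hypothetical, and reverse complements and conflicting alleles within one genome are ignored.

```python
from collections import defaultdict


def kmer_snps(genomes, k=5):
    """Map each flank pattern to {genome: central base} where genomes disagree."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that each k-mer has a central base")
    half = k // 2
    loci = defaultdict(dict)  # flank pattern -> {genome name: central base}
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = kmer[:half] + "." + kmer[half + 1:]
            # Simplification: a later occurrence of the same flanks in one
            # genome overwrites an earlier one.
            loci[key][name] = kmer[half]
    return {
        key: dict(alleles)
        for key, alleles in loci.items()
        if len(set(alleles.values())) > 1
    }


# Hypothetical toy genomes differing at a single position (A vs T).
GENOMES = {"g1": "TTGACCATGGCA", "g2": "TTGACCTTGGCA"}
print(kmer_snps(GENOMES, k=5))
```

Because loci are keyed by their flanking sequence, no reference genome or multiple alignment is needed, which is the property the abstract emphasizes.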
Mouse mammary tumor virus-like gene sequences are present in lung patient specimens
2011-01-01
Background Previous studies have reported the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results The MMTV-like env gene sequences were detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, which were designated INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and HZ-14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion In summary, we identified MMTV-like gene sequences in 2 out of 11 (18%) of the lung carcinomas and 1 out of 7 (14%) of the acute inflammatory lung infiltrate specimens studied from a Mexican population. PMID:21943279
Seligmann, Hervé
2013-03-01
Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during transcription, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, and self-hybridization increases with length, suggesting endoribonuclease-limited elongation. BLAST detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential.
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
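The symmetric exchange rules listed above (for example C↔U, or A↔U+C↔G for two pairs at once) amount to a character substitution applied along the whole RNA. A minimal sketch, with a hypothetical function name and toy sequence, could look like this:

```python
def exchange_nucleotides(rna, pairs):
    """Swap each listed nucleotide pair everywhere in the sequence."""
    table = {}
    for x, y in pairs:
        # Symmetric exchange: x becomes y and y becomes x.
        table[ord(x)] = y
        table[ord(y)] = x
    return rna.translate(table)


TOY_RNA = "AUGCCU"  # hypothetical example sequence
print(exchange_nucleotides(TOY_RNA, [("C", "U")]))               # rule C<->U
print(exchange_nucleotides(TOY_RNA, [("A", "U"), ("C", "G")]))   # rule A<->U + C<->G
```

Because each rule is symmetric, applying the same exchange twice returns the original sequence. An exchanged sequence produced this way is the kind of query one would then compare against database entries.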
Gopi, M; Ajith Kumar, T T; Balagurunathan, R; Vinoth, R; Dhaneesh, K V; Rajasekaran, R; Balasubramanian, T
2012-02-01
The marine ecosystem of the Lakshadweep archipelago is unique and known to have a very high degree of biodiversity, with a number of endemic flora and fauna. The present study aimed to isolate endosymbiotic microorganisms from sponges and to assess their effectiveness against marine ornamental fish pathogens. The sponges were collected from Agatti island of the Lakshadweep archipelago and identified as Clathria procera, Sigmadocia fibulata and Dysidea granulosa. From these, 15 different types of bacteria were isolated and screened against marine ornamental fish pathogens (A. hydrophila, Vibrio alginolyticus, V. harveyi, V. parahaemolyticus and Pseudomonas fluorescens). Strain S25 was identified as a potential candidate based on its antimicrobial activity against the fish pathogens. Molecular identification of the potential strain (S25) by its 16S rRNA gene showed 99% identity with Acinetobacter sp. The sequenced 16S rRNA gene, 1,081 bp in length, was submitted to NCBI GenBank (accession number HM004071). The strain exhibited high similarity (99%) with the 16S rRNA gene of Acinetobacter calcoaceticus in the GenBank database. Crude extract obtained with acetone and ethyl acetate from extracellular products of S25 showed significant antimicrobial activity in a disc diffusion assay using 1,500 μg/ml of crude extract. The extracellular metabolite of A. calcoaceticus was extracted by the shake flask method and the crude extract was partially purified by thin layer chromatography. The partially purified crude extract showed significant zones of inhibition against A. hydrophila, V. alginolyticus and V. parahaemolyticus, and less activity against V. harveyi and P. fluorescens. This is the first report of A. calcoaceticus isolated from sponges of the Lakshadweep archipelago, and studies are underway to characterize and purify the antimicrobial compounds of this bacterium.
USDA-ARS's Scientific Manuscript database
The plasma membrane intrinsic proteins (PIP) are one of the five aquaporin protein subfamilies. Aquaporin proteins are known to facilitate water transport through biological membranes. In order to identify NIP aquaporin gene candidates in cotton (Gossypium hirsutum L.), in silico and molecular clon...
De Oliveira, T; Miller, R; Tarin, M; Cassol, S
2003-01-01
Sequence databases encode a wealth of information needed to develop improved vaccination and treatment strategies for the control of HIV and other important pathogens. To facilitate effective utilization of these datasets, we developed a user-friendly GDE-based LINUX interface that reduces input/output file formatting. GDE was adapted to the Linux operating system, bioinformatics tools were integrated with microbe-specific databases, and up-to-date GDE menus were developed for several clinically important viral, bacterial and parasitic genomes. Each microbial interface was designed for local access and contains Genbank, BLAST-formatted and phylogenetic databases. GDE-Linux is available for research purposes by direct application to the corresponding author. Application-specific menus and support files can be downloaded from (http://www.bioafrica.net).
Walter, J.; Tannock, G. W.; Tilsala-Timisjarvi, A.; Rodtong, S.; Loach, D. M.; Munro, K.; Alatossava, T.
2000-01-01
Denaturing gradient gel electrophoresis (DGGE) of DNA fragments obtained by PCR amplification of the V2-V3 region of the 16S rRNA gene was used to detect the presence of Lactobacillus species in the stomach contents of mice. Lactobacillus isolates cultured from human and porcine gastrointestinal samples were identified to the species level by using a combination of DGGE and species-specific PCR primers that targeted 16S-23S rRNA intergenic spacer region or 16S rRNA gene sequences. The identifications obtained by this approach were confirmed by sequencing the V2-V3 region of the 16S rRNA gene and by a BLAST search of the GenBank database. PMID:10618239
Prosdocimi, Francisco; Souto, Helena Magarinos; Ruschi, Piero Angeli; Furtado, Carolina; Jennings, W Bryan
2016-09-01
The genome of the versicoloured emerald hummingbird (Amazilia versicolor) was partially sequenced in one-sixth of an Illumina HiSeq lane. The mitochondrial genome was assembled using MIRA and MITObim software, yielding a circular molecule of 16,861 bp in length, which was deposited in GenBank under the accession number KF624601. The mitogenome contained 13 protein-coding genes, 22 transfer RNAs, 2 ribosomal RNAs and 1 non-coding control region. The molecule was assembled using 21,927 sequencing reads of 100 bp each, resulting in ∼130× coverage of uniformly distributed reads along the genome. This is the fourth mitochondrial genome described for this highly diverse family of birds and may benefit further phylogenetic, phylogeographic, population genetic and species delimitation studies of hummingbirds.
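The ∼130× figure quoted above follows from the read count, read length and genome length given in the abstract. A minimal sketch of that arithmetic is shown below; the function name `mean_coverage` is illustrative and not taken from the paper, and the calculation assumes every read maps in full to the mitogenome.

```python
def mean_coverage(n_reads: int, read_length_bp: int, genome_length_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return n_reads * read_length_bp / genome_length_bp


# Values reported in the abstract: 21,927 reads of 100 bp, 16,861 bp mitogenome.
coverage = mean_coverage(21_927, 100, 16_861)
print(f"{coverage:.1f}x")
```

The result rounds to roughly 130×, in line with the reported coverage.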
Di Luca, Marco; Boccolini, Daniela; Marinuccil, Marino; Romi, Roberto
2004-07-01
We evaluated the internal transcribed spacer two (ITS2) sequence to detect intraspecific polymorphism in the Palearctic Anopheles maculipennis complex, analyzing 52 populations from 12 countries and representing six species. For An. messeae, two fragments of the cytochrome oxidase I (COI) gene were also evaluated. The results were compared with GenBank sequences and data from the literature. ITS2 analysis revealed evident intraspecific polymorphism for An. messeae and a slightly less evident polymorphism for An. melanoon, whereas for each of the other species, 100% identity was found among populations. ITS2 analysis of An. messeae identified five haplotypes that were consistent with the geographical origin of the populations. ITS2 seems to be a reliable marker of intraspecific polymorphism for this complex, whereas the COI gene is apparently uninformative.
Heras, Sandra; Maltagliati, Ferruccio; Fernández, Maria Victoria; Roldán, María Inés
2016-07-01
With this work we addressed some molecular systematic issues within the Mugil cephalus species complex. Particular attention was paid to the debated situations of: (i) Mugil liza, occurring in partial sympatry with Mugil cephalus in the northwestern Atlantic, and (ii) Mugil platanus, considered by some authors a synonym of the former species and distributed in the southwestern Atlantic. We sequenced a 465-bp portion of the mitochondrial control region (CR) in 79 individuals from 8 western Atlantic and 2 Mediterranean localities. In addition, all CR sequences available from GenBank for the studied taxa were added to our dataset, for a total of 323 individuals. Overall, 229 haplotypes corresponding to 8 divergent monophyletic lineages were detected. Results of phylogenetic analyses were consistent with the occurrence of past speciation events producing the observed lineages. Of these lineages, 7 correspond to cryptic species and one is constituted by M. liza and M. platanus. As a matter of fact, these 2 taxa constitute a single lineage within the M. cephalus species complex. However, individuals of the M. liza/M. platanus lineage analyzed by means of the 18 mitochondrial markers available in GenBank exhibited a degree of genetic diversity consistent with highly divergent populations. Of the 8 lineages detected, the Mediterranean one (type locality) corresponds to M. cephalus; the lineage M. liza/M. platanus should be named M. liza, under the priority principle, and the remaining 6 lineages need formal description. © 2015 International Society of Zoological Sciences, Institute of Zoology/Chinese Academy of Sciences and John Wiley & Sons Australia, Ltd.
Gasparinho, Carolina; Ferreira, Filipa S; Mayer, António Carlos; Mirante, Maria Clara; Vaz Nery, Susana; Santos-Reis, Ana; Portugal-Calisto, Daniela; Brito, Miguel
2017-01-01
Background: Giardia lamblia is a pathogenic intestinal protozoan with high prevalence in developing countries, especially among children. Molecular characterization has revealed the existence of eight assemblages, with A and B being more commonly described in human infections. Despite its importance, to our knowledge this is the first published molecular analysis of G. lamblia assemblages in Angola. Methods: The present study aimed to identify the assemblages of G. lamblia in children with acute diarrhoea presenting at the Bengo General Hospital, Angola. A stool sample was collected and microscopy and immunochromatographic tests were used. DNA was extracted and assemblage determination was performed through amplification of gene fragments of ssu-rRNA (175 bp) and β-giardin (511 bp) by polymerase chain reaction and DNA sequencing. Results: Of the 16 stool samples screened, 12 were successfully sequenced. Eleven isolates were assigned to assemblage B and one to assemblage A. Subassemblage determination was not possible for assemblage B, while the single isolate assigned to assemblage A was identified as belonging to subassemblage A3. Conclusion: This study provides information about G. lamblia assemblages in Bengo Province, Angola and may contribute as a first step in understanding the molecular epidemiology of this protozoan in the country. GenBank accession numbers for the ssu-rRNA gene: MF479750, MF479751, MF479752, MF479753, MF479754, MF479755, MF479756, MF479757, MF479758, MF479759, MF479760, MF479761. GenBank accession numbers for the β-giardin gene: MF565378, MF565379, MF565380, MF565381. PMID:29438541
Analysis for complete genomic sequence of HLA-B and HLA-C alleles in the Chinese Han population.
Zhu, F; He, Y; Zhang, W; He, J; He, J; Xu, X; Lv, H; Yan, L
2011-08-01
In the present study, we determined the complete genomic sequence and analysed the intron polymorphism of some HLA-B and HLA-C alleles in the Chinese Han population. DNA fragments of over 3.0 kb from the HLA-B and HLA-C loci were amplified by polymerase chain reaction, spanning from part of the 5' untranslated region to the 3' noncoding region, and the amplified products were then sequenced. Full-length nucleotide sequences of 14 HLA-B alleles and 10 HLA-C alleles were obtained and have been submitted to GenBank and the IMGT/HLA database. Two novel alleles, HLA-B*52:01:01:02 and HLA-B*59:01:01:02, were identified, and the complete genomic sequence of HLA-B*52:01:01:01 was reported for the first time. In total, 157 and 167 polymorphic positions were found in the full-length genomic sequences of the HLA-B and HLA-C loci, respectively. Our results suggest that many single nucleotide polymorphisms exist in the exon and intron regions, and the data can provide useful information for understanding the evolution of HLA-B and HLA-C alleles. © 2011 Blackwell Publishing Ltd.
Comprehensive evolutionary and phylogenetic analysis of Hepacivirus N (HNV).
da Silva, M S; Junqueira, D M; Baumbach, L F; Cibulski, S P; Mósena, A C S; Weber, M N; Silveira, S; de Moraes, G M; Maia, R D; Coimbra, V C S; Canal, C W
2018-05-24
Hepaciviruses (HVs) have been detected in several domestic and wild animals and present high genetic diversity. The current classification divides the genus Hepacivirus into 14 species (A-N), according to their phylogenetic relationships, including the bovine hepacivirus [Hepacivirus N (HNV)]. In this study, we confirmed HNV circulation in Brazil and sequenced the whole genome of two strains. Based on the current classification of HCV, which is divided into genotypes and subtypes, we analysed all available bovine hepacivirus sequences in the GenBank database and proposed an HNV classification. All of the sequences were grouped into a single genotype, putatively named 'genotype 1'. This genotype can be clearly divided into four subtypes: A and D containing sequences from Germany and Brazil, respectively, and B and C containing Ghanaian sequences. In addition, the NS3-coding region was used to estimate the time to the most recent common ancestor (TMRCA) of each subtype, using a Bayesian approach and a relaxed molecular clock model. The analyses indicated a common origin of the virus circulating in Germany and Brazil. Ghanaian sequences seemed to have an older TMRCA, indicating a long period of circulation of these viruses on the African continent.
Database resources of the National Center for Biotechnology Information.
Wheeler, David L; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Ostell, James; Miller, Vadim; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Steven T; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene
2007-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link(BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene; Ye, Jian
2009-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
DNA barcoding commercially important aquatic invertebrates of Turkey.
Keskin, Emre; Atar, Hasan Hüseyin
2013-08-01
DNA barcoding was used to identify aquatic invertebrates sampled from fisheries bycatch and discards. A total of 440 unique cytochrome c oxidase subunit I (COI) barcodes were generated for 22 species from three important phyla (Arthropoda, Cnidaria, and Mollusca). All the species were sequenced and submitted to the GenBank and Barcode of Life Database (BOLD) databases using a 654-bp fragment of the mitochondrial COI gene. Two of them (Pontastacus leptodactylus and Rapana bezoar) were first records of the species for the BOLD database and six of them (Carcinus aestuarii, Loligo vulgaris, Melicertus kerathurus, Nephrops norvegicus, Scyllarides latus, and Scyllarus arctus) were first standard (>648 bp) COI barcode records for the GenBank database. COI barcodes were analyzed for nucleotide composition, nucleotide pair frequencies, and Kimura's two-parameter genetic distance. Mean genetic distance among species was found to increase at higher taxonomic levels. The neighbor-joining trees generated were congruent with morphometric-based taxonomic classification. The findings of this study clearly demonstrate that DNA barcodes can be used as an efficient molecular tool for identifying not only target species from fisheries but also bycatch and discard species, and so could provide leverage for better understanding, monitoring and management of fisheries and biodiversity.
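Kimura's two-parameter (K2P) distance mentioned above corrects the raw proportion of differing sites by treating transitions (P) and transversions (Q) separately: d = -½ ln[(1 - 2P - Q)·√(1 - 2Q)]. The sketch below is a minimal illustration of that formula for two pre-aligned sequences, not the software used in the study; the function name `k2p_distance` and the choice to skip gap or ambiguous sites are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch). math.log raises ValueError when the
    sequences are too divergent for the correction to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: raw p-distance 0.10, K2P slightly larger.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For barcode data such as the 654-bp COI fragments above, the same function would be applied to each pair of aligned sequences to build the distance matrix that a neighbor-joining tree is computed from.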
Drancourt, Michel; Roux, Véronique; Fournier, Pierre-Edouard; Raoult, Didier
2004-01-01
We developed a new molecular tool based on rpoB gene (encoding the beta subunit of RNA polymerase) sequencing to identify streptococci. We first sequenced the complete rpoB gene for Streptococcus anginosus, S. equinus, and Abiotrophia defectiva. Sequences were aligned with those of S. pyogenes, S. agalactiae, and S. pneumoniae available in GenBank. Using an in-house analysis program (SVARAP), we identified a 740-bp variable region surrounded by conserved, 20-bp zones and, by using these conserved zones as PCR primer targets, we amplified and sequenced this variable region in an additional 30 Streptococcus, Enterococcus, Gemella, Granulicatella, and Abiotrophia species. This region exhibited 71.2 to 99.3% interspecies homology. We therefore applied our identification system by PCR amplification and sequencing to a collection of 102 streptococci and 60 bacterial isolates belonging to other genera. Amplicons were obtained in streptococci and Bacillus cereus, and sequencing allowed us to make a correct identification of streptococci. Molecular signatures were determined for the discrimination of closely related species within the S. pneumoniae-S. oralis-S. mitis group and the S. agalactiae-S. difficile group. These signatures allowed us to design a S. pneumoniae-specific PCR and sequencing primer pair. PMID:14766807
Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags
de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Ro's, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.
2000-01-01
Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN (http://genes.mit.edu/GENSCAN.html). PMID:11070084
2013-01-01
Background: Wild aquatic birds constitute the natural reservoir for avian influenza viruses (AIVs). Separate Eurasian and American AIV gene pools exist. Here, the prevalence and diversity of AIVs in gulls and dabbling ducks in Norway were described. The influence of host species and temporal changes on AIV prevalence was examined. Five AIVs from Norway, including three from common gull (Larus canus), were analyzed along with 10 available AIV genomes from gulls in Eurasia to search for evidence of intracontinental and intercontinental reassortment of gene segments encoding the internal viral proteins. Methods: Swabs collected from 2417 dabbling ducks and gulls in the south-west of Norway during five ordinary hunting seasons (August-December) in the period 2005–2010 were analyzed for presence of AIV. Multivariate linear regression was used to identify associations between AIV prevalence, host species and sampling time. Five AIVs from mallard (Anas platyrhynchos) (H3N8, H9N2) and common gull (H6N8, H13N2, H16N3) were full-length characterized and phylogenetically analyzed together with GenBank reference sequences. Results: Low pathogenic AIVs were detected in 15.5% (CI: 14.1–17.0) of the samples. The overall AIV prevalence was lower in December compared to that found in August to November (p = 0.003). AIV was detected in 18.7% (CI: 16.8–20.6) of the dabbling ducks. A high AIV prevalence of 7.8% (CI: 5.9–10.0) was found in gulls. A similar temporal pattern in AIV prevalence was found in both bird groups. Thirteen hemagglutinin and eight neuraminidase subtypes were detected. No evidence of intercontinental reassortment was found. Eurasian avian (non H13 and H16) PB2 or PA genes were identified in five reference Eurasian gull (H13 and H16) AIV genomes from GenBank. The NA gene from the Norwegian H13N2 gull isolate was of Eurasian avian origin.
Conclusions The similar temporal pattern in AIV prevalence found in dabbling ducks and gulls, the relatively high virus prevalence detected in gulls and the evidence of intracontinental reassortment in AIVs from gulls indicate that gulls that interact with dabbling ducks are likely to be mixing vessels for AIVs from waterfowl and gulls. Our results support that intercontinental reassortment is rare in AIVs from gulls in Eurasia. PMID:23575317
GenColors: annotation and comparative genomics of prokaryotes made easy.
Romualdi, Alessandro; Felder, Marius; Rose, Dominic; Gausmann, Ulrike; Schilhabel, Markus; Glöckner, Gernot; Platzer, Matthias; Sühnel, Jürgen
2007-01-01
GenColors (gencolors.fli-leibniz.de) is a new web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. A variety of export/import filters manages an effective data flow from sequence assembly and manipulation programs (e.g., GAP4) to GenColors and back as well as to standard GenBank file(s). The genome comparison tools include best bidirectional hits, gene conservation, syntenies, and gene core sets. Precomputed UniProt matches allow annotation and analysis in an effective manner. In addition to these analysis options, base-specific quality data (coverage and confidence) can also be handled if available. The GenColors system can be used both for annotation purposes in ongoing genome projects and as an analysis tool for finished genomes. GenColors comes in two types, as dedicated genome browsers and as the Jena Prokaryotic Genome Viewer (JPGV). Dedicated genome browsers contain genomic information on a set of related genomes and offer a large number of options for genome comparison. The system has been efficiently used in the genomic sequencing of Borrelia garinii and is currently applied to various ongoing genome projects on Borrelia, Legionella, Escherichia, and Pseudomonas genomes. One of these dedicated browsers, the Spirochetes Genome Browser (sgb.fli-leibniz.de) with Borrelia, Leptospira, and Treponema genomes, is freely accessible. The others will be released after finalization of the corresponding genome projects. JPGV (jpgv.fli-leibniz.de) offers information on almost all finished bacterial genomes, as compared to the dedicated browsers with reduced genome comparison functionality, however. 
As of January 2006, this viewer includes 632 genomic elements (e.g., chromosomes and plasmids) of 293 species. The system provides versatile quick and advanced search options for all currently known prokaryotic genomes and generates circular and linear genome plots. Gene information sheets contain basic gene information, database search options, and links to external databases. GenColors is also available on request for local installation.
Zhang, Yu-liang; Kulye, Mahesh; Yang, Feng-shan; Xiao, Luo; Zhang, Yi-tong; Zeng, Hongmei; Wang, Jian-hua; Liu, Zhi-xin
2011-01-01
An allele of the cytochrome P450 gene, CYP6AE14, named CYP6AE25 (GenBank accession no. EU807990) was isolated from the Asian corn borer, Ostrinia furnacalis (Guenée) (Lepidoptera: Pyralidae) by RT-PCR. The cDNA sequence of CYP6AE25 is 2315 bp in length and contains a 1569-nucleotide open reading frame encoding a putative protein with 523 amino acid residues, a predicted molecular weight of 59.95 kDa and a theoretical pI of 8.31. The putative protein contains the classic heme-binding sequence motif F××G×××C×G (residues 451–460) conserved among all P450 enzymes as well as other characteristic motifs of all cytochrome P450s. It shares 52% identity with the previously published sequence of CYP6AE14 (GenBank accession no. DQ986461) from Helicoverpa armigera. Phylogenetic analysis of amino acid sequences from members of various P450 families indicated that CYP6AE25 has a closer phylogenetic relationship with CYP6AE14 and CYP6B1, which are related to metabolism of plant allelochemicals, and with CYP6D1, which is related to pyrethroid resistance, and a more distant relationship to CYP302A1 and CYP307A1, which are related to synthesis of the insect molting hormones. Analysis of the expression level of the gene in adults and immature stages of O. furnacalis by quantitative real-time PCR revealed that CYP6AE25 was expressed in all life stages investigated. The mRNA expression level in 3rd instar larvae was 12.8- and 2.97-fold higher than those in pupae and adults, respectively. The tissue-specific expression level of CYP6AE25 was, from high to low, in the order of midgut, Malpighian tubule and fat body, but expression was absent in ovary and brain. The analysis of the CYP6AE25 gene using bioinformatic software is discussed. PMID:21529257
2013-01-01
Background Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. Results We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user’s query, advanced data searching based on the specified user’s query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. 
Conclusions search GenBank extends standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in the search GenBank is a unique feature and has a great potential. The potential will further grow in the future with the increasing density of networks of relationships between data stored in particular databases. search GenBank is available for public use at http://sgb.biotools.pl/. PMID:23452691
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Siążnik, Artur
2013-03-01
Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user's query, advanced data searching based on the specified user's query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. 
search GenBank extends standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in the search GenBank is a unique feature and has a great potential. The potential will further grow in the future with the increasing density of networks of relationships between data stored in particular databases. search GenBank is available for public use at http://sgb.biotools.pl/.
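The abstract above describes chaining eUtils calls (a search followed by retrieval of the matching records). As a rough illustration of what such a chain looks like at the URL level, the sketch below builds request URLs for NCBI's public ESearch and EFetch endpoints. It is not the search GenBank implementation: the helper names `esearch_url` and `efetch_url` are hypothetical, the query term is only an example, and no network request is made.

```python
from urllib.parse import urlencode

# Base address of the public NCBI Entrez Programming Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db: str, ids: list[str], rettype: str = "gb") -> str:
    """Build an EFetch URL that returns the given records as GenBank flat files."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Step 1: search the nucleotide database (example query, chosen for illustration).
search = esearch_url("nuccore", "cytochrome b[Gene] AND Mammalia[Organism]")
# Step 2: fetch a record by accession (KF624601 is cited elsewhere in this listing).
fetch = efetch_url("nuccore", ["KF624601"])
print(search)
print(fetch)
```

In a real client the IDs parsed from the ESearch response would be fed into the EFetch call, which is the kind of sequence a saved macro would automate.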
Sharifdini, Meysam; Derakhshani, Sedigheh; Alizadeh, Safar Ali; Ghanbarzadeh, Laleh; Mirjalali, Hamed; Mobedi, Iraj; Saraei, Mehrzad
2017-12-01
Human infections with Trichostrongylus species have been reported in most parts of Iran. The aim of this study was the identification, molecular characterization and phylogenetic analysis of human Trichostrongylus species based on the ITS2 region of ribosomal DNA from Guilan Province, northern Iran. Stool samples were collected from rural inhabitants and examined by formalin-ether concentration and agar plate culture techniques. After anthelmintic treatment, male adult worms were collected from five infected cases. Genomic DNA was extracted from one male worm of each species in every treated individual and one filariform larva isolated from each case. PCR amplification of the ITS2-rDNA region was performed and the products were sequenced. Among 1508 individuals, 46 (3.05%) were found infected with Trichostrongylus species using parasitological methods. Male worms of T. colubriformis, T. vitrinus and T. longispicularis were expelled from five patients after treatment. Out of 41 filariform larvae, 40 were T. colubriformis, and the other one was T. axei. Phylogenetic analysis showed that each species was placed together with reference sequences submitted to the GenBank database. Intra-species similarity for all species obtained in the current study was 100%. T. colubriformis was found to be probably the most common species in this region of Iran. For the first time, the authors of the present study report the occurrence of natural human infection by T. longispicularis in the world. Therefore, the number of Trichostrongylus species infecting humans in Iran has now increased to ten. Copyright © 2017. Published by Elsevier B.V.
Concept for estimating mitochondrial DNA haplogroups using a maximum likelihood approach (EMMA)
Röck, Alexander W.; Dür, Arne; van Oven, Mannis; Parson, Walther
2013-01-01
The assignment of haplogroups to mitochondrial DNA haplotypes contributes substantial value for quality control, not only in forensic genetics but also in population and medical genetics. The availability of Phylotree, a widely accepted phylogenetic tree of human mitochondrial DNA lineages, led to the development of several (semi-)automated software solutions for haplogrouping. However, currently existing haplogrouping tools only make use of haplogroup-defining mutations, whereas private mutations (beyond the haplogroup level) can be additionally informative allowing for enhanced haplogroup assignment. This is especially relevant in the case of (partial) control region sequences, which are mainly used in forensics. The present study makes three major contributions toward a more reliable, semi-automated estimation of mitochondrial haplogroups. First, a quality-controlled database consisting of 14,990 full mtGenomes downloaded from GenBank was compiled. Together with Phylotree, these mtGenomes serve as a reference database for haplogroup estimates. Second, the concept of fluctuation rates, i.e. a maximum likelihood estimation of the stability of mutations based on 19,171 full control region haplotypes for which raw lane data is available, is presented. Finally, an algorithm for estimating the haplogroup of an mtDNA sequence based on the combined database of full mtGenomes and Phylotree, which also incorporates the empirically determined fluctuation rates, is brought forward. On the basis of examples from the literature and EMPOP, the algorithm is not only validated, but both the strength of this approach and its utility for quality control of mitochondrial haplotypes is also demonstrated. PMID:23948335
DNA barcoding commercially important fish species of Turkey.
Keskın, Emre; Atar, Hasan H
2013-09-01
DNA barcoding was used in the identification of 89 commercially important freshwater and marine fish species found in Turkish ichthyofauna. A total of 1765 DNA barcodes were generated for these species using a 654-bp-long fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene. These species belong to 70 genera, 40 families and 19 orders from class Actinopterygii, and all were associated with a distinct DNA barcode. Nine and 12 of the COI barcode clusters represent the first species records submitted to the BOLD and GenBank databases, respectively. All COI barcodes (except sequences of first species records) were matched with reference sequences of expected species, according to morphological identification. Average nucleotide frequencies of the data set were calculated as T = 29.7%, C = 28.2%, A = 23.6% and G = 18.6%. Average pairwise genetic distances among individuals were estimated as 0.32%, 9.62%, 17.90% and 22.40% for conspecific, congeneric, confamilial and within-order comparisons, respectively. Kimura 2-parameter genetic distance values were found to increase with taxonomic level. For most of the species analysed in our data set, there is a barcoding gap, and an overlap in the barcoding gap exists for only two genera. Neighbour-joining trees were drawn based on DNA barcodes and all the specimens clustered in agreement with their taxonomic classification at species level. Results of this study supported DNA barcoding as an efficient molecular tool for a better monitoring, conservation and management of fisheries. © 2013 John Wiley & Sons Ltd.
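The Kimura 2-parameter distance mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming two pre-aligned sequences of equal length; the function name `k2p_distance` and the toy sequences are hypothetical, and the study's own software and settings are not stated in the abstract. The formula used is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A -> G) across ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(d, 4))
```

Because the correction accounts for multiple hits at the same site, the result is slightly larger than the raw proportion of differing sites (0.1 in the toy example).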
Oligonucleotide Array for Identification and Detection of Pythium Species†
Tambong, J. T.; de Cock, A. W. A. M.; Tinker, N. A.; Lévesque, C. A.
2006-01-01
A DNA array containing 172 oligonucleotides complementary to specific diagnostic regions of internal transcribed spacers (ITS) of more than 100 species was developed for identification and detection of Pythium species. All of the species studied, with the exception of Pythium ostracodes, exhibited a positive hybridization reaction with at least one corresponding species-specific oligonucleotide. Hybridization patterns were distinct for each species. The array hybridization patterns included cluster-specific oligonucleotides that facilitated the recognition of species, including new ones, belonging to groups such as those producing filamentous or globose sporangia. BLAST analyses against 500 publicly available Pythium sequences in GenBank confirmed that species-specific oligonucleotides were unique to all of the available strains of each species, of which there were numerous economically important ones. GenBank entries of newly described species that are not putative synonyms showed no homology to sequences of the spotted species-specific oligonucleotides, but most new species did match some of the cluster-specific oligonucleotides. Further verification of the specificity of the DNA array was done with 50 additional Pythium isolates obtained by soil dilution plating. The hybridization patterns obtained were consistent with the identification of these isolates based on morphology and ITS sequence analyses. In another blind test, total DNA of the same soil samples was amplified and hybridized on the array, and the results were compared to those of 130 Pythium isolates obtained by soil dilution plating and root baiting. The 13 species detected by the DNA array corresponded to the isolates obtained by a combination of soil dilution plating and baiting, except for one new species that was not represented on the array. We conclude that the reported DNA array is a reliable tool for identification and detection of the majority of Pythium species in environmental samples. 
Simultaneous detection and identification of multiple species of soilborne pathogens such as Pythium species could be a major step forward for epidemiological and ecological studies. PMID:16597974
Biochemical characterization of a phospholipase A2 from Photobacterium damselae subsp. piscicida.
Hsu, Po-Yuan; Lee, Kuo-Kau; Lee, Pei-Shan; Hu, Chih-Chuang; Lin, Cheng-Hui; Liu, Ping-Chung
2013-01-01
Photobacterium damselae subsp. piscicida (Phdp) is the causative agent of fish photobacteriosis (pasteurellosis) in cultured cobia (Rachycentron canadum) in Taiwan. A component was purified from the extracellular products (ECP) of the bacterium strain 9205 by fast protein liquid chromatography (FPLC) and identified as a phospholipase. An N-terminal sequence of 10 amino acid residues, QDQPNLDPGK, was determined by mass spectroscopy (MS) and found to be identical with that of another Phdp phospholipase (GenBank accession no. BAB85814) at positions 21 to 30. The corresponding gene sequence of the phospholipase (GenBank accession no. AB071137) was employed to design primers for amplification of the sequence by the polymerase chain reaction (PCR). The PCR products were transformed into Escherichia coli, and a recombinant protein product was obtained which was purified as a His-tag fusion protein by Ni-metal affinity chromatography. A single 43-kDa band was determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). Phosphatidylcholine was degraded by this protein to lysophosphatidylcholine and a fatty acid. These products were characterized by thin-layer (TLC) and gas chromatography (GC), respectively, allowing the identification of the protein as a phospholipase A2. The recombinant protein had maximum enzymatic activity between pH 4 and 7, and at 40 degrees C. The activity was inhibited by Zn(2+) and Cu(2+), activated by Ca(2+) and Mg(2+), and completely inactivated by dexamethasone and p-bromophenacyl bromide. A rabbit antiserum against the recombinant protein neutralized the phospholipase A2 activity in the ECP of Phdp strain 9205 and the recombinant protein itself. The recombinant protein was toxic to cobia of about 5 g weight with an LD50 value between 2 and 4 microg protein/g fish. The results revealed phospholipase A2 as a fish toxin in the ECP of Phdp strain 9205.
Evolution of Fseg/Cseg dimorphism in region III of the Plasmodium falciparum eba-175 gene.
Yasukochi, Yoshiki; Naka, Izumi; Patarapotikul, Jintana; Hananantachai, Hathairad; Ohashi, Jun
2017-04-01
The 175-kDa erythrocyte binding antigen (EBA-175) of the malaria parasite Plasmodium falciparum is important for its invasion into human erythrocytes. The primary structure of eba-175 is divided into seven regions, namely I to VII. Region III contains highly divergent dimorphic segments, termed Fseg and Cseg. The allele frequencies of segmental dimorphism within a P. falciparum population have been extensively examined; however, the molecular evolution of segmental dimorphism is not well understood. A comprehensive comparison of nucleotide sequences among 32 P. falciparum eba-175 alleles identified in our previous study, two Plasmodium reichenowi, and one P. gaboni orthologous alleles obtained from the GenBank database was conducted to uncover the origin and evolutionary processes of segmental dimorphism in P. falciparum eba-175. In the eba-175 nucleotide sequence derived from a P. reichenowi CDC strain, both Fseg and Cseg were found in region III, which implies that the original eba-175 gene had both segments, and deletions of F- and C-segments generated Cseg and Fseg alleles, respectively. We also confirmed the presence of an allele with Fseg and Cseg in another P. reichenowi strain (SY57), by re-mapping short reads obtained from the GenBank database. On the other hand, the segmental sequence of the eba-175 ortholog in P. gaboni was quite diverged from those of the other species, suggesting that the original eba-175 dimorphism of P. falciparum can be traced back to the stem lineage of P. falciparum and P. reichenowi. Our findings suggest that Fseg and Cseg alleles are derived from a single eba-175 allele containing both segments in the ancestral population of P. falciparum and P. reichenowi, and that the allelic dimorphism of eba-175 was shaped by the independent emergence of similar dimorphic lineage in different species that has never been observed in any evolutionary mode of allelic dimorphism at other loci in malaria genomes. Copyright © 2017 Elsevier B.V. All rights reserved.
CycADS: an annotation database system to ease the development and update of BioCyc databases
Vellozo, Augusto F.; Véron, Amélie S.; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E.; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano
2011-01-01
In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway tool software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including for example: KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage includes also, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Linked to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. 
Database URL: http://www.cycadsys.org PMID:21474551
Analysis of the vp2 gene sequence of a new mutated mink enteritis parvovirus strain in PR China
2010-01-01
Background Mink enteritis virus (MEV) causes a highly contagious viral disease of mink with a worldwide distribution. MEV has a linear, single-stranded, negative-sense DNA with a genome length of approximately 5,000 bp. The VP2 protein is the major structural protein of the parvovirus encoded by the vp2 gene. VP2 is highly antigenic and plays important roles in determining viral host ranges and tissue tropisms. This study describes the bionomics and vp2 gene analysis of a mutated strain, MEV-DL, which was isolated recently in China and outlines its homologous relationships with other selected strains registered in GenBank. Results The MEV-DL strain can infect F81 cells with cytopathic effects. Pig erythrocytes were agglutinated by the MEV-DL strain. The generation of MEV-DL in F81 cells could infect mink within three months and cause a disease that was similar to that caused by wild-type MEV. A comparative analysis of the vp2 gene nucleotide (nt) sequence of MEV-DL showed that this was more than 99% homologous with other mink enteritis parvoviruses in GenBank. However, the nucleotide residues at positions 1,065 and 1,238 in the MEV-DL strain of the vp2 gene differed from those of all the other MEV strains described previously. It is noteworthy that the mutation at nucleotide position 1,238 led to an Asp/Gly replacement. This may lead to structural changes. A phylogenetic tree and sequence distance table were obtained, which showed that the MEV-DL and ZYL-1 strains had the closest inheritance distance. Conclusions A new variation of the vp2 gene exists in the MEV-DL strain, which may lead to structural changes of the VP2 protein. Phylogenetic analysis showed that MEV-DL may originate from the ZYL-1 strain in DaLian. PMID:20540765
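How a nucleotide position maps onto a codon can be worked through with a short sketch. It assumes that the positions quoted in the abstract (1,065 and 1,238) are 1-based and counted from the first base of the vp2 coding sequence, which the abstract does not state explicitly; the function name `codon_location` is hypothetical.

```python
def codon_location(nt_position):
    """Map a 1-based coding-sequence position to (codon number, base within codon)."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    index = nt_position - 1          # switch to 0-based for the arithmetic
    return index // 3 + 1, index % 3 + 1


# Under the stated assumption about numbering:
print(codon_location(1238))  # (413, 2)
print(codon_location(1065))  # (355, 3)
```

Under that numbering, position 1,238 falls on the second base of codon 413, which fits an Asp/Gly replacement because the Asp codons (GAT, GAC) and the corresponding Gly codons (GGT, GGC) differ only at their second base. Position 1,065 falls on a third codon base, where substitutions are frequently synonymous.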
Global genetic diversity of the Plasmodium vivax transmission-blocking vaccine candidate Pvs48/45.
Vallejo, Andres F; Martinez, Nora L; Tobon, Alejandra; Alger, Jackeline; Lacerda, Marcus V; Kajava, Andrey V; Arévalo-Herrera, Myriam; Herrera, Sócrates
2016-04-12
Plasmodium vivax 48/45 protein is expressed on the surface of gametocytes/gametes and plays a key role in gamete fusion during fertilization. This protein was recently expressed in Escherichia coli host as a recombinant product that was highly immunogenic in mice and monkeys and induced antibodies with high transmission-blocking activity, suggesting its potential as a P. vivax transmission-blocking vaccine candidate. To determine sequence polymorphism of natural parasite isolates and its potential influence on the protein structure, all pvs48/45 sequences reported in databases from around the world as well as those from low-transmission settings of Latin America were compared. Plasmodium vivax parasite isolates from malaria-endemic regions of Colombia, Brazil and Honduras (n = 60) were used to sequence the Pvs48/45 gene, and compared to those previously reported to GenBank and PlasmoDB (n = 222). Pvs48/45 gene haplotypes were analysed to determine the functional significance of genetic variation in protein structure and vaccine potential. Nine non-synonymous substitutions (E35K, Y196H, H211N, K250N, D335Y, E353Q, A376T, K390T, K418R) and three synonymous substitutions (I73, T149, C156) that define seven different haplotypes were found among the 282 isolates from nine countries when compared with the Sal I reference sequence. Nucleotide diversity (π) was 0.00173 for worldwide samples (range 0.00033-0.00216), resulting in relatively high diversity in Myanmar and Colombia, and low diversity in Mexico, Peru and South Korea. The two most frequent substitutions (E353Q: 41.9 %, K250N: 39.5 %) were predicted to be located in antigenic regions without affecting putative B cell epitopes or the tertiary protein structure. There is limited sequence polymorphism in pvs48/45 with noted geographical clustering among Asian and American isolates. 
The low genetic diversity of the protein does not influence the predicted antigenicity or protein structure and, therefore, supports its further development as transmission-blocking vaccine candidate.
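Nucleotide diversity (π), as reported in the abstract above, is the average number of pairwise differences per site across a set of aligned sequences. The sketch below is a minimal illustration under the assumption of gap-free, equal-length alignments; the function name `nucleotide_diversity` and the toy sequences are hypothetical and are not taken from the study, which would have used dedicated population-genetics software.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment: three sequences, one of which differs at a single site.
pi = nucleotide_diversity(["ACGT", "ACGT", "ACGA"])
print(round(pi, 4))
```

In the toy alignment, two of the three pairs differ at one of four sites, so π = 2 / (3 × 4) ≈ 0.1667; values on the order of 0.001 to 0.002, as in the abstract, correspond to roughly one or two differences per thousand sites.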
Molecular detection and characterization of Anaplasma platys in dogs and ticks in Cuba.
Silva, Claudia Bezerra da; Santos, Huarrisson Azevedo; Navarrete, Maylín González; Ribeiro, Carla Carolina Dias Uzedo; Gonzalez, Belkis Corona; Zaldivar, Maykelin Fuentes; Pires, Marcus Sandes; Peckle, Maristela; Costa, Renata Lins da; Vitari, Gabriela Lopes Vivas; Massard, Carlos Luiz
2016-07-01
Canine cyclic thrombocytopenia, an infectious disease caused by Anaplasma platys, is a worldwide dog health problem. This study aimed to detect and characterize A. platys deoxyribonucleic acid (DNA) in dogs and ticks from Cuba using molecular methods. The study was conducted in four cities of Cuba (Habana del Este, Boyeros, Cotorro and San José de las Lajas). Blood samples were collected from 100 dogs in these cities. The animals were inspected for tick infestation and specimens were collected. Genomic DNA was extracted from dog blood and ticks using a commercial kit. Genomic DNA samples from blood and ticks were tested by a nested polymerase chain reaction (nPCR) to amplify 678 base pairs (bp) from the 16S ribosomal DNA (rDNA) of A. platys. Positive samples in nPCR were also subjected to PCR to amplify a fragment of 580 bp from the citrate synthase (gltA) gene and the products were sequenced. Only Rhipicephalus sanguineus sensu lato (s.l.) was found on dogs, and 10.20% (n=5/49) of these ticks and 16.0% (n=16/100) of dogs were considered positive for A. platys by nPCR targeting the 16S rDNA gene. All analyzed gltA and 16S rDNA sequences showed 99-100% identity with sequences of A. platys reported around the world. Phylogenetic analysis showed two defined clusters for the 16S rDNA gene and three defined clusters for the gltA gene. Based on the gltA gene, the deduced amino acid sequence showed two mutations at positions 88 and 168 compared with the sequence DQ525687 (GenBank ID from Italian sample), used as a reference in the alignment. A preliminary study on the epidemiological aspects associated with infection by A. platys showed no statistical association with the variables studied (p>0.05). This is the first evidence of the presence of A. platys in dogs and ticks in Cuba. Further studies are needed to evaluate the epidemiological aspects of A. platys infection in Cuban dogs. Copyright © 2016 Elsevier GmbH. All rights reserved.
Plasmodium vivax rhomboid-like protease 1 gene diversity in Thailand.
Mataradchakul, Touchchapol; Uthaipibull, Chairat; Nosten, Francois; Vega-Rodriguez, Joel; Jacobs-Lorena, Marcelo; Lek-Uthai, Usa
2017-10-01
Plasmodium vivax infection remains a major public health problem, especially along the Thailand border regions. We examined the genetic diversity of this parasite by analyzing single-nucleotide polymorphisms (SNPs) of the P. vivax rhomboid-like protease 1 gene (Pvrom1) in parasites collected from western (Tak province, Thai-Myanmar border) and eastern (Chanthaburi province, Thai-Cambodia border) regions. Data were collected by a cross-sectional survey, consisting of 47 and 45 P. vivax-infected filter paper-spotted blood samples from the western and eastern regions of Thailand, respectively, from September 2013 to May 2014. Extracted DNA was examined for presence of P. vivax using Plasmodium species-specific nested PCR. The Pvrom1 gene was PCR amplified and sequenced, and the SNP diversity was analyzed using the F-STAT, DnaSP, MEGA and LIAN programs. Comparison of sequences of the 92 Pvrom1 831-base open reading frames with that of a reference sequence (GenBank acc. no. XM001615211) revealed 17 samples with a total of 8 polymorphic sites, consisting of singleton (exon 3, nt 645) and parsimony informative (exon 1, nt 22 and 39; exon 3, nt 336, 537 and 656; and exon 4, nt 719 and 748) sites, which resulted in six different deduced Pvrom1 variants. The ratio of non-synonymous to synonymous substitutions estimated by the DnaSP program was 1.65, indicating positive selection, but the Z-tests of selection showed no significant deviations from neutrality for Pvrom1 samples from the western region of Thailand. In addition, the McDonald-Kreitman (MK) test was not significant, and Fst values did not differ between the two regions or for the regions combined. Interestingly, Pvrom1 exon 2 was the most conserved sequence among the four exons. The relatively high degree of Pvrom1 polymorphism suggests that the protein is important for parasite survival in the face of changes in both insect vector and human populations. 
These polymorphisms could serve as a sensitive marker for studying plasmodial genetic diversity. The significance of Pvrom1 conserved exon 2 sequence remains to be investigated. Copyright © 2017 Mahidol University. Published by Elsevier Inc. All rights reserved.
Random Amplification and Pyrosequencing for Identification of Novel Viral Genome Sequences
Hang, Jun; Forshey, Brett M.; Kochel, Tadeusz J.; Li, Tao; Solórzano, Víctor Fiestas; Halsey, Eric S.; Kuschner, Robert A.
2012-01-01
ssRNA viruses have high levels of genomic divergence, which can lead to difficulty in genomic characterization of new viruses using traditional PCR amplification and sequencing methods. In this study, random reverse transcription, anchored random PCR amplification, and high-throughput pyrosequencing were used to identify orthobunyavirus sequences from total RNA extracted from viral cultures of acute febrile illness specimens. Draft genome sequence for the orthobunyavirus L segment was assembled and sequentially extended using de novo assembly contigs from pyrosequencing reads and orthobunyavirus sequences in GenBank as guidance. Accuracy and continuous coverage were achieved by mapping all reads to the L segment draft sequence. Subsequently, RT-PCR and Sanger sequencing were used to complete the genome sequence. The complete L segment was found to be 6936 bases in length, encoding a 2248-aa putative RNA polymerase. The identified L segment was distinct from previously published South American orthobunyaviruses, sharing 63% and 54% identity at the nucleotide and amino acid level, respectively, with the complete Oropouche virus L segment and 73% and 81% identity at the nucleotide and amino acid level, respectively, with a partial Caraparu virus L segment. The result demonstrated the effectiveness of a sequence-independent amplification and next-generation sequencing approach for obtaining complete viral genomes from total nucleic acid extracts and its use in pathogen discovery. PMID:22468136
Gan, Han Ming; Tan, Mun Hua; Lee, Yin Peng; Austin, Christopher M
2016-05-01
The mitochondrial genome sequence of the Australian tadpole shrimp, Triops australiensis, is presented (GenBank Accession Number: NC_024439) and compared with other Triops species. Triops australiensis has a mitochondrial genome of 15,125 base pairs consisting of 13 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and a non-coding AT-rich region. The T. australiensis mitogenome is composed of 36.4% A, 16.1% C, 12.3% G and 35.1% T. The mitogenome gene order conforms to the primitive arrangement for branchiopod crustaceans, which is also conserved within the Pancrustacea.
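Base composition figures like those reported above can be computed from a sequence with a few lines of code. The sketch below is illustrative only: the function name `base_composition` and the toy sequence are hypothetical, and the only values taken from the abstract are the four reported percentages, which are used to derive the A+T content.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


# Toy sequence, not the real mitogenome.
print(base_composition("AACGTT"))

# Percentages as reported in the abstract for T. australiensis.
reported = {"A": 36.4, "C": 16.1, "G": 12.3, "T": 35.1}
at_content = reported["A"] + reported["T"]
print(round(at_content, 1))  # A+T content in percent
```

From the reported values, the A+T content comes to 71.5%. The four percentages sum to 99.9% rather than 100%, which is what rounding each value to one decimal place would produce.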
Opinion: Why we need a centralized repository for isotopic data
Pauli, Jonathan N.; Newsome, Seth D.; Cook, Joseph A.; Harrod, Chris; Steffan, Shawn A.; Baker, Christopher J. O.; Ben-David, Merav; Bloom, David; Bowen, Gabriel J.; Cerling, Thure E.; Cicero, Carla; Cook, Craig; Dohm, Michelle; Dharampal, Prarthana S.; Graves, Gary; Gropp, Robert; Hobson, Keith A.; Jordan, Chris; MacFadden, Bruce; Pilaar Birch, Suzanne; Poelen, Jorrit; Ratnasingham, Sujeevan; Russell, Laura; Stricker, Craig A.; Uhen, Mark D.; Yarnes, Christopher T.; Hayden, Brian
2017-01-01
Stable isotopes encode and integrate the origin of matter; thus, their analysis offers tremendous potential to address questions across diverse scientific disciplines (1, 2). Indeed, the broad applicability of stable isotopes, coupled with advancements in high-throughput analysis, has created a scientific field that is growing exponentially, and generating data at a rate paralleling the explosive rise of DNA sequencing and genomics (3). Centralized data repositories, such as GenBank, have become increasingly important as a means for archiving information, and “Big Data” analytics of these resources are revolutionizing science and everyday life.
The complete plastid genome of the middle Asian endemic of Stipa lipskyi (Poaceae).
Myszczyński, Kamil; Nobis, Marcin; Szczecinska, Monika; Sawicki, Jakub; Nowak, Arkadiusz
2016-11-01
The structure of the Stipa lipskyi (GenBank accession no. KT692644) plastid genome is similar to that of closely related Poaceae species. It has a total length of 137 755 bp, and the base composition of the plastome is as follows: A (30.7%), C (19.3%), G (19.4%) and T (30.5%). The S. lipskyi plastid genome contains 71 genes, excluding the second IR region. A complete plastome sequence of S. lipskyi will help the development of primers for examining phylogeny and hybridization events in this taxonomically difficult genus.
DeBellis, Tonia; Widden, Paul
2006-11-01
Arbuscular mycorrhizal fungi (AMF) communities in Clintonia borealis roots from a boreal mixed forest in northwestern Québec were investigated. Roots were sampled from 100 m2 plots whose overstory was dominated by either trembling aspen (Populus tremuloides Michx.), white birch (Betula papyrifera Marsh.), or mixed white spruce (Picea glauca (Moench) Voss) and balsam fir (Abies balsamea (L.) Mill.). Part of the 18S ribosomal gene of the AMF was amplified and the resulting PCR products were cloned. Restriction analysis of the 576 resulting clones yielded 92 different restriction patterns which were then sequenced. Fifty-two sequences closely matched other Glomus sequences from GenBank. Phylogenetic analysis revealed 10 different AMF sequence types, most of which clustered with other uncultured AM sequences from plant roots from various field sites. Compared with other AMF communities from comparable studies, richness and diversity were higher than observed in an arable field, but lower than seen in a tropical forest and a temperate wetland. The AMF communities from Clintonia roots under the different canopy types did not differ significantly and the dominant sequence type, which clustered with AM sequences from a variety of environments and hosts at distant geographical locations, represented 66.9% of all the clones analyzed.
Thomas, Paul D; Kejariwal, Anish; Campbell, Michael J; Mi, Huaiyu; Diemer, Karen; Guo, Nan; Ladunga, Istvan; Ulitsky-Lazareva, Betty; Muruganujan, Anushya; Rabkin, Steven; Vandergriff, Jody A; Doremieux, Olivier
2003-01-01
The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene classifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.
Chandok, Harshpreet; Shah, Pratik; Akare, Uday Raj; Hindala, Maliram; Bhadoriya, Sneha Singh; Ravi, G V; Sharma, Varsha; Bandaru, Srinivas; Rathore, Pragya; Nayarisseri, Anuraj
2015-09-01
16S rDNA sequencing, which has gained wide popularity amongst microbiologists for the molecular characterization and identification of newly discovered isolates, provides accurate identification of isolates down to the level of sub-species (strain). Its most important advantage over the traditional biochemical characterization methods is that it can also provide an accurate identification of strains with atypical phenotypic characters. The following work is an application of the 16S rRNA gene sequencing approach to identify a novel species of probiotic Lactobacillus acidophilus. Samples were collected from pond water in rural and urban areas of Krishna district, Vijayawada, Andhra Pradesh, India. Subsequently, the sample was serially diluted and the aliquots were incubated for a suitable time period, following which the suspected colony was subjected to 16S rDNA sequencing. After alignment against sequences from other species, the isolates were concluded to be novel probiotic L. acidophilus bacteria and were named L. acidophilus strains EMBS081 and EMBS082. After the sequence characterization, the isolate sequences were deposited in the GenBank database, maintained by the National Center for Biotechnology Information (NCBI). The sequences can also be retrieved from the EMBL and DDBJ repositories under accession numbers JX255677 and KC150145.
A Novel Rickettsia Species Detected in Vole Ticks (Ixodes angustus) from Western Canada
Anstead, Clare A.
2013-01-01
The genomic DNA of ixodid ticks from western Canada was tested by PCR for the presence of Rickettsia. No rickettsiae were detected in Ixodes sculptus, whereas 18% of the I. angustus and 42% of the Dermacentor andersoni organisms examined were PCR positive for Rickettsia. The rickettsiae from each tick species were characterized genetically using multiple genes. Rickettsiae within the D. andersoni organisms had sequences at four genes that matched those of R. peacockii. In contrast, the Rickettsia present within the larvae, nymphs, and adults of I. angustus had novel DNA sequences at four of the genes characterized compared to the sequences available from GenBank for all recognized species of Rickettsia and all other putative species within the genus. Phylogenetic analyses of the sequence data revealed that the rickettsiae in I. angustus do not belong to the spotted fever, transitional, or typhus groups of rickettsiae but are most closely related to “Candidatus Rickettsia kingi” and belong to a clade that also includes R. canadensis, “Candidatus Rickettsia tarasevichiae,” and “Candidatus Rickettsia monteiroi.” PMID:24077705
Molecular Characterization of Watermelon Chlorotic Stunt Virus (WmCSV) from Palestine
Ali-Shtayeh, Mohammed S.; Jamous, Rana M.; Mallah, Omar B.; Abu-Zeitoun, Salam Y.
2014-01-01
The incidence of watermelon chlorotic stunt disease and molecular characterization of the Palestinian isolate of Watermelon chlorotic stunt virus (WmCSV-[PAL]) are described in this study. Symptomatic leaf samples obtained from watermelon (Citrullus lanatus (Thunb.)) and cucumber (Cucumis sativus L.) plants were tested for WmCSV-[PAL] infection by polymerase chain reaction (PCR) and Rolling Circle Amplification (RCA). Disease incidence ranged between 25% and 98% in watermelon fields in the studied area, and 77% of leaf samples collected from Jenin were found to have mixed infections with WmCSV-[PAL] and SLCV. The full-length DNA-A and DNA-B genomes of WmCSV-[PAL] were amplified and sequenced, and the sequences were deposited in GenBank. Sequence analysis of the virus genomes showed that DNA-A and DNA-B had 97.6%–99.42% and 93.16%–98.26% nucleotide identity with other virus isolates in the region, respectively. Sequence analysis also revealed that the Palestinian isolate of WmCSV shared the highest nucleotide identity with an isolate from Israel, suggesting that the virus was introduced to Palestine from Israel. PMID:24956181
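The nucleotide identity figures quoted above are pairwise comparisons between aligned sequences. As a minimal sketch of how such a percentage can be computed, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped (both are assumptions; the abstract does not state which software or gap convention the authors used), the function name `percent_identity` and the toy fragments below are hypothetical:

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped; this is an
    assumed convention, and other tools count gaps differently.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not WmCSV data):
# 9 ungapped columns, 8 of them identical.
toy_a = "ATGCCGTA-A"
toy_b = "ATGTCGTAGA"
print(f"{percent_identity(toy_a, toy_b):.2f}% identity")
```

Real genome comparisons would first need a pairwise or multiple alignment; this sketch only covers the final counting step.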
Zahraei Salehi, Taghi; Derakhshandeh, Abdollah; Tadjbakhsh, Hasan; Karimi, Vahid
2013-02-01
The ISS (increased serum survival) gene and its protein product (ISS) of avian pathogenic Escherichia coli (APEC) are important characteristics of resistance to the complement system. The aims of this study were to clone and sequence the ISS gene and to characterize its sequence diversity between two predominant serogroups in Iran and among sequences previously deposited in GenBank. The 309 bp ISS gene from the APEC χ1390 strain was amplified by PCR, cloned and sequenced using the pTZ57R/T vector. The ISS gene from the χ1390 strain has 100% identity among different serogroups of APEC in different geographical regions throughout the world. Phylogenetic analysis shows two different phylogenetic groups among the different strains. The strong association of nucleotide sequences among different E. coli strains suggests that it may be a conserved gene and could be a suitable antigen to control and detect avian pathogenic E. coli, at least in our region. Currently, our group is working on the ISS protein as a candidate vaccine in SPF poultry.
L'vov, D K; Al'khovskiĭ, S V; Shchelkanov, M Iu; Shchetinin, A M; Deriabin, P G; Aristova, V A; Gitel'man, A K; Samokhvalov, E I; Botikov, A G
2014-01-01
The Tyulek virus (TLKV) was isolated from the ticks Argas vulgaris Filippova, 1961 (Argasidae), collected from burrow biotopes in a multispecies bird colony in the Aksu river floodplain near Tyulek village (northern part of the Chu Valley, Kyrgyzstan). Recently, TLKV was assigned to the Quaranfil group (including the Quaranfil virus (QRFV), Johnston Atoll virus (JAV), and Lake Chad virus), which forms the novel genus Quaranjavirus in the family Orthomyxoviridae. In this work, the complete genome sequence of TLKV (GenBank ID KJ438647-8) was determined using next-generation sequencing (Illumina platform). Comparison of deduced amino acid sequences shows a close relationship of TLKV with QRFV and JAV (86% and 84% identity for PB1 and about 70% for PB2 and PA, respectively). The identity level of TLKV and QRFV in the outer glycoprotein GP is 72% and 80% for nucleotide and amino acid sequences, respectively. The phylogenetic analysis showed that TLKV belongs to the genus Quaranjavirus in the family Orthomyxoviridae.
Anantaphruti, Malinee Thairungroj; Thaenkham, Urusa; Watthanakulpanich, Dorn; Phuphisut, Orawan; Maipanich, Wanna; Yoonuan, Tippayarat; Nuamtanong, Supaporn; Pubampen, Somjit; Sanguankiat, Surapol
2013-02-01
Twelve 924 bp cytochrome c oxidase subunit 1 (cox1) mitochondrial DNA sequences from Taenia asiatica isolates from Thailand were aligned and compared with multiple sequence isolates from Thailand and 6 other countries from the GenBank database. The genetic divergence of T. asiatica was also compared with Taenia saginata database sequences from 6 different countries in Asia, including Thailand, and 3 countries from other continents. The results showed that there were minor genetic variations within T. asiatica species, while high intraspecies variation was found in T. saginata. There were only 2 haplotypes and 1 polymorphic site found in T. asiatica, but 8 haplotypes and 9 polymorphic sites in T. saginata. Haplotype diversity was very low, 0.067, in T. asiatica and high, 0.700, in T. saginata. The very low genetic diversity suggested that T. asiatica may be at a risk due to the loss of potential adaptive alleles, resulting in reduced viability and decreased responses to environmental changes, which may endanger the species. PMID:23467439
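Haplotype diversity values such as those reported above are commonly computed with Nei's estimator, h = n/(n-1) × (1 − Σ pᵢ²), where n is the number of sequences and pᵢ the frequency of haplotype i. The following is a minimal sketch under the assumption that this estimator applies here; the abstract does not name the formula or give per-haplotype counts, so the function name and the counts below are purely hypothetical illustrations:

```python
def haplotype_diversity(counts):
    """Nei's haplotype diversity from a list of per-haplotype sequence counts.

    h = n / (n - 1) * (1 - sum(p_i ** 2)), with p_i = count_i / n.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical counts chosen only for illustration (not the study's data):
# one dominant haplotype plus a single rare variant gives a very low value,
# while an even spread across several haplotypes gives a high one.
print(f"{haplotype_diversity([29, 1]):.3f}")
print(f"{haplotype_diversity([4, 3, 3, 2, 2, 1]):.3f}")
```

The estimator approaches 0 when almost all sequences share one haplotype and approaches 1 when nearly every sequence is distinct, which is the contrast the abstract draws between the two Taenia species.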
Genotypes and subgenotypes of hepatitis B virus circulating in an endemic area in Peru.
Ramírez-Soto, Max Carlos; Bracho, Maria Alma; González-Candelas, Fernando; Huichi-Atamari, Milagros
2018-01-01
Although hepatitis B virus (HBV) infection is still endemic in Abancay, Peru, two decades after vaccination against hepatitis B started in the area, little is known about the diversity and circulation of genotypes and subgenotypes of the virus. To identify the genotypes and subtypes of HBV circulating in Abancay, complete genome sequences of 11 treatment-naive HBV-infected patients were obtained, and phylogenetic analysis was conducted with these and additional sequences from GenBank. Genotyping revealed the presence of genotype F in all the samples from Abancay. Subgenotype F1b was dominant and only one isolate belonged to subgenotype F4, which represents the first description of this subgenotype in Peru. Phylogenetic analysis revealed that most subgenotype F1b isolates from Peru clustered in a subgroup along with two sequences from Argentina, whereas two clusters with two HBV/F1b sequences each were indicative of recent epidemiological linkage, but only one could be verified by independent data. These results suggest that the HBV subgenotype F1b seems to be the predominant subgenotype in Abancay, Peru.
Li, Jitao; Li, Jian; Chen, Ping; Liu, Ping; He, Yuying
2015-01-01
The ridgetail white prawn Exopalaemon carinicauda is one of the major economic mariculture species in eastern China. The lack of genomic and transcriptomic data is becoming a bottleneck for further research on its desirable traits. In the present study, 454 pyrosequencing was undertaken to investigate the transcriptome profiles of E. carinicauda. A collection of 1,028,710 sequence reads (459.59 Mb) obtained from cDNA prepared from eyestalk and hemocytes was assembled into 162,056 expressed sequence tags (ESTs). Of these, 29.88% of 48,428 contigs and 70.12% of 113,628 singlets possessed high similarities to sequences in the GenBank non-redundant database, with the most significant (E value < 1e-10) unigene matches occurring with crustacean and insect sequences. KEGG analysis of unigenes identified putative members of biological pathways related to growth and immunity. In addition, we obtained a total of 125,112 putative SNPs and 13,467 microsatellites. These results will contribute to the understanding of the genome makeup and provide useful information for future functional genomic research in E. carinicauda.
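The E-value cutoff mentioned above is typically applied when post-processing similarity search results. The sketch below shows one way to keep the best hit per query below a threshold of 1e-10, assuming the hits are in the standard 12-column BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E value, bit score). The abstract does not describe the authors' actual pipeline, and the function name and sample rows are hypothetical:

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-10):
    """Return {query: (subject, evalue)} for the lowest-E-value hit per query.

    Assumes 12-column tab-separated BLAST tabular rows, where column 11
    (index 10) holds the E value. Hits at or above max_evalue are dropped.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue >= max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical rows for illustration only.
rows = [
    ["unigene1", "hitA", "95.0", "200", "10", "0", "1", "200", "1", "200", "1e-50", "350"],
    ["unigene1", "hitB", "80.0", "150", "30", "0", "1", "150", "1", "150", "1e-12", "120"],
    ["unigene2", "hitC", "70.0", "90", "27", "0", "1", "90", "1", "90", "1e-05", "60"],
]
sample = "\n".join("\t".join(fields) for fields in rows)
print(best_hits(sample))
```

Here `unigene2` is dropped because its only hit has an E value above the cutoff.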
Nayarisseri, Anuraj; Suppahia, Anjana; Nadh, Anuroopa G; Nair, Achuthsankar S
2015-06-01
Organophosphates like chlorpyrifos, diazinon, or malathion have become the most common and indisputably most toxic pest control agents, and they adversely affect the human nervous system even at low levels of exposure. Because of their relatively low cost and ability to be applied to a wide range of target insects and crops, organophosphorus pesticides account for a large share of all insecticides used in India, and this in turn raises severe health concerns. In this view, the present investigation aimed to identify a novel species of Flavobacterium bacteria with the capacity to degrade pesticides like chlorpyrifos, diazinon, or malathion. The bacterium was isolated from agricultural soil collected from Guntur District, Andhra Pradesh, India. The samples were serially diluted, and the aliquots were incubated for a suitable time, after which the suspected colony was subjected to 16S rRNA gene sequencing. The sequence thus obtained was aligned pairwise against Flavobacterium species, which resulted in the identification of a novel species of Flavobacterium, later named EMBS0145; the sequence was deposited in GenBank with accession number JN794045.
Nayarisseri, Anuraj; Suppahia, Anjana; Nadh, Anuroopa G; Nair, Achuthsankar S
2014-08-09
Organophosphates (OPs) like chlorpyrifos, diazinon, or malathion have become the most common and indisputably most toxic pest-control agents, and they adversely affect the human nervous system even at low levels of exposure. Because of their relatively low cost and ability to be applied to a wide range of target insects and crops, organophosphorus pesticides account for a large share of all insecticides used in India, which in turn raises severe health concerns. In this view, the present investigation aimed to identify a novel species of Flavobacterium bacteria with the capacity to degrade pesticides like chlorpyrifos, diazinon or malathion. The bacterium was isolated from agricultural soil collected from Guntur District, Andhra Pradesh, India. The samples were serially diluted and the aliquots were incubated for a suitable time, after which the suspected colony was subjected to 16S rRNA gene sequencing. The sequence thus obtained was aligned pairwise against Flavobacterium species, which resulted in the identification of a novel species of Flavobacterium, later named EMBS0145; the sequence was deposited in GenBank with accession number JN794045.
Abayli, Hasan; Tonbak, Sukru; Azkur, Ahmet Kursat; Bulut, Hakan
2017-10-01
Relatively high prevalence and mortality rates of bovine ephemeral fever (BEF) have been reported in recent epidemics in some countries, including Turkey, when compared with previous outbreaks. A limited number of complete genome sequences of BEF virus (BEFV) are available in the GenBank Database. In this study, the complete genome of highly pathogenic BEFV isolated during an outbreak in Turkey in 2012 was analyzed for genetic characterization. The complete genome of the Turkish BEFV isolate was amplified by reverse transcription-polymerase chain reaction (RT-PCR) and sequenced. It was found that the complete genome of the Turkish BEFV isolate was 14,901 nt in length. The complete genome sequence obtained from the study showed 91-92% identity at nucleotide level to Australian (BB7721) and Chinese (Bovine/China/Henan1/2012) BEFV isolates. Phylogenetic analysis of the glycoprotein gene of the Turkish BEFV isolate also showed that Turkish isolates were closely related to Israeli isolates. Because of the limited number of complete BEFV genome sequences, the results from this study will be useful for understanding the global molecular epidemiology and geodynamics of BEF.
Molecular Cloning and Sequence Analysis of a Phenylalanine Ammonia-Lyase Gene from Dendrobium
Cai, Yongping; Lin, Yi
2013-01-01
In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PALs of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those found in the PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum. PMID:23638048
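The relationship between the reported ORF length and protein length can be checked with simple codon arithmetic: an ORF of 2,142 bp holds 714 codons, and if the last one is a stop codon the protein has 713 residues, matching the abstract. A small sketch, assuming the quoted ORF length includes the stop codon (the helper name `orf_residues` is hypothetical):

```python
def orf_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by an ORF of the given length.

    Assumes the ORF length is a multiple of 3; if the stop codon is counted
    in the ORF length, it is subtracted because it encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


cdna_bp = 2458  # full-length cDNA reported in the abstract
orf_bp = 2142   # ORF length reported in the abstract

print(orf_residues(orf_bp))  # 713 residues
print(cdna_bp - orf_bp)      # 316 bp remain outside the ORF (UTRs and poly(A), if any)
```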
Hepatozoon canis infecting dogs in the State of Espírito Santo, southeastern Brazil.
Spolidorio, Mariana G; Labruna, Marcelo B; Zago, Augusto M; Donatele, Dirlei M; Caliari, Késia M; Yoshinari, Natalino H
2009-08-26
From May 2007 to March 2008, blood samples were collected from 92 healthy dogs living in 21 households (17 farms in rural areas and 4 homes in urban areas) in 6 counties of the State of Espírito Santo, southeastern Brazil. In addition, ticks were collected from these dogs. A mean of 4.4 ± 3.0 dogs (range: 1-12) were sampled per household; 78 and 14 dogs were from rural and urban areas, respectively. Polymerase chain reaction (PCR) designed to amplify fragments of the 18S rDNA gene of Babesia spp. or Hepatozoon spp. revealed amplicons of the expected size in 20 (21.7%) dogs for Babesia and 54 (58.7%) dogs for Hepatozoon. All Babesia-positive dogs were also Hepatozoon-positive. Among the 21 households, 15 (71.4%) from 3 counties had at least one PCR-positive dog, including 13 farms (rural area) and 2 homes (urban area). A total of 40 PCR products from the Hepatozoon-PCR and 19 products from the Babesia-PCR were submitted to DNA sequencing. All generated sequences from the Hepatozoon-PCR were identical to each other and to corresponding 18S rDNA sequences of H. canis in GenBank. Surprisingly, all generated sequences from the Babesia-PCR were also identical to corresponding 18S rDNA sequences of H. canis in GenBank. Dogs from 10 rural and 2 urban households were found infested by Rhipicephalus sanguineus ticks. Immature Amblyomma cajennense ticks were found on dogs from only 4 rural households (also infested by R. sanguineus). All but one household with R. sanguineus-infested dogs had at least one Hepatozoon-infected dog. Statistical analysis showed that the presence of ticks (i.e. R. sanguineus) infesting dogs in the households was significantly (P<0.05) associated with at least one PCR-positive dog. There was no significant association (P>0.05) between PCR-positive dogs and urban or rural households. Canine hepatozoonosis caused by H. canis is a highly frequent infection in Espírito Santo, Brazil, where it is possibly vectored by R. sanguineus. Since all infected dogs were found apparently healthy, the pathogenicity of H. canis for dogs in Espírito Santo is yet to be elucidated.
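The percentages in this abstract follow directly from the reported counts. As a quick consistency check, the short sketch below recomputes them; the counts come from the abstract, while the function name and labels are only illustrative:

```python
def prevalence(positive: int, total: int) -> float:
    """Prevalence as a percentage of positives among all units sampled."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * positive / total


# Counts taken from the abstract above.
observations = [
    ("Babesia-PCR amplicons, dogs", 20, 92),
    ("Hepatozoon-PCR amplicons, dogs", 54, 92),
    ("Households with a PCR-positive dog", 15, 21),
]

for label, positive, total in observations:
    print(f"{label}: {positive}/{total} = {prevalence(positive, total):.1f}%")
```

The subtotals are also internally consistent: 78 rural plus 14 urban dogs give 92, and 13 farms plus 2 homes give the 15 positive households.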
Bhusri, Benjaporn; Sariya, Ladawan; Mongkolphan, Chalisa; Suksai, Parut; Kaewchot, Supakarn; Changbunjong, Tanasak
2017-09-01
Hepatozoon spp. are protozoan parasites that infect a wide range of domestic and wild animals. The infection occurs by ingestion of an infected tick. This study was carried out to detect and characterize Hepatozoon spp. in ticks collected from captive lions (Panthera leo) in Thailand based on the partial 18S rRNA gene sequence. A total of 30 ticks were collected and identified as Rhipicephalus sanguineus. The collected ticks were separated into 10 tick pools by sex and life stages. Of the 10 tick pools examined, only one (10%) was found to be infected with the Hepatozoon species. Sequencing and phylogenetic analysis showed a clustering of the partial 18S rRNA gene sequence like that of H. felis from the GenBank database. This is the first report of H. felis in R. sanguineus ticks collected from captive lions in Thailand. Our results indicated that R. sanguineus may be a possible vector of feline Hepatozoon in Thailand.
Detection and identification of Rickettsia species in Ixodes tick populations from Estonia.
Katargina, Olga; Geller, Julia; Ivanova, Anna; Värv, Kairi; Tefanova, Valentina; Vene, Sirkka; Lundkvist, Åke; Golovljova, Irina
2015-09-01
A total of 1640 ticks collected in different geographical parts of Estonia were screened for the presence of Rickettsia species DNA by real-time PCR. DNA of Rickettsia was detected in 83 out of 1640 questing ticks with an overall prevalence of 5.1%. The majority of the ticks infected by rickettsiae were Ixodes ricinus (74 of 83), while 9 of the 83 positive ticks were Ixodes persulcatus. For rickettsial species identification, a part of the citrate synthase gltA gene was sequenced. The majority of the positive samples were identified as Rickettsia helvetica (81 out of 83) and two of the samples were identified as Rickettsia monacensis and Candidatus R. tarasevichiae, respectively. Genetic characterization based on the partial gltA gene showed that the Estonian sequences within the R. helvetica, R. monacensis and Candidatus R. tarasevichiae species demonstrated 100% similarity with sequences deposited in GenBank, originating from Rickettsia species distributed over large territories from Europe to Asia.
Abarca, M L; Martorell, J; Castellá, G; Ramis, A; Cabañes, F J
2009-08-01
A Chrysosporium sp. related to Nannizziopsis vriesii was isolated in pure culture from squames and biopsies of facial lesions in a pet inland bearded dragon (Pogona vitticeps) in Spain. The presence in histological sections of morphologically consistent fungal elements strongly incriminates this fungus as the aetiological agent of infection. Lesions regressed following treatment with oral ketoconazole and topical chlorhexidine and terbinafine until the lizard was lost to follow up 1 month later. The ITS-5.8S rRNA gene of the isolate was sequenced and a search on the GenBank database revealed a high match with the sequences of two Chrysosporium sp. strains recently isolated from green iguanas (Iguana iguana) with dermatomycosis, also in Spain. Phylogenetic analysis of the sequences revealed that all these strains are related to N. vriesii. This is the first report of dermatomycoses caused by a Chrysosporium species related to N. vriesii in a bearded dragon outside North America.
Liu, Yan-Hua; Liu, Xin-Xin; Zhang, Ming-Hai
2016-07-01
Sika deer (Cervus nippon Temminck 1836) are classified in the order Artiodactyla, family Cervidae, subfamily Cervinae. At present, the phylogenetic studies of C. nippon are problematic. In this study, we determined and described, for the first time, the complete mitochondrial sequence of the wild C. nippon hortulorum. The complete mitogenome sequence is 16 566 bp in length, including 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a putative control region (CR) and a light-strand replication origin (OL). The overall base composition was 33.4% A, 28.6% T, 24.5% C, 13.5% G, with a 62.0% AT bias. The 13 protein-coding genes encode 3782 amino acids in total. To further validate the newly determined sequences and the phylogeny of Sika deer, phylogenetic trees involving the 15 most closely related species available in the GenBank database were constructed. These results are expected to provide useful molecular data for deer species identification and further phylogenetic studies of Artiodactyla.
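The base composition and AT bias reported above are simple per-base frequencies over the genome sequence. A minimal sketch of that calculation follows, assuming ambiguous characters (anything other than A, C, G, T) are ignored; the function names and the toy sequence are hypothetical and do not come from the mitogenome itself:

```python
from collections import Counter


def base_composition(sequence: str) -> dict:
    """Percentage of A, T, C and G in a nucleotide sequence.

    Characters other than A, C, G, T (e.g. N) are ignored, which is an
    assumed convention for this sketch.
    """
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ATCG"}


def at_content(composition: dict) -> float:
    """AT content as the sum of the A and T percentages."""
    return composition["A"] + composition["T"]


# Toy sequence for illustration only.
toy = base_composition("AATTTCCGAT")
print(toy, at_content(toy))

# Cross-check of the percentages reported in the abstract.
reported = {"A": 33.4, "T": 28.6, "C": 24.5, "G": 13.5}
print(round(at_content(reported), 1))       # 62.0, matching the stated AT bias
print(round(sum(reported.values()), 1))     # 100.0
```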
Wang, Zhong-dong; Wu, Ji-nan; Zhou, Lin; Ling, Jun-qi; Guo, Xi-min; Xiao, Ming-zhen; Zhu, Feng; Pu, Qin; Chai, Yu-bo; Zhao, Zhong-liang
2007-02-01
To study the biological properties of human dental pulp cells (HDPC), genes differentially expressed in HDPC in comparison with human gingival fibroblasts (HGF) were cloned and analyzed. HDPC and HGF were cultured and identified by immunocytochemistry. An HDPC and HGF subtractive cDNA library was established by PCR-based modified subtractive hybridization; genes differentially expressed by HDPC were cloned, sequenced and compared against GenBank by BLAST to find homologous sequences. Cloning and sequencing analysis indicated that 12 differentially expressed genes were obtained, of which two were unknown genes. Among the 10 known genes, 4 were related to signal transduction, 2 were related to trans-membrane transportation (both cell membrane and nuclear membrane), and 2 were related to RNA splicing mechanisms. The biological properties of HDPC are determined by the differential expression of some genes, and the growth and differentiation of HDPC are associated with the dynamic protein synthesis and secretion activities of the cell.