nucleotide sequence diversity: Topics by Science.gov

Sample records for nucleotide sequence diversity

Statistical analysis of nucleotide sequences of the hemagglutinin gene of human influenza A viruses.

PubMed Central

Ina, Y; Gojobori, T

1994-01-01

To examine whether positive selection operates on the hemagglutinin 1 (HA1) gene of human influenza A viruses (H1 subtype), 21 nucleotide sequences of the HA1 gene were statistically analyzed. The nucleotide sequences were divided into antigenic and nonantigenic sites. The nucleotide diversities for antigenic and nonantigenic sites of the HA1 gene were computed at synonymous and nonsynonymous sites separately. For nonantigenic sites, the nucleotide diversities were larger at synonymous sites than at nonsynonymous sites. This is consistent with the neutral theory of molecular evolution. For antigenic sites, however, the nucleotide diversities at nonsynonymous sites were larger than those at synonymous sites. These results suggest that positive selection operates on antigenic sites of the HA1 gene of human influenza A viruses (H1 subtype). PMID:8078892
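Nucleotide diversity, the quantity compared in this abstract, is commonly defined as the average number of nucleotide differences per site between pairs of aligned sequences. The sketch below is a minimal Python illustration of that definition only. The function name `nucleotide_diversity` and the toy sequences are hypothetical, and the sketch does not separate synonymous from nonsynonymous sites, which would require codon-aware counting that is not shown here.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences.

    Assumes every sequence has the same length and contains no gaps.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the study.
    toy = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(toy))
```

Applying the same function separately to two classes of sites (for example, columns pre-classified as antigenic or nonantigenic) would give per-class values that can be compared, under the assumption that the site classification is supplied by the user.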
Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica).

PubMed

He, Shui-Lian; Yang, Yang; Morrell, Peter L; Yi, Ting-Shuang

2015-01-01

Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains; it was cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to its small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium (LD) at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.
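Linkage disequilibrium between two biallelic sites is often summarized by the squared correlation r², computed from haplotype and allele frequencies. The following is a minimal sketch under the assumption of phased haplotypes with alleles coded 0 or 1. The function name `ld_r2` and the example data are hypothetical, and this is not presented as the method used in the study.

```python
def ld_r2(haplotypes):
    """Squared correlation r^2 between two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    each allele coded 0 or 1 (assumed phased).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical haplotypes: complete association, then no association.
    print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))
    print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

Plotting r² for many site pairs against their physical distance is one common way to describe how quickly LD decays within a locus.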
Sequence diversity within the reovirus S2 gene: reovirus genes reassort in nature, and their termini are predicted to form a panhandle motif.

PubMed Central

Chapell, J D; Goral, M I; Rodgers, S E; dePamphilis, C W; Dermody, T S

1994-01-01

To better understand genetic diversity within mammalian reoviruses, we determined S2 nucleotide and deduced sigma 2 amino acid sequences of nine reovirus strains and compared these sequences with those of prototype strains of the three reovirus serotypes. The S2 gene and sigma 2 protein are highly conserved among the four type 1, one type 2, and seven type 3 strains studied. Phylogenetic analyses based on S2 nucleotide sequences of the 12 reovirus strains indicate that diversity within the S2 gene is independent of viral serotype. Additionally, we found marked topological differences between phylogenetic trees generated from S1 and S2 gene nucleotide sequences of the seven type 3 strains. These results demonstrate that reovirus S1 and S2 genes have distinct evolutionary histories, thus providing phylogenetic evidence for lateral transfer of reovirus genes in nature. When variability among the 12 sigma 2-encoding S2 nucleotide sequences was analyzed at synonymous positions, we found that approximately 60 nucleotides at the 5' terminus and 30 nucleotides at the 3' terminus were markedly conserved in comparison with other sigma 2-encoding regions of S2. Predictions of RNA secondary structures indicate that the more conserved S2 sequences participate in the formation of an extended region of duplex RNA interrupted by a pair of stem-loops. Among the 12 deduced sigma 2 amino acid sequences examined, substitutions were observed at only 11% of amino acid positions. This finding suggests that constraints on the structure or function of sigma 2, perhaps in part because of its location in the virion core, have limited sequence diversity within this protein. PMID:8289378
Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

PubMed

Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

2013-08-01

To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattering across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded that of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from humans to humans in Thailand.
Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

USDA-ARS's Scientific Manuscript database

Technical Abstract: A strategy for a genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into respective genomes. In this study, nucle...
Genetic variation in potential Giardia vaccine candidates cyst wall protein 2 and α1-giardin.

PubMed

Radunovic, Matej; Klotz, Christian; Saghaug, Christina Skår; Brattbakk, Hans-Richard; Aebischer, Toni; Langeland, Nina; Hanevik, Kurt

2017-08-01

Giardia is a prevalent intestinal parasitic infection. The trophozoite structural protein α1-giardin (α1-g) and the cyst protein cyst wall protein 2 (CWP2) have shown promise as Giardia vaccine antigen candidates in murine models. The present study assesses the genetic diversity of α1-g and CWP2 between and within assemblages A and B in human clinical isolates. α1-g and CWP2 sequences were acquired from 15 Norwegian isolates by PCR amplification and 20 sequences from German cultured isolates by whole genome sequencing. Sequences were aligned to reference genomes from assemblage A2 and B to identify genetic variance. Genetic diversity was found between assemblage A and B reference sequences for both α1-g (90.8% nucleotide identity) and CWP2 (82.5% nucleotide identity). However, for α1-g, this translated into only 3 amino acid (aa) substitutions, while for CWP2 there were 41 aa substitutions, and also one aa deletion. Genetic diversity within assemblage B was larger; nucleotide identity 92.0% for α1-g and 94.3% for CWP2, than within assemblage A (nucleotide identity 99.0% for α1-g and 99.7% for CWP2). For CWP2, the diversity on both nucleotide and protein level was higher in the C-terminal end. Predicted antigenic epitopes were not affected for α1-g, but partially for CWP2. Despite genetic diversity in α1-g, we found aa sequence, characteristics, and antigenicity to be well preserved. CWP2 showed more aa variance and potential antigenic differences. Several CWP2 antigens might be necessary in a future Giardia vaccine to provide cross protection against both Giardia assemblages infecting humans.
Genetic diversity and classification of Tibetan yak populations based on the mtDNA COIII gene.

PubMed

Song, Q Q; Chai, Z X; Xin, J W; Zhao, S J; Ji, Q M; Zhang, C F; Ma, Z J; Zhong, J C

2015-03-13

To determine the level of genetic diversity and phylogenetic relationships among Tibetan yak populations, the mitochondrial DNA cytochrome c oxidase subunit 3 (COIII) genes of 378 yak individuals from 16 populations were analyzed in this study. The results showed that the length of cytochrome c oxidase subunit 3 gene sequences was 781 bp, with nucleotide frequencies of 29.2, 29.4, 26.1, and 15.2% for T, C, A, and G, respectively. A total of 26 haplotypes were identified, with 69 polymorphic sites, including 11 parsimony-informative sites and 58 single-nucleotide polymorphism sites. No deletions/insertions were found in sequence comparison, indicating that nucleotide mutation types were transitions and transversions. Haplotype and nucleotide diversities were 0.562 and 0.00138, respectively, indicating a high level of genetic diversity in Tibetan yak populations. Phylogenetic relationship analysis indicated that Tibetan yak populations are divided into 2 groups.
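Haplotype diversity, as reported alongside nucleotide diversity in this abstract, is commonly estimated with Nei's unbiased formula h = n/(n-1) * (1 - sum of squared haplotype frequencies), where n is the sample size. The sketch below is a minimal Python version of that formula. The function name `haplotype_diversity` and the labels are hypothetical, and the abstract does not state which estimator or software the authors used.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a sample of haplotype labels.

    haplotypes: one hashable label (e.g. a sequence string) per individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Hypothetical sample: four individuals carrying three distinct haplotypes.
    print(haplotype_diversity(["A", "A", "B", "C"]))
```

The value can be read as the probability that two haplotypes drawn at random without replacement from the sample are different.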
SNPGenie: estimating evolutionary parameters to detect natural selection using pooled next-generation sequencing data.

PubMed

Nelson, Chase W; Moncla, Louise H; Hughes, Austin L

2015-11-15

New applications of next-generation sequencing technologies use pools of DNA from multiple individuals to estimate population genetic parameters. However, no publicly available tools exist to analyse single-nucleotide polymorphism (SNP) calling results directly for evolutionary parameters important in detecting natural selection, including nucleotide diversity and gene diversity. We have developed SNPGenie to fill this gap. The user submits a FASTA reference sequence(s), a Gene Transfer Format (.GTF) file with CDS information and a SNP report(s) in an increasing selection of formats. The program estimates nucleotide diversity, distance from the reference and gene diversity. Sites are flagged for multiple overlapping reading frames, and are categorized by polymorphism type: nonsynonymous, synonymous, or ambiguous. The results allow single nucleotide, single codon, sliding window, whole gene and whole genome/population analyses that aid in the detection of positive and purifying natural selection in the source population. SNPGenie version 1.2 is a Perl program with no additional dependencies. It is free, open-source, and available for download at https://github.com/hugheslab/snpgenie. Contact: nelsoncw@email.sc.edu or austin@biol.sc.edu. Supplementary data are available at Bioinformatics online.
Genomic diversity of the human intestinal parasite Entamoeba histolytica

PubMed Central

2012-01-01

Background: Entamoeba histolytica is a significant cause of disease worldwide. However, little is known about the genetic diversity of the parasite. We re-sequenced the genomes of ten laboratory cultured lines of the eukaryotic pathogen Entamoeba histolytica in order to develop a picture of genetic diversity across the genome. Results: The extreme nucleotide composition bias and repetitiveness of the E. histolytica genome provide a challenge for short-read mapping, yet we were able to define putative single nucleotide polymorphisms in a large portion of the genome. The results suggest a rather low level of single nucleotide diversity, although genes and gene families with putative roles in virulence are among the more polymorphic genes. We did observe large differences in coverage depth among genes, indicating differences in gene copy number between genomes. We found evidence indicating that recombination has occurred in the history of the sequenced genomes, suggesting that E. histolytica may reproduce sexually. Conclusions: E. histolytica displays a relatively low level of nucleotide diversity across its genome. However, large differences in gene family content and gene copy number are seen among the sequenced genomes. The pattern of polymorphism indicates that E. histolytica reproduces sexually, or has done so in the past, which has previously been suggested but not proven. PMID:22630046
Ancient diversity and geographical sub-structuring in African buffalo Theileria parva populations revealed through metagenetic analysis of antigen-encoding loci.

PubMed

Hemmink, Johanneke D; Sitt, Tatjana; Pelle, Roger; de Klerk-Lorist, Lin-Mari; Shiels, Brian; Toye, Philip G; Morrison, W Ivan; Weir, William

2018-03-01

An infection and treatment protocol involving infection with a mixture of three parasite isolates and simultaneous treatment with oxytetracycline is currently used to vaccinate cattle against Theileria parva. While vaccination results in high levels of protection in some regions, little or no protection is observed in areas where animals are challenged predominantly by parasites of buffalo origin. A previous study involving sequencing of two antigen-encoding genes from a series of parasite isolates indicated that this is associated with greater antigenic diversity in buffalo-derived T. parva. The current study set out to extend these analyses by applying high-throughput sequencing to ex vivo samples from naturally infected buffalo to determine the extent of diversity in a set of antigen-encoding genes. Samples from two populations of buffalo, one in Kenya and the other in South Africa, were examined to investigate the effect of geographical distance on the nature of sequence diversity. The results revealed a number of significant findings. First, there was a variable degree of nucleotide sequence diversity in all gene segments examined, with the percentage of polymorphic nucleotides ranging from 10% to 69%. Second, large numbers of allelic variants of each gene were found in individual animals, indicating multiple infection events. Third, despite the observed diversity in nucleotide sequences, several of the gene products had highly conserved amino acid sequences, and thus represent potential candidates for vaccine development. Fourth, although compelling evidence for population differentiation between the Kenyan and South African T. parva parasites was identified, analysis of molecular variance for each gene revealed that the majority of the underlying nucleotide sequence polymorphism was common to both areas, indicating that much of this aspect of genetic variation in the parasite population arose prior to geographic separation.
Analysis of genetic diversity using SNP markers in oat

USDA-ARS's Scientific Manuscript database

A large-scale single nucleotide polymorphism (SNP) discovery was carried out in cultivated oat using Roche 454 sequencing methods. DNA sequences were generated from cDNAs originating from a panel of 20 diverse oat cultivars, and from Diversity Array Technology (DArT) genomic complexity reductions fr...
Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads

PubMed Central

Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse; Hansen, Thomas Arn; Kjartansdóttir, Kristín Rós; Guldberg Frøslev, Tobias; Snogdal Boutrup, Torsten; Nielsen, Lars Peter; Willerslev, Eske; Hansen, Anders J.

2015-01-01

From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads are derived. First, we showed by simulations that we can robustly infer the level of genetic diversity from short sequence reads. Second, we find that the measures of nucleotide diversity inferred from our retroviral sequences significantly exceed the level observed from Human Immunodeficiency Virus infections, prompting us to conclude that the novel retroviruses are both of endogenous origin. Through further simulations, we rule out the possibility that the observed elevated levels of nucleotide diversity are the result of co-infection with two closely related exogenous retroviruses. PMID:26493184
Single nucleotide polymorphisms generated by genotyping by sequencing to characterize genome-wide diversity, linkage disequilibrium, and selective sweeps in cultivated watermelon

USDA-ARS's Scientific Manuscript database

Large datasets containing single nucleotide polymorphisms (SNPs) are used to analyze genome-wide diversity in a robust collection of cultivars from representative accessions, across the world. The extent of linkage disequilibrium (LD) within a population determines the number of markers required fo...
Expansion of the Preimmune Antibody Repertoire by Junctional Diversity in Bos taurus

PubMed Central

Liljavirta, Jenni; Niku, Mikael; Pessa-Morikawa, Tiina; Ekman, Anna; Iivanainen, Antti

2014-01-01

Cattle have a limited range of immunoglobulin genes which are further diversified by antigen independent somatic hypermutation in fetuses. Junctional diversity generated during somatic recombination contributes to antibody diversity but its relative significance has not been comprehensively studied. We have investigated the importance of terminal deoxynucleotidyl transferase (TdT) -mediated junctional diversity to the bovine immunoglobulin repertoire. We also searched for new bovine heavy chain diversity (IGHD) genes as the information of the germline sequences is essential to define the junctional boundaries between gene segments. New heavy chain variable genes (IGHV) were explored to address the gene usage in the fetal recombinations. Our bioinformatics search revealed five new IGHD genes, which included the longest IGHD reported so far, 154 bp. By genomic sequencing we found 26 new IGHV sequences that represent potentially new IGHV genes or allelic variants. Sequence analysis of immunoglobulin heavy chain cDNA libraries of fetal bone marrow, ileum and spleen showed 0 to 36 nontemplated N-nucleotide additions between variable, diversity and joining genes. A maximum of 8 N nucleotides were also identified in the light chains. The junctional base profile was biased towards A and T nucleotide additions (64% in heavy chain VD, 52% in heavy chain DJ and 61% in light chain VJ junctions) in contrast to the high G/C content which is usually observed in mice. Sequence analysis also revealed extensive exonuclease activity, providing additional diversity. B-lymphocyte specific TdT expression was detected in bovine fetal bone marrow by reverse transcription-qPCR and immunofluorescence. These results suggest that TdT-mediated junctional diversity and exonuclease activity contribute significantly to the size of the cattle preimmune antibody repertoire already in the fetal period. PMID:24926997
Global sequence diversity of the lactate dehydrogenase gene in Plasmodium falciparum.

PubMed

Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Harnyuttanakorn, Pongchai

2018-01-09

Antigen-detecting rapid diagnostic tests (RDTs) have been recommended by the World Health Organization for use in remote areas to improve malaria case management. Lactate dehydrogenase (LDH) of Plasmodium falciparum is one of the main parasite antigens employed by various commercial RDTs. It has been hypothesized that the poor detection of LDH-based RDTs is attributed in part to the sequence diversity of the gene. To test this, the present study aimed to investigate the genetic diversity of the P. falciparum ldh gene in Thailand and to construct the map of LDH sequence diversity in P. falciparum populations worldwide. The ldh gene was sequenced for 50 P. falciparum isolates in Thailand and compared with hundreds of sequences from P. falciparum populations worldwide. Several indices of molecular variation were calculated, including the proportion of polymorphic sites, the average nucleotide diversity index (π), and the haplotype diversity index (H). Tests of positive selection and neutrality tests were performed to determine signatures of natural selection on the gene. Mean genetic distance within and between species of Plasmodium ldh was analysed to infer evolutionary relationships. Nucleotide sequences of P. falciparum ldh could be classified into 9 alleles, encoding 5 isoforms of LDH. L1a was the most common allelic type and was distributed in P. falciparum populations worldwide. Plasmodium falciparum ldh sequences were highly conserved, with haplotype and nucleotide diversity values of 0.203 and 0.0004, respectively. The extremely low genetic diversity was maintained by purifying selection, likely due to functional constraints. Phylogenetic analysis inferred the close genetic relationship of P. falciparum to malaria parasites of great apes, rather than to other human malaria parasites. This study revealed the global genetic variation of the ldh gene in P. falciparum, providing knowledge for improving detection of LDH-based RDTs and supporting the candidacy of LDH as a therapeutic drug target.
Error correction and diversity analysis of population mixtures determined by NGS

PubMed Central

Burroughs, Nigel J.; Evans, David J.; Ryabov, Eugene V.

2014-01-01

The impetus for this work was the need to analyse nucleotide diversity in a viral mix taken from honeybees. The paper has two findings. First, a method for correction of next generation sequencing error in the distribution of nucleotides at a site is developed. Second, a package of methods for assessment of nucleotide diversity is assembled. The error correction method is statistically based and works at the level of the nucleotide distribution rather than the level of individual nucleotides. The method relies on an error model and a sample of known viral genotypes that is used for model calibration. A compendium of existing and new diversity analysis tools is also presented, allowing hypotheses about diversity and mean diversity to be tested and associated confidence intervals to be calculated. The methods are illustrated using honeybee viral samples. Software in both Excel and Matlab and a guide are available at http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/, the Warwick University Systems Biology Centre software download site. PMID:25405074
An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

USDA-ARS's Scientific Manuscript database

Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
Simian immunodeficiency viruses from African green monkeys display unusual genetic diversity.

PubMed Central

Johnson, P R; Fomsgaard, A; Allan, J; Gravell, M; London, W T; Olmsted, R A; Hirsch, V M

1990-01-01

African green monkeys are asymptomatic carriers of simian immunodeficiency viruses (SIV), commonly called SIVagm. As many as 50% of African green monkeys in the wild may be SIV seropositive. This high seroprevalence rate and the potential for genetic variation of lentiviruses suggested to us that African green monkeys may harbor widely differing genotypes of SIVagm. To investigate this hypothesis, we determined the entire nucleotide sequence of an infectious proviral molecular clone of SIVagm (155-4) and partial sequences (long terminal repeat and Gag) of three other distinct SIVagm isolates (90, gri-1, and ver-1). Comparisons among the SIVagm isolates revealed extreme diversity at the nucleotide and amino acid levels. Long terminal repeat nucleotide sequences varied up to 35% and Gag protein sequences varied up to 30%. The variability among SIVagm isolates exceeded the variability among any other group of primate lentiviruses. Our data suggest that SIVagm has been in the African green monkey population for a long time and may be the oldest primate lentivirus group in existence. PMID:2304139
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Genetic diversity based on 28S rDNA sequences among populations of Culex quinquefasciatus collected at different locations in Tamil Nadu, India.

PubMed

Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S

2015-09-01

The aim of the present study was to determine whether genetic variability exists among populations of Culex quinquefasciatus, which would be a valuable tool in the management of mosquito control programmes. In the present study, populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of the D2 region of 28S rDNA on the predicted secondary structures in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus populations collected at different locations of Tamil Nadu, India. This complexity in genetic diversity in a single mosquito population collected at different locations is considered an important issue towards their influence and nature of vector potential of these mosquitoes.
Dimeric PROP1 binding to diverse palindromic TAAT sequences promotes its transcriptional activity.

PubMed

Nakayama, Michie; Kato, Takako; Susa, Takao; Sano, Akiko; Kitahara, Kousuke; Kato, Yukio

2009-08-13

Mutations in the Prop1 gene are responsible for murine Ames dwarfism and human combined pituitary hormone deficiency with hypogonadism. Recently, we reported that PROP1 is a possible transcription factor for gonadotropin subunit genes through plural cis-acting sites composed of AT-rich sequences containing a TAAT motif which differs from its consensus binding sequence known as PRDQ9 (TAATTGAATTA). This study aimed to verify the binding specificity and sequence of PROP1 by applying the method of SELEX (Systematic Evolution of Ligands by EXponential enrichment), EMSA (electrophoretic mobility shift assay) and transient transfection assay. SELEX, after 5, 7 and 9 generations of selection using a random sequence library, showed that nucleotides containing one or two TAAT motifs were accumulated and accounted for 98.5% at the 9th generation. Aligned sequences and EMSA demonstrated that PROP1 binds preferentially to 11 nucleotides composed of an inverted TAAT motif separated by 3 nucleotides with variation in the half site of palindromic TAAT motifs and with preferential requirement of T at the nucleotide number 5 immediately 3' to a TAAT motif. Transient transfection assay demonstrated first that dimeric binding of PROP1 to an inverted TAAT motif and its cognates resulted in transcriptional activation, whereas monomeric binding of PROP1 to a single TAAT motif and an inverted ATTA motif did not mediate activation. Thus, this study demonstrated that dimeric binding of PROP1 is able to recognize diverse palindromic TAAT sequences separated by 3 nucleotides and to exhibit its transcriptional activity.
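The preferred site described in this abstract, an 11-nucleotide element made of an inverted TAAT motif separated by 3 nucleotides, can be written as the pattern TAAT-NNN-ATTA, which the PRDQ9 sequence TAATTGAATTA fits. The sketch below is a minimal Python scan for that strict pattern. The function name `find_inverted_taat` and the test sequence are hypothetical, and the strict pattern does not model the half-site variation or the position-5 T preference that the abstract also reports.

```python
import re

# TAAT, any three bases, then ATTA (11 nt in total). The lookahead lets
# overlapping sites be reported, since finditer otherwise skips overlaps.
INVERTED_TAAT = re.compile(r"(?=(TAAT[ACGT]{3}ATTA))")


def find_inverted_taat(seq):
    """Return (start_index, site) for each strict TAAT-N3-ATTA site in seq."""
    return [(m.start(), m.group(1)) for m in INVERTED_TAAT.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequence embedding the PRDQ9 consensus from the abstract.
    for start, site in find_inverted_taat("ggTAATTGAATTAcc"):
        print(start, site)
```

Start positions are 0-based. Scanning the reverse complement is unnecessary for this particular pattern because TAAT-N3-ATTA is its own reverse complement at the fixed positions.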
Unraveling Haplotype Diversity of the Apical Membrane Antigen-1 Gene in Plasmodium falciparum Populations in Thailand

PubMed Central

Lumkul, Lalita; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Harnyuttanakorn, Pongchai; Pattaradilokrat, Sittiporn

2018-01-01

Development of an effective vaccine is critically needed for the prevention of malaria. One of the key antigens for malaria vaccines is the apical membrane antigen 1 (AMA-1) of the human malaria parasite Plasmodium falciparum, the surface protein for erythrocyte invasion of the parasite. The gene encoding AMA-1 has been sequenced from populations of P. falciparum worldwide, but the haplotype diversity of the gene in P. falciparum populations in the Greater Mekong Subregion (GMS), including Thailand, remains to be characterized. In the present study, the AMA-1 gene was PCR amplified and sequenced from the genomic DNA of 65 P. falciparum isolates from 5 endemic areas in Thailand. The nearly full-length 1,848 nucleotide sequence of AMA-1 was subjected to molecular analyses, including nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity and neutrality tests. Phylogenetic analysis and pairwise population differentiation (Fst indices) were performed to infer the population structure. The analyses identified 60 single nucleotide polymorphic loci, predominately located in domain I of AMA-1. A total of 31 unique AMA-1 haplotypes were identified, which included 11 novel ones. The phylogenetic tree of the AMA-1 haplotypes revealed multiple clades of AMA-1, each of which contained parasites of multiple geographical origins, consistent with the Fst indices indicating genetic homogeneity or gene flow among geographically distinct populations of P. falciparum in Thailand’s borders with Myanmar, Laos and Cambodia. In summary, the study revealed novel haplotypes and population structure needed for the further advancement of AMA-1-based malaria vaccines in the GMS. PMID:29742870
De novo assembly of mitochondrial genomes provides insights into genetic diversity and molecular evolution in wild boars and domestic pigs.

PubMed

Ni, Pan; Bhuiyan, Ali Akbar; Chen, Jian-Hai; Li, Jingjin; Zhang, Cheng; Zhao, Shuhong; Du, Xiaoyong; Li, Hua; Yu, Hui; Liu, Xiangdong; Li, Kui

2018-06-01

To date, the scarcity of publicly available complete mitochondrial sequences for European wild pigs hampers a deeper understanding of the genetic changes following domestication. Here, we have assembled 26 de novo mtDNA sequences of European wild boars from next generation sequencing (NGS) data and downloaded 174 complete mtDNA sequences to assess the genetic relationship, nucleotide diversity, and selection. The Bayesian consensus tree reveals the clear divergence between the European and Asian clades and a very small portion (10 out of 200 samples) of maternal introgression. The overall nucleotide diversity of the mtDNA sequences has been reduced following domestication. Interestingly, the selection efficiencies in both European and Asian domestic pigs are reduced, probably caused by changes in both selection constraints and maternal population size following domestication. This study suggests that de novo assembled mitogenomes can be a great boon to uncover the genetic turnover following domestication. Further investigation is warranted to include more samples from the ever-increasing amounts of NGS data to help us to better understand the process of domestication.
High levels of MHC class II allelic diversity in lake trout from Lake Superior

USGS Publications Warehouse

Dorschner, M.O.; Duris, T.; Bronte, C.R.; Burnham-Curtis, M. K.; Phillips, R.B.

2000-01-01

Sequence variation in a 216 bp portion of the major histocompatibility complex (MHC) II B1 domain was examined in 74 individual lake trout (Salvelinus namaycush) from different locations in Lake Superior. Forty-three alleles were obtained which encoded 71-72 amino acids of the mature protein. These sequences were compared with previous data obtained from five Pacific salmon species and Atlantic salmon using the same primers. Although all of the lake trout alleles clustered together in the neighbor-joining analysis of amino acid sequences, one amino acid allelic lineage was shared with Atlantic salmon (Salmo salar), a species in another genus which probably diverged from Salvelinus more than 10-20 million years ago. As shown previously in other salmonids, the level of nonsynonymous nucleotide substitution (d(N)) exceeded the level of synonymous substitution (d(S)). The level of nucleotide diversity at the MHC class II B1 locus was considerably higher in lake trout than in the Pacific salmon (genus Oncorhynchus). These results are consistent with the hypothesis that lake trout colonized Lake Superior from more than one refuge following the Wisconsin glaciation. Recent population bottlenecks may have reduced nucleotide diversity in Pacific salmon populations.
Assessment of genetic diversity among four orchids based on ddRAD sequencing data for conservation purposes.

PubMed

Roy, Subhas Chandra; Moitra, Kaushik; De Sarker, Dilip

2017-01-01

Genetic diversity was assessed in four orchid species using NGS-based ddRAD sequencing data. The assembled nucleotide sequences (fastq) were deposited in the SRA archive of the NCBI database under accession numbers SRP063543 for Dendrobium, SRP065790 for Geodorum, SRP072201 for Cymbidium and SRP072378 for Rhynchostylis. The total base pair read was 1.1 Mbp for Dendrobium sp., 553.3 Kbp for Geodorum sp., 1.6 Gbp for Cymbidium, and 1.4 Gbp for Rhynchostylis. Average GC content was 43.9% in Geodorum, 43.7% in Dendrobium, 41.2% in Cymbidium and 42.3% in Rhynchostylis. Four partial gene sequences were used in the DnaSP5 program to determine nucleotide diversity and phylogenetic relationships (Ycf2 gene of Dendrobium, matK gene of Geodorum, psbD gene of Cymbidium and Ycf2 gene of Rhynchostylis). Nucleotide diversity per site, Pi (π), was 0.10560 in Dendrobium, 0.03586 in Geodorum, 0.01364 in Cymbidium and 0.011344 in Rhynchostylis. Neutrality test statistics were negative in all four orchid species (Tajima's D value -2.17959 in Dendrobium, -2.01655 in Geodorum, -2.12362 in Rhynchostylis and -1.54222 in Cymbidium), indicating purifying selection. The results for these gene sequences (matK, Ycf2 and psbD) indicate that they did not evolve neutrally and suggest that selection might have played a role in the evolution of these genes in these four groups of orchids. Phylogenetic relationships were analyzed by reconstructing a dendrogram based on the matK, psbD and Ycf2 gene sequences using the maximum likelihood method in the MEGA6 program.
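The Tajima's D values quoted in the abstract above come from DnaSP. As a rough illustration of what the statistic measures, the following Python sketch implements the standard Tajima (1989) formula, which compares the mean number of pairwise differences with the number of segregating sites. The function name and toy alignments are hypothetical, and the sketch assumes gap-free aligned sequences; it is not a reproduction of the DnaSP procedure.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned sequences; returns None if nothing segregates."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero for n = 3)")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) alignment columns.
    s_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s_sites == 0:
        return None

    # k: mean number of pairwise nucleotide differences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (k - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


# Toy alignments: only rare variants versus only intermediate-frequency variants.
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
d_balanced = tajimas_d(["AAAA", "AAAA", "TTTT", "TTTT"])
```

An excess of rare variants gives a negative D, as in the four orchid data sets, while an excess of intermediate-frequency variants gives a positive D. Negative values are compatible with purifying selection but also with population expansion, so the sign alone does not identify the cause.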
Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

PubMed Central

2010-01-01

Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome. PMID:21156062
Genetic diversity of Plasmodium vivax revealed by the merozoite surface protein-1 icb5-6 fragment.

PubMed

Ruan, Wei; Zhang, Ling-Ling; Feng, Yan; Zhang, Xuan; Chen, Hua-Liang; Lu, Qiao-Yi; Yao, Li-Nong; Hu, Wei

2017-06-05

Plasmodium vivax remains a potential cause of morbidity and mortality for people living in its endemic areas. Understanding the genetic diversity of P. vivax from different regions is valuable for studying population dynamics and tracing the origins of parasites. The PvMSP-1 gene is highly polymorphic and has been used as a marker in many P. vivax population studies. The aim of this study was to investigate the genetic diversity of the PvMSP-1 gene icb5-6 fragment and to provide more genetic polymorphism data for further studies on P. vivax population structure and tracking of the origin of clinical cases. Nested PCR and sequencing of the PvMSP-1 icb5-6 marker were performed to obtain the nucleotide sequences of 95 P. vivax isolates collected from Zhejiang province, China. To investigate the genetic diversity of PvMSP-1, the 95 nucleotide sequences of the PvMSP-1 icb5-6 fragment were genotyped and analyzed using DnaSP v5 and MEGA software. The 95 P. vivax isolates collected from Zhejiang province were either indigenous cases or imported cases from different regions around the world. A total of 95 sequences ranging from 390 to 460 bp were obtained. The 95 sequences were genotyped into four allele-types (Sal I, Belem, R-III and R-IV) and 17 unique haplotypes. R-III and Sal I were the predominant allele-types. The haplotype diversity (Hd) and nucleotide diversity (Pi) were estimated to be 0.729 and 0.062, indicating that the PvMSP-1 icb5-6 fragment had the highest level of polymorphism due to frequent recombination processes and single nucleotide polymorphism. The values of dN/dS and Tajima's D both suggested neutral selection for the PvMSP-1 icb5-6 fragment. In addition, a rare recombinant style of R-IV type was identified. This study presented high genetic diversity in the PvMSP-1 marker among P. vivax strains from around the world. The genetic data is valuable for expanding the polymorphism information on P. vivax, which could be helpful for further study on population dynamics and tracking the origin of P. vivax.
The primary structure of the thymidine kinase gene of fish lymphocystis disease virus.

PubMed

Schnitzler, P; Handermann, M; Szépe, O; Darai, G

1991-06-01

The DNA nucleotide sequence of the thymidine kinase (TK) gene of fish lymphocystis disease virus (FLDV), which has been localized between the coordinates 0.678 and 0.688 of the viral genome, was determined. The analysis of the DNA nucleotide sequence located between the recognition sites of HindIII (0.669 map unit; nucleotide position 1) and AccI (nucleotide position 2032) revealed the presence of an open reading frame of 954 bp on the lower strand of this region between nucleotide positions 1868 (ATG) and 915 (TAA). It encodes a protein of 318 amino acid residues. The evolutionary relationships of the TK gene of FLDV to the other known TK genes were investigated using the method of progressive sequence alignment. These analyses revealed a high degree of diversity between the protein sequence of FLDV TK gene and the amino acid composition of other TKs tested. However, significant conservations were detected at several regions of amino acid residues of the FLDV TK protein when compared to the amino acid sequence of TKs of African swine fever virus, fowlpox virus, shope fibroma virus, and vaccinia virus and to the amino acid sequences of the cellular cytoplasmic TK of chicken, mouse, and man.
Isolation of a full-length CC-NBS-LRR resistance gene analog candidate from sugar pine showing low nucleotide diversity.

Treesearch

K.D. Jermstad; L.A. Sheppard; B.B. Kinloch; A. Delfino-Mix; E.S. Ersoz; K.V. Krutovsky; D.B Neale

2006-01-01

The nucleotide-binding-site and leucine-rich-repeat (NBS-LRR) class of R proteins is abundant and widely distributed in plants. By using degenerate primers designed on the NBS domain in lettuce, we amplified sequences in sugar pine that shared sequence identity with many of the NBS-LRR class resistance genes catalogued in GenBank. The polymerase chain reaction products...
Genetic diversity and population structure of cowpea (Vigna unguiculata L. Walp)

USDA-ARS's Scientific Manuscript database

The genetic diversity of cowpea was analyzed and the population structure was estimated in a diverse set of 768 cultivated cowpea genotypes from USDA GRIN cowpea collection, originally collected from 56 countries worldwide. Genotyping by sequencing was used to discover single nucleotide polymorphism...
Genetic Diversity and Phylogenetic Evolution of Tibetan Sheep Based on mtDNA D-Loop Sequences

PubMed Central

Yue, Yaojing; Guo, Xian; Guo, Tingting; Chu, Min; Wang, Fan; Han, Jilong; Feng, Ruilin; Sun, Xiaoping; Niu, Chune; Yang, Bohui; Guo, Jian; Yuan, Chao

2016-01-01

The molecular and population genetic evidence of the phylogenetic status of the Tibetan sheep (Ovis aries) is not well understood, and little is known about this species’ genetic diversity. This knowledge gap is partly due to the difficulty of sample collection. This is the first work to address this question. Here, the genetic diversity and phylogenetic relationship of 636 individual Tibetan sheep from fifteen populations were assessed using 642 complete sequences of the mitochondrial DNA D-loop. Samples were collected from the Qinghai-Tibetan Plateau area in China, and reference data were obtained from the six reference breed sequences available in GenBank. The length of the sequences varied considerably, between 1031 and 1259 bp. The haplotype diversity and nucleotide diversity were 0.992±0.010 and 0.019±0.001, respectively. The average number of nucleotide differences was 19.635. The mean nucleotide composition of the 350 haplotypes was 32.961% A, 29.708% T, 22.892% C, 14.439% G, 62.669% A+T, and 37.331% G+C. Phylogenetic analysis showed that all four previously defined haplogroups (A, B, C, and D) were found in the 636 individuals of the fifteen Tibetan sheep populations but that only the D haplogroup was found in Linzhou sheep. Further, the clustering analysis divided the fifteen Tibetan sheep populations into at least two clusters. The estimation of the demographic parameters from the mismatch analyses showed that haplogroups A, B, and C had at least one demographic expansion in Tibetan sheep. These results contribute to the knowledge of Tibetan sheep populations and will help inform future conservation programs about the Tibetan sheep native to the Qinghai-Tibetan Plateau. PMID:27463976
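The two summary statistics reported in the abstract above, haplotype diversity and nucleotide diversity, can be sketched in a few lines of Python using Nei's standard definitions. The function names and the four toy sequences are hypothetical, and the sketch assumes gap-free aligned sequences of equal length; the D-loop sequences in the study vary in length, so a real analysis would need alignment and explicit gap handling.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)  # identical sequences count as one haplotype
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences (k) over all sequence pairs."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity: pi = k / alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Toy alignment: three haplotypes among four sequences of length 4.
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
hd = haplotype_diversity(sample)   # 5/6
pi = nucleotide_diversity(sample)  # 7/24
```

Because per-site π is the average number of pairwise differences divided by the number of sites compared, the two figures reported in the abstract (19.635 differences and π = 0.019) are related through the number of sites analysed.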
Phylogeny of North American Powassan virus.

PubMed

Ebel, G D; Spielman, A; Telford, S R

2001-07-01

To determine whether Powassan virus (POW) and deer tick virus (DTV) constitute distinct flaviviral populations transmitted by ixodid ticks in North America, we analysed diverse nucleotide sequences from 16 strains of these viruses. Two distinct genetic lineages are evident, which may be defined by geographical and host associations. The nucleotide and amino acid sequences of lineage one (comprising New York and Canadian POW isolates) are highly conserved across time and space, but those of lineage two (comprising isolates from deer ticks and a fox) are more variable. The divergence between lineages is much greater than the variation within either lineage, and lineage two appears to be more diverse genetically than is lineage one. Application of McDonald-Kreitman tests to the sequences of these strains indicates that adaptive evolution of the envelope protein separates lineage one from lineage two. The two POW lineages circulating in North America possess a pattern of genetic diversity suggesting that they comprise distinct subtypes that may perpetuate in separate enzootic cycles.
Genomic distribution and estimation of nucleotide diversity in natural populations: perspectives from the collared flycatcher (Ficedula albicollis) genome.

PubMed

Dutoit, Ludovic; Burri, Reto; Nater, Alexander; Mugal, Carina F; Ellegren, Hans

2017-07-01

Properly estimating genetic diversity in populations of nonmodel species requires a basic understanding of how diversity is distributed across the genome and among individuals. To this end, we analysed whole-genome resequencing data from 20 collared flycatchers (genome size ≈1.1 Gb; 10.13 million single nucleotide polymorphisms detected). Genomewide nucleotide diversity was almost identical among individuals (mean = 0.00394, range = 0.00384-0.00401), but diversity levels varied extensively across the genome (95% confidence interval for 200-kb windows = 0.0013-0.0053). Diversity was related to selective constraint such that in comparison with intergenic DNA, diversity at fourfold degenerate sites was reduced to 85%, 3' UTRs to 82%, 5' UTRs to 70% and nondegenerate sites to 12%. There was a strong positive correlation between diversity and chromosome size, probably driven by a higher density of targets for selection on smaller chromosomes increasing the diversity-reducing effect of linked selection. Simulations exploring the ability of sequence data from a small number of genetic markers to capture the observed diversity clearly demonstrated that diversity estimation from finite sampling of such data is bound to be associated with large confidence intervals. Nevertheless, we show that precision in diversity estimation in large outbred populations benefits from increasing the number of loci rather than the number of individuals. Simulations mimicking RAD sequencing showed that this approach gives accurate estimates of genomewide diversity. Based on the patterns of observed diversity and the performed simulations, we provide broad recommendations for how genetic diversity should be estimated in natural populations. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
High throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.) by combining a re-sequencing approach and SNPlex technology

PubMed Central

Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M

2007-01-01

Background: Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results: In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion: These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442
DNA sequences of Pima (Gossypium barbadense L.) cotton leaf for examining transcriptome diversity and SNP biomarker discovery

USDA-ARS's Scientific Manuscript database

As an initial step to explore the transcriptome genetic diversity and to discover single nucleotide polymorphic (SNP)-biomarkers for marker assisted breeding within Pima (Gossypium barbadense L.) cotton, leaves from 25 day plants of three diverse genotypes were used to develop cDNA libraries. Using ...
Genetic diversity and connectivity in the East African giant mud crab Scylla serrata: Implications for fisheries management.

PubMed

Rumisha, Cyrus; Huyghe, Filip; Rapanoel, Diary; Mascaux, Nemo; Kochzius, Marc

2017-01-01

The giant mud crab Scylla serrata provides an important source of income and food to coastal communities in East Africa. However, increasing demand and exploitation due to the growing coastal population, export trade, and tourism industry are threatening the sustainability of the wild stock of this species. Because effective management requires a clear understanding of the connectivity among populations, this study was conducted to assess the genetic diversity and connectivity in the East African mangrove crab S. serrata. A section of 535 base pairs of the cytochrome oxidase subunit I (COI) gene and eight microsatellite loci were analysed from 230 tissue samples of giant mud crabs collected from Kenya, Tanzania, Mozambique, Madagascar, and South Africa. Microsatellite genetic diversity (He) ranged between 0.56 and 0.6. The COI sequences showed 57 different haplotypes associated with low nucleotide diversity (current nucleotide diversity = 0.29%). In addition, the current nucleotide diversity was lower than the historical nucleotide diversity, indicating overexploitation or historical bottlenecks in the recent history of the studied population. Considering that the coastal population is growing rapidly, East African countries should promote sustainable fishing practices and sustainable use of mangrove resources to protect mud crabs and other marine fauna from the increasing pressure of exploitation. While microsatellite loci did not show significant genetic differentiation (p > 0.05), COI sequences revealed significant genetic divergence between sites on the East coast of Madagascar (ECM) and sites on the West coast of Madagascar, mainland East Africa, as well as the Seychelles. Since East African countries agreed to achieve the Convention on Biological Diversity (CBD) target to protect over 10% of their marine areas by 2020, the observed pattern of connectivity and the measured genetic diversity can serve to provide useful information for designing networks of marine protected areas.
Genetic diversity and molecular evolution of Naga King Chili inferred from internal transcribed spacer sequence of nuclear ribosomal DNA.

PubMed

Kehie, Mechuselie; Kumaria, Suman; Devi, Khumuckcham Sangeeta; Tandon, Pramod

2016-02-01

Sequences of the Internal Transcribed Spacer (ITS1-5.8S-ITS2) of nuclear ribosomal DNA were explored to study the genetic diversity and molecular evolution of Naga King Chili. Our study indicated the occurrence of nucleotide polymorphism and haplotypic diversity in the ITS regions. The present study demonstrated that the variability of ITS1 with respect to nucleotide diversity and sequence polymorphism exceeded that of ITS2. Sequence analysis of the 5.8S gene revealed a highly conserved region in all the accessions of Naga King Chili. However, a strong phylogenetic signal for this species is the distinct 13 bp deletion in the 5.8S gene, which discriminated Naga King Chili from the rest of the Capsicum sp. Neutrality test results implied neutral variation, and the population seems to be evolving at drift-mutation equilibrium and free from directed selection pressure. Furthermore, mismatch analysis showed a multimodal curve indicating demographic equilibrium. Phylogenetic relationships revealed by Median Joining Network (MJN) analysis denoted a clear discrimination of Naga King Chili from its closest sister species (Capsicum chinense and Capsicum frutescens). The absence of a star-like network of haplotypes suggested an ancient population expansion of this chili.
The Impact of Mutation and Gene Conversion on the Local Diversification of Antigen Genes in African Trypanosomes

PubMed Central

Gjini, Erida; Haydon, Daniel T.; Barry, J. David; Cobbold, Christina A.

2012-01-01

Patterns of genetic diversity in parasite antigen gene families hold important information about their potential to generate antigenic variation within and between hosts. The evolution of such gene families is typically driven by gene duplication, followed by point mutation and gene conversion. There is great interest in estimating the rates of these processes from molecular sequences for understanding the evolution of the pathogen and its significance for infection processes. In this study, a series of models are constructed to investigate hypotheses about the nucleotide diversity patterns between closely related gene sequences from the antigen gene archive of the African trypanosome, the protozoan parasite causative of human sleeping sickness in Equatorial Africa. We use a hidden Markov model approach to identify two scales of diversification: clustering of sequence mismatches, a putative indicator of gene conversion events with other lower-identity donor genes in the archive, and at a sparser scale, isolated mismatches, likely arising from independent point mutations. In addition to quantifying the respective probabilities of occurrence of these two processes, our approach yields estimates for the gene conversion tract length distribution and the average diversity contributed locally by conversion events. Model fitting is conducted using a Bayesian framework. We find that diversifying gene conversion events with lower-identity partners occur at least five times less frequently than point mutations on variant surface glycoprotein (VSG) pairs, and the average imported conversion tract is between 14 and 25 nucleotides long. However, because of the high diversity introduced by gene conversion, the two processes have almost equal impact on the per-nucleotide rate of sequence diversification between VSG subfamily members. We are able to disentangle the most likely locations of point mutations and conversions on each aligned gene pair. PMID:22735079
Sequence Variation of the tRNALeu Intron as a Marker for Genetic Diversity and Specificity of Symbiotic Cyanobacteria in Some Lichens

PubMed Central

Paulsrud, Per; Lindblad, Peter

1998-01-01

We examined the genetic diversity of Nostoc symbionts in some lichens by using the tRNALeu (UAA) intron as a genetic marker. The nucleotide sequence was analyzed in the context of the secondary structure of the transcribed intron. Cyanobacterial tRNALeu (UAA) introns were specifically amplified from freshly collected lichen samples without previous DNA extraction. The lichen species used in the present study were Nephroma arcticum, Peltigera aphthosa, P. membranacea, and P. canina. Introns with different sizes around 300 bp were consistently obtained. Multiple clones from single PCRs were screened by using their single-stranded conformational polymorphism pattern, and the nucleotide sequence was determined. No evidence for sample heterogeneity was found. This implies that the symbiont in situ is not a diverse community of cyanobionts but, rather, one Nostoc strain. Furthermore, each lichen thallus contained only one intron type, indicating that each thallus is colonized only once or that there is a high degree of specificity. The same cyanobacterial intron sequence was also found in samples of one lichen species from different localities. In a phylogenetic analysis, the cyanobacterial lichen sequences grouped together with the sequences from two free-living Nostoc strains. The size differences in the intron were due to insertions and deletions in highly variable regions. The sequence data were used in discussions concerning specificity and biology of the lichen symbiosis. It is concluded that the tRNALeu (UAA) intron can be of great value when examining cyanobacterial diversity. PMID:9435083

Population structure and genetic variability within isolates of Grapevine fanleaf virus from a naturally infected vineyard in France: evidence for mixed infection and recombination.

PubMed

Vigne, Emmanuelle; Bergdoll, Marc; Guyader, Sébastien; Fuchs, Marc

2004-08-01

The nematode-borne Grapevine fanleaf virus, from the genus Nepovirus in the family Comoviridae, causes severe degeneration of grapevines in most vineyards worldwide. We characterized 347 isolates from transgenic and conventional grapevines from two vineyard sites in the Champagne region of France for their molecular variant composition. The population structure and genetic diversity were examined in the coat protein gene by IC-RT-PCR-RFLP analysis with EcoRI and StyI, and nucleotide sequencing, respectively. RFLP data suggested that 55 % (191 of 347) of the isolates had a population structure consisting of one predominant variant. Sequencing data of 51 isolates representing the different restrictotypes confirmed the existence of mixed infection with a frequency of 33 % (17 of 51) and showed two major predominant haplotypes representing 71 % (60 of 85) of the sequence variants. Comparative nucleotide diversity among population subsets implied a lack of genetic differentiation according to host (transgenic vs conventional) or field site for most restrictotypes (17 of 18 and 13 of 18) and for haplotypes in most phylogenetic groups (seven of eight and six of eight), respectively. Interestingly, five of the 85 haplotypes sequenced had an intermediate divergence (0.036-0.066) between the lower (0.005-0.028) and upper range (0.083-0.138) of nucleotide variability, suggesting the occurrence of homologous RNA recombination. Sequence alignments clearly indicated a mosaic structure for four of these five variants, for which recombination sites were identified and parental lineages proposed. This is the first in-depth characterization of the population structure and genetic diversity in a nepovirus.
Virulence Gene Sequencing Highlights Similarities and Differences in Sequences in Listeria monocytogenes Serotype 1/2a and 4b Strains of Clinical and Food Origin From 3 Different Geographic Locations.

PubMed

Poimenidou, Sofia V; Dalmasso, Marion; Papadimitriou, Konstantinos; Fox, Edward M; Skandamis, Panagiotis N; Jordan, Kieran

2018-01-01

The prfA-virulence gene cluster (pVGC) is the main pathogenicity island in Listeria monocytogenes, comprising the prfA, plcA, hly, mpl, actA, and plcB genes. In this study, the pVGC of 36 L. monocytogenes isolates with respect to different serotypes (1/2a or 4b), geographical origin (Australia, Greece or Ireland) and isolation source (food-associated or clinical) was characterized. The most conserved genes were prfA and hly, with the lowest nucleotide diversity (π) among all genes (P < 0.05), and the lowest number of alleles, substitutions and non-synonymous substitutions for prfA. Conversely, the most diverse gene was actA, which presented the highest number of alleles (n = 20) and showed the highest nucleotide diversity. Grouping by serotype had a significantly lower π value (P < 0.0001) compared to isolation source or geographical origin, suggesting a distinct and well-defined unit compared to other groupings. Among all tested genes, only hly and mpl were those with lower nucleotide diversity in 1/2a serotype than 4b serotype, reflecting a high within-1/2a serotype divergence compared to 4b serotype. Geographical divergence was noted with respect to the hly gene, where serotype 4b Irish strains were distinct from Greek and Australian strains. Australian strains showed less diversity in plcB and mpl relative to Irish or Greek strains. Notable differences regarding sequence mutations were identified between food-associated and clinical isolates in prfA, actA, and plcB sequences. Overall, these results indicate that virulence genes follow different evolutionary pathways, which are affected by a strain's origin and serotype and may influence virulence and/or epidemiological dominance of certain subgroups.
Genetic diversity among isolates of Autographa californica multiple nucleopolyhedrovirus

USDA-ARS's Scientific Manuscript database

Our knowledge of genetic variation at the nucleotide sequence level of Autographa californica multiple nucleopolyhedrovirus (AcMNPV; Baculoviridae: Alphabaculovirus) derives from complete genome sequences of the C6 clonal isolate of AcMNPV and the R1 and CL3 clonal isolates of AcMNPV variants Rachip...
HLA DNA Sequence Variation among Human Populations: Molecular Signatures of Demographic and Selective Events

PubMed Central

Buhler, Stéphane; Sanchez-Mazas, Alicia

2011-01-01

Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene. 
We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used to explore the genetic history of human populations, and that their analysis allows a more thorough investigation of human MHC molecular evolution. PMID:21408106
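Tajima's test, mentioned in the abstract above, compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. A minimal sketch of the standard statistic follows, computed from summary values; the function name and argument names are hypothetical, and the input numbers are illustrative rather than taken from the HLA study.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences k.

    Uses the standard constants a1, a2, b1, b2, c1, c2, e1, e2.
    Assumption: n >= 4, because the variance term is zero for smaller samples.
    """
    if n < 4:
        raise ValueError("n must be at least 4")
    if seg_sites <= 0:
        raise ValueError("D is undefined without segregating sites")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Illustrative values only: 10 sequences, 16 segregating sites, k = 3.888889.
d = tajimas_d(10, 16, 3.888889)
print(f"Tajima's D = {d:.3f}")
```

A negative D indicates an excess of rare variants and a positive D an excess of intermediate-frequency variants; assessing significance needs critical values or simulations that this sketch does not provide.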
Comparative genomic sequence analysis of novel Helicoverpa armigera nucleopolyhedrovirus (NPV) isolated from Kenya and three other previously sequenced Helicoverpa spp. NPVs.

PubMed

Ogembo, Javier Gordon; Caoili, Barbara L; Shikata, Masamitsu; Chaeychomsri, Sudawan; Kobayashi, Michihiro; Ikeda, Motoko

2009-10-01

A newly cloned Helicoverpa armigera nucleopolyhedrovirus (HearNPV) from Kenya, HearNPV-NNg1, has a higher insecticidal activity than HearNPV-G4, which also exhibits lower insecticidal activity than HearNPV-C1. In the search for genes and/or nucleotide sequences that might be involved in the observed virulence differences among Helicoverpa spp. NPVs, the entire genome of NNg1 was sequenced and compared with previously sequenced genomes of G4, C1 and Helicoverpa zea single-nucleocapsid NPV (Hz). The NNg1 genome was 132,425 bp in length, with a total of 143 putative open reading frames (ORFs), and shared high levels of overall amino acid and nucleotide sequence identities with G4, C1 and Hz. Three NNg1 ORFs, ORF5, ORF100 and ORF124, which were shared with C1, were absent in G4 and Hz, while NNg1 and C1 were missing a homologue of G4/Hz ORF5. Another three ORFs, ORF60 (bro-b), ORF119 and ORF120, and one direct repeat sequence (dr) were unique to NNg1. Relative to the overall nucleotide sequence identity, lower sequence identities were observed between NNg1 hrs and the homologous hrs in the other three Helicoverpa spp. NPVs, despite containing the same number of hrs located at essentially the same positions on the genomes. Differences were also observed between NNg1 and each of the other three Helicoverpa spp. NPVs in the diversity of bro genes encoded on the genomes. These results indicate several putative genes and nucleotide sequences that may be responsible for the virulence differences observed among Helicoverpa spp., yet the specific genes and/or nucleotide sequences responsible have not been identified.
Comparative sequence analysis of domain I of Plasmodium falciparum apical membrane antigen 1 from Saudi Arabia and worldwide isolates.

PubMed

Al-Qahtani, Ahmed A; Abdel-Muhsin, Abdel-Muhsin A; Dajem, Saad M Bin; AlSheikh, Adel Ali H; Bohol, Marie Fe F; Al-Ahdal, Mohammed N; Putaporntip, Chaturong; Jongwutiwes, Somchai

2016-04-01

The apical membrane antigen 1 of Plasmodium falciparum (PfAMA1) plays a crucial role in erythrocyte invasion and is a target of protective antibodies. Although domain I of PfAMA1 has been considered a promising vaccine component, extensive sequence diversity in this domain could compromise an effective vaccine design. To explore the extent of sequence diversity in domain I of PfAMA1, P. falciparum-infected blood samples from Saudi Arabia collected between 2007 and 2009 were analyzed and compared with those from worldwide parasite populations. Forty-six haplotypes and a novel codon change (M190V) were found among Saudi Arabian isolates. The haplotype diversity (0.948±0.004) and nucleotide diversity (0.0191±0.0008) were comparable to those from African hyperendemic countries. Positive selection in domain I of PfAMA1 among Saudi Arabian parasite population was observed because nonsynonymous nucleotide substitutions per nonsynonymous site (dN) significantly exceeded synonymous nucleotide substitutions per synonymous site (dS) and Tajima's D and its related statistics significantly deviated from neutrality in the positive direction. Despite a relatively low prevalence of malaria in Saudi Arabia, a minimum of 17 recombination events occurred in domain I. Genetic differentiation was significant between P. falciparum in Saudi Arabia and parasites from other geographic origins. Several shared or closely related haplotypes were found among parasites from different geographic areas, suggesting that vaccine derived from multiple shared epitopes could be effective across endemic countries. Copyright © 2016 Elsevier B.V. All rights reserved.
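Haplotype diversity, reported above alongside nucleotide diversity, is usually computed as Nei's gene diversity: the probability that two haplotypes drawn at random from the sample differ, with a small-sample correction. The sketch below assumes that definition; the function name and the example labels are hypothetical and are not the PfAMA1 data.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared haplotype frequencies).

    `haplotypes` is a list with one label or sequence string per sampled isolate.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: four isolates carrying three distinct haplotypes.
sample = ["H1", "H1", "H2", "H3"]
hd = haplotype_diversity(sample)
print(f"Hd = {hd:.4f}")
```

Values near 1, such as those reported in the abstract, mean that almost every pair of sampled isolates carries different haplotypes.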
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

PubMed

Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

2014-11-01

As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data.
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
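The three figures of merit quoted in the abstract above (accuracy, F-measure and Matthews correlation coefficient) can all be derived from a binary confusion matrix. The following sketch uses the textbook definitions; the function name and the counts are hypothetical, and the abstract does not say how the authors computed or averaged their values.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from confusion-matrix counts.

    F-measure here is F1, the harmonic mean of precision and recall.
    Undefined ratios (zero denominators) are returned as 0.0 by convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


# Hypothetical counts of binding / non-binding nucleotide predictions.
acc, f1, mcc = binary_metrics(tp=50, fp=10, tn=30, fn=10)
print(f"accuracy={acc:.3f}  F-measure={f1:.3f}  MCC={mcc:.3f}")
```

MCC ranges from -1 to 1 and, unlike accuracy, remains informative when binding and non-binding nucleotides are unevenly represented.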
Diversity and duplication of DQB and DRB-like genes of the MHC in baleen whales (suborder: Mysticeti).

PubMed

Baker, C S; Vant, M D; Dalebout, M L; Lento, G M; O'Brien, S J; Yuhki, N

2006-05-01

The molecular diversity and phylogenetic relationships of two class II genes of the baleen whale major histocompatibility complex were investigated and compared to toothed whales and out-groups. Amplification of the DQB exon 2 provided sequences showing high within-species and between-species nucleotide diversity and uninterrupted reading frames consistent with functional class II loci found in related mammals (e.g., ruminants). Cloning of amplified products indicated gene duplication in the humpback whale and triplication in the southern right whale, with average nucleotide diversity of 5.9 and 6.3%, respectively, for alleles of each species. Significantly higher nonsynonymous divergence at sites coding for peptide binding (32% for humpback and 40% for southern right) suggested that these loci were subject to positive (overdominant) selection. A population survey of humpback whales detected 23 alleles, differing by up to 21% of their inferred amino acid sequences. Amplification of the DRB exon 2 resulted in two groups of sequences. One was most similar to the DRB3 of the cow and present in all whales screened to date, including toothed whales. The second was most similar to the DRB2 of the cow and was found only in the bowhead and right whales. Both loci showed low diversity among species and apparent loss of function or altered function including interruption of reading frames. Finally, comparison of inferred protein sequence of the DRB3-like locus suggested convergence with the DQB, perhaps resulting from intergenic conversion or recombination.
Genetic diversity of the merozoite surface protein-3 gene in Plasmodium falciparum populations in Thailand.

PubMed

Pattaradilokrat, Sittiporn; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Siripoon, Napaporn; Harnyuttanakorn, Pongchai

2016-10-21

An effective malaria vaccine is an urgently needed tool to fight against human malaria, the most deadly parasitic disease of humans. One promising candidate is the merozoite surface protein-3 (MSP-3) of Plasmodium falciparum. This antigenic protein, encoded by the merozoite surface protein (msp-3) gene, is polymorphic and classified according to size into the two allelic types of K1 and 3D7. A recent study revealed that both the K1 and 3D7 alleles co-circulated within P. falciparum populations in Thailand, but the extent of the sequence diversity and variation within each allelic type remains largely unknown. The msp-3 gene was sequenced from 59 P. falciparum samples collected from five endemic areas (Mae Hong Son, Kanchanaburi, Ranong, Trat and Ubon Ratchathani) in Thailand and analysed for nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity. The gene was also subject to population genetic analysis (FST) and neutrality tests (Tajima's D, Fu and Li's D* and Fu and Li's F* tests) to determine any signature of selection. The sequence analyses revealed eight unique DNA haplotypes and seven amino acid sequence variants, with a haplotype and nucleotide diversity of 0.828 and 0.049, respectively. Neutrality tests indicated that the polymorphism detected in the alanine heptad repeat region of MSP-3 was maintained by positive diversifying selection, suggesting its role as a potential target of protective immune responses and supporting its role as a vaccine candidate. Comparison of MSP-3 variants among parasite populations in Thailand, India and Nigeria also inferred a close genetic relationship between P. falciparum populations in Asia. This study revealed the extent of the msp-3 gene diversity in P. falciparum in Thailand, providing the fundamental basis for the better design of future blood stage malaria vaccines against P. falciparum.
Genetic diversity and host specificity varies across three genera of blood parasites in ducks of the Pacific Americas Flyway

USGS Publications Warehouse

Reeves, Andrew B.; Smith, Matthew M.; Meixell, Brandt W.; Fleskes, Joseph P.; Ramey, Andrew M.

2015-01-01

Birds of the order Anseriformes, commonly referred to as waterfowl, are frequently infected by Haemosporidia of the genera Haemoproteus, Plasmodium, and Leucocytozoon via dipteran vectors. We analyzed nucleotide sequences of the Cytochrome b (Cytb) gene from parasites of these genera detected in six species of ducks from Alaska and California, USA to characterize the genetic diversity of Haemosporidia infecting waterfowl at two ends of the Pacific Americas Flyway. In addition, parasite Cytb sequences were compared to those available on a public database to investigate specificity of genetic lineages to hosts of the order Anseriformes. Haplotype and nucleotide diversity of Haemoproteus Cytb sequences was lower than was detected for Plasmodium and Leucocytozoon parasites. Although waterfowl are presumed to be infected by only a single species of Leucocytozoon, L. simondi, diversity indices were highest for haplotypes from this genus and sequences formed five distinct clades separated by genetic distances of 4.9%–7.6%, suggesting potential cryptic speciation. All Haemoproteus and Leucocytozoon haplotypes derived from waterfowl samples formed monophyletic clades in phylogenetic analyses and were unique to the order Anseriformes with few exceptions. In contrast, waterfowl-origin Plasmodium haplotypes were identical or closely related to lineages found in other avian orders. Our results suggest a more generalist strategy for Plasmodium parasites infecting North American waterfowl as compared to those of the genera Haemoproteus and Leucocytozoon.
Genetic Diversity and Host Specificity Varies across Three Genera of Blood Parasites in Ducks of the Pacific Americas Flyway

PubMed Central

Reeves, Andrew B.; Smith, Mathew M.; Meixell, Brandt W.; Fleskes, Joseph P; Ramey, Andrew M.

2015-01-01

Birds of the order Anseriformes, commonly referred to as waterfowl, are frequently infected by Haemosporidia of the genera Haemoproteus, Plasmodium, and Leucocytozoon via dipteran vectors. We analyzed nucleotide sequences of the Cytochrome b (Cytb) gene from parasites of these genera detected in six species of ducks from Alaska and California, USA to characterize the genetic diversity of Haemosporidia infecting waterfowl at two ends of the Pacific Americas Flyway. In addition, parasite Cytb sequences were compared to those available on a public database to investigate specificity of genetic lineages to hosts of the order Anseriformes. Haplotype and nucleotide diversity of Haemoproteus Cytb sequences was lower than was detected for Plasmodium and Leucocytozoon parasites. Although waterfowl are presumed to be infected by only a single species of Leucocytozoon, L. simondi, diversity indices were highest for haplotypes from this genus and sequences formed five distinct clades separated by genetic distances of 4.9%–7.6%, suggesting potential cryptic speciation. All Haemoproteus and Leucocytozoon haplotypes derived from waterfowl samples formed monophyletic clades in phylogenetic analyses and were unique to the order Anseriformes with few exceptions. In contrast, waterfowl-origin Plasmodium haplotypes were identical or closely related to lineages found in other avian orders. Our results suggest a more generalist strategy for Plasmodium parasites infecting North American waterfowl as compared to those of the genera Haemoproteus and Leucocytozoon. PMID:25710468
Complete Genome Sequences of 38 Gordonia sp. Bacteriophages

PubMed Central

Montgomery, Matthew T.; Bonilla, J. Alfred; Dejong, Randall; Garlena, Rebecca A.; Guerrero Bustamante, Carlos; Klyczek, Karen K.; Russell, Daniel A.; Wertz, John T.; Jacobs-Sera, Deborah; Hatfull, Graham F.

2017-01-01

ABSTRACT We report here the genome sequences of 38 newly isolated bacteriophages using Gordonia terrae 3612 (ATCC 25594) and Gordonia neofelifaecis NRRL59395 as bacterial hosts. All of the phages are double-stranded DNA (dsDNA) tail phages with siphoviral morphologies, with genome sizes ranging from 17,118 bp to 93,843 bp and spanning considerable nucleotide sequence diversity. PMID:28057748
Evidence for a Complex Class of Nonadenylated mRNA in Drosophila

PubMed Central

Zimmerman, J. Lynn; Fouts, David L.; Manning, Jerry E.

1980-01-01

The amount, by mass, of poly(A+) mRNA present in the polyribosomes of third-instar larvae of Drosophila melanogaster, and the relative contribution of the poly(A+) mRNA to the sequence complexity of total polysomal RNA, has been determined. Selective removal of poly(A+) mRNA from total polysomal RNA by use of either oligo-dT-cellulose, or poly(U)-sepharose affinity chromatography, revealed that only 0.15% of the mass of the polysomal RNA was present as poly(A+) mRNA. The present study shows that this RNA hybridized at saturation with 3.3% of the single-copy DNA in the Drosophila genome. After correction for asymmetric transcription and reactability of the DNA, 7.4% of the single-copy DNA in the Drosophila genome is represented in larval poly(A+) mRNA. This corresponds to 6.73 × 10⁶ nucleotides of mRNA coding sequences, or approximately 5,384 diverse RNA sequences of average size 1,250 nucleotides. However, total polysomal RNA hybridizes at saturation to 10.9% of the single-copy DNA sequences. After correcting this value for asymmetric transcription and tracer DNA reactability, 24% of the single-copy DNA in Drosophila is represented in total polysomal RNA. This corresponds to 2.18 × 10⁷ nucleotides of RNA coding sequences or 17,440 diverse RNA molecules of size 1,250 nucleotides. This value is 3.2 times greater than that observed for poly(A+) mRNA, and indicates that ≃69% of the polysomal RNA sequence complexity is contributed by nonadenylated RNA. Furthermore, if the number of different structural genes represented in total polysomal RNA is ≃1.7 × 10⁴, then the number of genes expressed in third-instar larvae exceeds the number of chromomeres in Drosophila by about a factor of three. This numerology indicates that the number of chromomeres observed in polytene chromosomes does not reflect the number of structural gene sequences in the Drosophila genome. PMID:6777246
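The arithmetic in the abstract above can be retraced directly from the figures it quotes. The short script below is only a consistency check on those numbers; the variable names are hypothetical, and the implied single-copy genome size is an inference from the quoted percentages, not a value stated in the abstract.

```python
# Figures quoted in the abstract.
poly_a_nt = 6_730_000     # nucleotides of poly(A+) mRNA coding sequence (6.73 x 10^6)
total_nt = 21_800_000     # nucleotides in total polysomal RNA (2.18 x 10^7)
avg_len = 1_250           # assumed average RNA size in nucleotides
poly_a_frac = 0.074       # fraction of single-copy DNA represented in poly(A+) mRNA
total_frac = 0.24         # fraction of single-copy DNA represented in total polysomal RNA

# Number of distinct average-sized sequences in each RNA class.
poly_a_species = poly_a_nt / avg_len
total_species = total_nt / avg_len

# Ratio of complexities and the share contributed by nonadenylated RNA.
ratio = total_species / poly_a_species
nonadenylated_share = 1 - poly_a_species / total_species

# Inferred single-copy complexity implied by each pair of figures (an inference, see lead-in).
implied_from_poly_a = poly_a_nt / poly_a_frac
implied_from_total = total_nt / total_frac

print(f"poly(A+) sequences:       {poly_a_species:,.0f}")
print(f"total polysomal sequences: {total_species:,.0f}")
print(f"complexity ratio:          {ratio:.1f}")
print(f"nonadenylated share:       {nonadenylated_share:.0%}")
print(f"implied single-copy DNA:   {implied_from_poly_a:.3g} vs {implied_from_total:.3g} nt")
```

Both routes to the implied single-copy complexity agree to within about 1%, which suggests the percentages and nucleotide counts in the abstract are mutually consistent.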
Development of single-nucleotide polymorphism markers for Bromus tectorum (Poaceae) from a partially sequenced transcriptome

Treesearch

Keith R. Merrill; Craig E. Coleman; Susan E. Meyer; Elizabeth A. Leger; Katherine A. Collins

2016-01-01

Premise of the study: Bromus tectorum (Poaceae) is an annual grass species that is invasive in many areas of the world but most especially in the U.S. Intermountain West. Single-nucleotide polymorphism (SNP) markers were developed for use in investigating the geospatial and ecological diversity of B. tectorum in the Intermountain West to better understand the...
High-resolution genetic map for understanding the effect of genome-wide recombination rate, selection sweep and linkage disequilibrium on nucleotide diversity in watermelon

USDA-ARS's Scientific Manuscript database

Genotyping by sequencing (GBS) technology was used to identify a set of 9,933 single nucleotide polymorphism (SNP) markers for constructing a high-resolution genetic map of 1,087 cM for watermelon. The genome-wide variation of recombination rate (GWRR) across the map was evaluated and a positive co...
Sequence variation and phylogenetic analysis of envelope glycoprotein of hepatitis G virus.

PubMed

Lim, M Y; Fry, K; Yun, A; Chong, S; Linnen, J; Fung, K; Kim, J P

1997-11-01

A transfusion-transmissible agent provisionally designated hepatitis G virus (HGV) was recently identified. In this study, we examined the variability of the HGV genome by analysing sequences in the putative envelope region from 72 isolates obtained from diverse geographical sources. The 1561 nucleotide sequence of the E1/E2/NS2a region of HGV was determined from 12 isolates, and compared with three published sequences. The most variability was observed in 400 nucleotides at the N terminus of E2. We next analysed this 400 nucleotide envelope variable region (EV) from an additional 60 HGV isolates. This sequence varied considerably among the 75 isolates, with overall identity ranging from 79.3% to 99.5% at the nucleotide level, and from 83.5% to 100% at the amino acid level. However, hypervariable regions were not identified. Phylogenetic analyses indicated that the 75 HGV isolates belong to a single genotype. A single-tier distribution of evolutionary distances was observed among the 15 E1/E2/NS2a sequences and the 75 EV sequences. In contrast, 11 isolates of HCV were analysed and showed a three-tiered distribution, representing genotypes, subtypes, and isolates. The 75 isolates of HGV fell into four clusters on the phylogenetic tree. Tight geographical clustering was observed among the HGV isolates from Japan and Korea.
Estimating the Population Mutation Rate from a de novo Assembled Bactrian Camel Genome and Cross-Species Comparison with Dromedary ESTs

PubMed Central

2014-01-01

The Bactrian camel (Camelus bactrianus) and the dromedary (Camelus dromedarius) are among the last species that have been domesticated around 3000–6000 years ago. During domestication, strong artificial (anthropogenic) selection has shaped the livestock, creating a huge amount of phenotypes and breeds. Hence, domestic animals represent a unique resource to understand the genetic basis of phenotypic variation and adaptation. Similar to its late domestication history, the Bactrian camel is also among the last livestock animals to have its genome sequenced and deciphered. As no genomic data have been available until recently, we generated a de novo assembly by shotgun sequencing of a single male Bactrian camel. We obtained 1.6 Gb genomic sequences, which correspond to more than half of the Bactrian camel’s genome. The aim of this study was to identify heterozygous single-nucleotide polymorphisms (SNPs) and to estimate population parameters and nucleotide diversity based on an individual camel. With an average 6.6-fold coverage, we detected over 116 000 heterozygous SNPs and recorded a genome-wide nucleotide diversity similar to that of other domesticated ungulates. More than 20 000 (85%) dromedary expressed sequence tags successfully aligned to our genomic draft. Our results provide a template for future association studies targeting economically relevant traits and to identify changes underlying the process of camel domestication and environmental adaptation. PMID:23454912
Genome characterization and genetic diversity of sweet potato symptomless virus 1: a mastrevirus with an unusual nonanucleotide

USDA-ARS's Scientific Manuscript database

Complete genomic sequences of nine isolates of sweet potato symptomless virus 1 (SPSMV-1), a virus of genus Mastrevirus in the family Geminiviridae, were determined to be 2,559-2,602 nucleotides from sweet potato accessions from different countries. These isolates shared genomic sequence identities o...
Genetic variation and biological activity of isolates of Lymantria dispar multiple nucleopolyhedrovirus from North America, Europe, and Asia

USDA-ARS's Scientific Manuscript database

Little is known about genetic variation of Lymantria dispar multiple nucleopolyhedrovirus (LdMNPV; Baculoviridae: Alphabaculovirus) at the nucleotide sequence level. To obtain a more comprehensive view of genetic diversity among isolates of LdMNPV, partial sequences of the lef-8 gene were generated...
Mitochondrial genome of the bullet tuna Auxis rochei from Indo-West Pacific collection provides novel genetic information about two subspecies.

PubMed

Li, Mingming; Guo, Liang; Zhang, Heng; Yang, Sen; Chen, Xinghan; Lin, Haoran; Meng, Zining

2016-09-01

Previous morphological studies supported the division of the bullet tuna into two subspecies, Auxis rochei rochei and A. rochei eudorax. As a cosmopolitan species, A. rochei rochei ranges across the Indo-West Pacific and Atlantic oceans, while A. rochei eudorax inhabits the eastern Pacific region. Here, we used the HiSeq next-generation sequencing technique to determine the mitochondrial genome (mitogenome) of A. rochei from an Indo-West Pacific collection, and then compared our data with mitogenomic sequences from the Atlantic and eastern Pacific retrieved from the NCBI database. Results showed that the mitogenomes of A. rochei from the three geographic collections shared the same genes and gene order, similar to typical teleosts. We also observed a low level of nucleotide diversity among these mitogenomic sequences. Interestingly, intra-subspecies nucleotide diversity (Atlantic versus Indo-West) was higher than inter-subspecies nucleotide diversity (Atlantic versus eastern Pacific, Indo-West versus eastern Pacific).

Genetic diversity and population structure analysis of spinach by single-nucleotide polymorphisms identified through genotyping-by-sequencing.

PubMed

Shi, Ainong; Qin, Jun; Mou, Beiquan; Correll, James; Weng, Yuejin; Brenner, David; Feng, Chunda; Motes, Dennis; Yang, Wei; Dong, Lingdi; Bhattarai, Gehendra; Ravelombola, Waltram

2017-01-01

Spinach (Spinacia oleracea L., 2n = 2x = 12) is an economically important vegetable crop worldwide and one of the healthiest vegetables due to its high concentrations of nutrients and minerals. The objective of this research was to conduct genetic diversity and population structure analysis of a collection of world-wide spinach genotypes using single nucleotide polymorphism (SNP) markers. Genotyping by sequencing (GBS) was used to discover SNPs in spinach genotypes. Three sets of spinach genotypes were used: 1) 268 USDA GRIN spinach germplasm accessions originally collected from 30 countries; 2) 45 commercial spinach F1 hybrids from three countries; and 3) 30 US Arkansas spinach cultivars/breeding lines. The results from this study indicated that there was genetic diversity among the 343 spinach genotypes tested. Furthermore, the improved commercial F1 hybrids and the Arkansas cultivars/lines had a population structure different from that of the USDA germplasm. In addition, the genetic diversity and population structures were associated with geographic origin, and germplasm from the US Arkansas breeding program had a unique genetic background. These data could provide genetic diversity information and the molecular markers for selecting parents in spinach breeding programs.
Genetic diversity and population structure analysis of spinach by single-nucleotide polymorphisms identified through genotyping-by-sequencing

PubMed Central

Qin, Jun; Mou, Beiquan; Correll, James; Weng, Yuejin; Brenner, David; Feng, Chunda; Motes, Dennis; Yang, Wei; Dong, Lingdi; Bhattarai, Gehendra; Ravelombola, Waltram

2017-01-01

Spinach (Spinacia oleracea L., 2n = 2x = 12) is an economically important vegetable crop worldwide and one of the healthiest vegetables due to its high concentrations of nutrients and minerals. The objective of this research was to conduct genetic diversity and population structure analysis of a collection of world-wide spinach genotypes using single nucleotide polymorphism (SNP) markers. Genotyping by sequencing (GBS) was used to discover SNPs in spinach genotypes. Three sets of spinach genotypes were used: 1) 268 USDA GRIN spinach germplasm accessions originally collected from 30 countries; 2) 45 commercial spinach F1 hybrids from three countries; and 3) 30 US Arkansas spinach cultivars/breeding lines. The results from this study indicated that there was genetic diversity among the 343 spinach genotypes tested. Furthermore, the improved commercial F1 hybrids and the Arkansas cultivars/lines had a population structure different from that of the USDA germplasm. In addition, the genetic diversity and population structures were associated with geographic origin, and germplasm from the US Arkansas breeding program had a unique genetic background. These data could provide genetic diversity information and the molecular markers for selecting parents in spinach breeding programs. PMID:29190770
Genetic diversity of pneumococcal surface protein A in invasive pneumococcal isolates from Korean children, 1991-2016.

PubMed

Yun, Ki Wook; Choi, Eun Hwa; Lee, Hoan Jong

2017-01-01

Pneumococcal surface protein A (PspA) is an important virulence factor of pneumococci and has been investigated as a primary component of a capsular serotype-independent pneumococcal vaccine. Thus, we sought to determine the genetic diversity of PspA to explore its potential as a vaccine candidate. Among the 190 invasive pneumococcal isolates collected from Korean children between 1991 and 2016, two (1.1%) isolates were found to have no pspA by multiple polymerase chain reactions. The full length pspA genes from 185 pneumococcal isolates were sequenced. The length of pspA varied, ranging from 1,719 to 2,301 base pairs with 55.7-100% nucleotide identity. Based on the sequences of the clade-defining regions, 68.7% and 49.7% were in PspA family 2 and clade 3/family 2, respectively. PspA clade types were correlated with genotypes using multilocus sequence typing and divided into several subclades based on diversity analysis of the N-terminal α-helical regions, which showed nucleotide sequence identities of 45.7-100% and amino acid sequence identities of 23.1-100%. Putative antigenicity plots were also diverse among individual clades and subclades. The differences in antigenicity patterns were concentrated within the N-terminal 120 amino acids. In conclusion, the N-terminal α-helical domain, which is known to be the major immunogenic portion of PspA, is genetically variable and should be further evaluated for antigenic differences and cross-reactivity between various PspA types from pneumococcal isolates.
Genetic diversity and population structure analysis of spinach by single-nucleotide polymorphisms identified through genotyping-by-sequencing

USDA-ARS's Scientific Manuscript database

Spinach (Spinacia oleracea L., 2n=2x=12) is an economically important vegetable crop worldwide and one of the healthiest vegetables due to its high concentrations of nutrients and mineral compounds. The objective of this research is to conduct genetic diversity and population structure analysis of w...
Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

PubMed Central

2013-01-01

Background Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Results Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li’s D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li’s D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. 
FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. Conclusions This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens. PMID:23497218
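As a rough illustration of the per-site FST values discussed in this record, the Python sketch below computes FST for one biallelic site from allele frequencies in two samples, using the heterozygosity-based form FST = (HT - HS) / HT. The function name `site_fst`, the equal weighting of the two samples, and the example frequencies are assumptions made for illustration; the abstract does not state which estimator the authors used.

```python
def site_fst(p1, p2):
    """Heterozygosity-based FST for one biallelic site.

    p1, p2: frequency of the same allele in two samples (hypothetical inputs).
    Returns None when the site is monomorphic overall (FST undefined).
    """
    # Mean within-sample expected heterozygosity (samples weighted equally).
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled sample.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return None
    return (ht - hs) / ht


if __name__ == "__main__":
    # Made-up frequencies: fixed difference, no difference, partial difference.
    for freqs in [(0.0, 1.0), (0.5, 0.5), (0.2, 0.8)]:
        print(freqs, site_fst(*freqs))
```

Sites where the two samples carry the same allele frequencies give values near 0, which is the pattern a "very low FST" outlier would show under this simple estimator.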
Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing.

PubMed

Cornman, Robert Scott; Boncristiani, Humberto; Dainat, Benjamin; Chen, Yanping; vanEngelsdorp, Dennis; Weaver, Daniel; Evans, Jay D

2013-03-07

Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li's D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li's D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. 
This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens.
Geographically Distinct and Domain-Specific Sequence Variations in the Alleles of Rice Blast Resistance Gene Pib

PubMed Central

Vasudevan, Kumar; Vera Cruz, Casiana M.; Gruissem, Wilhelm; Bhullar, Navreet K.

2016-01-01

Rice blast is caused by Magnaporthe oryzae, which is the most destructive fungal pathogen affecting rice growing regions worldwide. The rice blast resistance gene Pib confers broad-spectrum resistance against Southeast Asian M. oryzae races. We investigated the allelic diversity of Pib in rice germplasm originating from 12 major rice growing countries. Twenty-five new Pib alleles were identified that have unique single nucleotide polymorphisms (SNPs), insertions and/or deletions, in addition to the polymorphic nucleotides that are shared between the different alleles. These partially or completely shared polymorphic nucleotides indicate frequent sequence exchange events between the Pib alleles. In some of the new Pib alleles, nucleotide diversity is high in the LRR domain, whereas, in others it is distributed among the NB-ARC and LRR domains. Most of the polymorphic amino acids in LRR and NB-ARC2 domains are predicted as solvent-exposed. Several of the alleles and the unique SNPs are country specific, suggesting a diversifying selection of alleles in various geographical locations in response to the locally prevalent M. oryzae population. Together, the new Pib alleles are an important genetic resource for rice blast resistance breeding programs and provide new information on rice-M. oryzae interactions at the molecular level. PMID:27446145
Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.

PubMed

Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P

2005-01-01

We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.
Exploring the feasibility of using copy number variants as genetic markers through large-scale whole genome sequencing experiments

USDA-ARS's Scientific Manuscript database

Copy number variants (CNV) are large scale duplications or deletions of genomic sequence that are caused by a diverse set of molecular phenomena that are distinct from single nucleotide polymorphism (SNP) formation. Due to their different mechanisms of formation, CNVs are often difficult to track us...
Genome Survey Sequencing of Luffa Cylindrica L. and Microsatellite High Resolution Melting (SSR-HRM) Analysis for Genetic Relationship of Luffa Genotypes.

PubMed

An, Jianyu; Yin, Mengqi; Zhang, Qin; Gong, Dongting; Jia, Xiaowen; Guan, Yajing; Hu, Jin

2017-09-11

Luffa cylindrica (L.) Roem. is an economically important vegetable crop in China. However, the genomic information on this species is currently unknown. In this study, for the first time, a genome survey of L. cylindrica was carried out using next-generation sequencing (NGS) technology. In total, 43.40 Gb of sequence data for L. cylindrica, about 54.94× coverage of the estimated genome size of 789.97 Mb, were obtained from HiSeq 2500 sequencing, in which the guanine plus cytosine (GC) content was calculated to be 37.90%. The heterozygosity of genome sequences was only 0.24%. In total, 1,913,731 contigs (>200 bp) with an N50 length of 525 bp and 1,410,117 scaffolds (>200 bp) with 885.01 Mb total length were obtained. From the initial assembled L. cylindrica genome, 431,234 microsatellites (SSRs) (≥5 repeats) were identified. The motif types of SSR repeats included 62.88% di-nucleotide, 31.03% tri-nucleotide, 4.59% tetra-nucleotide, 0.96% penta-nucleotide and 0.54% hexa-nucleotide. Eighty genomic SSR markers were developed, and 51/80 primers could be used in both "Zheda 23" and "Zheda 83". Nineteen SSRs were used to investigate the genetic diversity among 32 accessions through SSR-HRM analysis. The unweighted pair group method with arithmetic mean (UPGMA) dendrogram was built from the SSR-HRM raw data. SSR-HRM could be effectively used for genotype relationship analysis of Luffa species.
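The coverage figure quoted in this abstract can be checked with simple arithmetic. The short Python sketch below reproduces it from the reported data volume and estimated genome size, and also confirms that the reported SSR motif percentages sum to 100%. It assumes 1 Gb = 10^9 bases and 1 Mb = 10^6 bases; the variable names are illustrative only.

```python
# Values taken from the abstract; unit conversion is an assumption (SI prefixes).
total_bases = 43.40e9      # 43.40 Gb of sequence data
genome_size = 789.97e6     # estimated genome size, 789.97 Mb

# Sequencing depth = total sequenced bases / genome size.
coverage = total_bases / genome_size

# Reported share of each SSR motif type, in percent.
ssr_motif_pct = {
    "di": 62.88,
    "tri": 31.03,
    "tetra": 4.59,
    "penta": 0.96,
    "hexa": 0.54,
}
motif_total = sum(ssr_motif_pct.values())

if __name__ == "__main__":
    print(f"coverage: {coverage:.2f}x")
    print(f"SSR motif percentages sum to {motif_total:.2f}%")
```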
Serotype and genetic diversity of human rhinovirus strains that circulated in Kenya in 2008.

PubMed

Milanoi, Sylvia; Ongus, Juliette R; Gachara, George; Coldren, Rodney; Bulimo, Wallace

2016-05-01

Human rhinoviruses (HRVs) are a well-established cause of the common cold and recent studies indicated that they may be associated with severe acute respiratory illnesses (SARIs) like pneumonia, asthma, and bronchiolitis. Despite global studies on the genetic diversity of the virus, the serotype diversity of these viruses across diverse geographic regions in Kenya has not been characterized. This study sought to characterize the serotype diversity of HRV strains that circulated in Kenya in 2008. A total of 517 archived nasopharyngeal samples collected in a previous respiratory virus surveillance program across Kenya in 2008 were selected. Participants enrolled were outpatients who presented with influenza-like (ILI) symptoms. Real-time RT-PCR was employed for preliminary HRV detection. HRV-positive samples were amplified using RT-PCR and thereafter the nucleotide sequences of the amplicons were determined followed by phylogenetic analysis. Twenty-five percent of the samples tested positive for HRV. Phylogenetic analysis revealed that the Kenyan HRVs clustered into three main species comprising HRV-A (54%), HRV-B (12%), and HRV-C (35%). Overall, 20 different serotypes were identified. Intrastrain sequence homology among the Kenyan strains ranged from 58% to 100% at the nucleotide level and 55% to 100% at the amino acid level. These results show that a wide range of HRV serotypes with different levels of nucleotide variation were present in Kenya. Furthermore, our data show that HRVs contributed substantially to influenza-like illness in Kenya in 2008. © 2016 The Authors. Influenza and Other Respiratory Viruses Published by John Wiley & Sons Ltd.
Diversity of immunoglobulin lambda light chain gene usage over developmental stages in the horse.

PubMed

Tallmadge, Rebecca L; Tseng, Chia T; Felippe, M Julia B

2014-10-01

To further studies of neonatal immune responses to pathogens and vaccination, we investigated the dynamics of B lymphocyte development and immunoglobulin (Ig) gene diversity. Previously we demonstrated that equine fetal Ig VDJ sequences exhibit combinatorial and junctional diversity levels comparable to those of adult Ig VDJ sequences. Herein, RACE clones from fetal, neonatal, foal, and adult lymphoid tissue were assessed for Ig lambda light chain combinatorial, junctional, and sequence diversity. Remarkably, more lambda variable genes (IGLV) were used during fetal life than later stages and IGLV gene usage differed significantly with time, in contrast to the Ig heavy chain. Junctional diversity measured by CDR3L length was constant over time. Comparison of Ig lambda transcripts to germline revealed significant increases in nucleotide diversity over time, even during fetal life. These results suggest that the Ig lambda light chain provides an additional dimension of diversity to the equine Ig repertoire. Copyright © 2014 Elsevier Ltd. All rights reserved.
Application of the major capsid protein as a marker of the phylogenetic diversity of Emiliania huxleyi viruses.

PubMed

Rowe, Janet M; Fabre, Marie-Françoise; Gobena, Daniel; Wilson, William H; Wilhelm, Steven W

2011-05-01

Studies of the Phycodnaviridae have traditionally relied on the DNA polymerase (pol) gene as a biomarker. However, recent investigations have suggested that the major capsid protein (MCP) gene may be a reliable phylogenetic biomarker. We used MCP gene amplicons gathered across the North Atlantic to assess the diversity of Emiliania huxleyi-infecting Phycodnaviridae. Nucleotide sequences were examined across >6000 km of open ocean, with comparisons between concentrates of the virus-size fraction of seawater and of lysates generated by exposing host strains to these same virus concentrates. Analyses revealed that many sequences were only sampled once, while several were over-represented. Analyses also revealed nucleotide sequences distinct from previous coastal isolates. Examination of lysed cultures revealed a new richness in phylogeny, as MCP sequences previously unrepresented within the existing collection of E. huxleyi viruses (EhV) were associated with viruses lysing cultures. Sequences were compared with previously described EhV MCP sequences from the North Sea and a Norwegian Fjord, as well as from the Gulf of Maine. Principal component analysis indicates that location-specific distinctions exist despite the presence of sequences common across these environments. Overall, this investigation provides new sequence data and an assessment on the use of the MCP gene. © 2011 Federation of European Microbiological Societies Published by Blackwell Publishing Ltd. All rights reserved.
Decreased Nucleotide and Expression Diversity and Modified Coexpression Patterns Characterize Domestication in the Common Bean

PubMed Central

Bellucci, Elisa; Bitocchi, Elena; Ferrarini, Alberto; Benazzo, Andrea; Biagetti, Eleonora; Klie, Sebastian; Minio, Andrea; Rau, Domenico; Rodriguez, Monica; Panziera, Alex; Venturini, Luca; Attene, Giovanna; Albertini, Emidio; Jackson, Scott A.; Nanni, Laura; Fernie, Alisdair R.; Nikoloski, Zoran; Bertorelle, Giorgio; Delledonne, Massimo; Papa, Roberto

2014-01-01

Using RNA sequencing technology and de novo transcriptome assembly, we compared representative sets of wild and domesticated accessions of common bean (Phaseolus vulgaris) from Mesoamerica. RNA was extracted at the first true-leaf stage, and de novo assembly was used to develop a reference transcriptome; the final data set consists of ∼190,000 single nucleotide polymorphisms from 27,243 contigs in expressed genomic regions. A drastic reduction in nucleotide diversity (∼60%) is evident for the domesticated form, compared with the wild form, and almost 50% of the contigs that are polymorphic were brought to fixation by domestication. In parallel, the effects of domestication decreased the diversity of gene expression (18%). While the coexpression networks for the wild and domesticated accessions demonstrate similar seminal network properties, they show distinct community structures that are enriched for different molecular functions. After simulating the demographic dynamics during domestication, we found that 9% of the genes were actively selected during domestication. We also show that selection induced a further reduction in the diversity of gene expression (26%) and was associated with 5-fold enrichment of differentially expressed genes. While there is substantial evidence of positive selection associated with domestication, in a few cases, this selection has increased the nucleotide diversity in the domesticated pool at target loci associated with abiotic stress responses, flowering time, and morphology. PMID:24850850
Structure of some East African Glossina fuscipes fuscipes populations

PubMed Central

Krafsur, E. S.; Marquez, J. G.; Ouma, J. O.

2008-01-01

Glossina fuscipes fuscipes Newstead 1910 (Diptera: Glossinidae) is the primary vector of human sleeping sickness in Kenya and Uganda. This is the first report on its population structure. A total of 688 nucleotides of mitochondrial ribosomal 16S2 and cytochrome oxidase I genes were sequenced. Twenty-one variants were scored in 79 flies from three geographically diverse natural populations. Four haplotypes were shared among populations, eight were private and nine were singletons. The mean haplotype and nucleotide diversities were 0.84 and 0.009, respectively. All populations were genetically differentiated and were at demographic equilibrium. In addition, a longstanding laboratory culture originating from the Central African Republic (CAR-lab) in 1986 (or before) was examined. Haplotype and nucleotide diversities in this culture were 0.95 and 0.012, respectively. None of its 27 haplotypes were shared with the East African populations. A first approximation of relative effective population sizes was Uganda > CAR-lab > Kenya. It was concluded that the structure of G. f. fuscipes populations in East Africa is localized. PMID:18816270
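Haplotype diversity and nucleotide diversity, the two summary statistics reported in this record, can be sketched as follows. The Python example uses Nei's sample-corrected haplotype diversity, h = n/(n-1) × (1 - Σ p_i²), and nucleotide diversity as the mean number of pairwise differences per site. The four toy sequences are made up for illustration and are not from the study; real data would also need handling of gaps and ambiguous bases, which this sketch omits.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity with the n/(n-1) sample-size correction."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for equal-length aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs) / length


if __name__ == "__main__":
    # Hypothetical aligned sequences (three haplotypes among four samples).
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
    print("h  =", haplotype_diversity(toy))
    print("pi =", nucleotide_diversity(toy))
```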
Sequence analysis of a few species of termites (Order: Isoptera) on the basis of partial characterization of COII gene.

PubMed

Sobti, Ranbir Chander; Kumari, Mamtesh; Sharma, Vijay Lakshmi; Sodhi, Monika; Mukesh, Manishi; Shouche, Yogesh

2009-11-01

The present study aimed to obtain the nucleotide sequences of part of the mitochondrial COII gene amplified from individuals of five species of termites (Isoptera: Termitidae: Macrotermitinae). Four of them belonged to the genus Odontotermes (O. obesus, O. horni, O. bhagwatii and Odontotermes sp.) and one to Microtermes (M. obesi). Partial COII gene fragments were amplified by using specific primers. The sequences obtained were characterized to calculate the frequency of each nucleotide base, and a high A + T content was observed. The interspecific pairwise sequence divergence among Odontotermes species ranged from 6.5% to 17.1% across the COII fragment. M. obesi sequence diversity ranged from 2.5% with Odontotermes sp. to 19.0% with O. bhagwatii. Phylogenetic trees drawn with the distance-based neighbour-joining method revealed three main clades clustering all the individuals according to their genera and families.
A genomic scale map of genetic diversity in Trypanosoma cruzi

PubMed Central

2012-01-01

Background Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significantly lower nucleotide diversity values. Conclusions This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~60% of the genetic diversity present in the population, providing an essential resource for future studies on the development of new drugs and diagnostics for Chagas Disease. These data are available through the TcSNP database (http://snps.tcruzi.org). PMID:23270511
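The classification of SNPs into synonymous, non-synonymous, and stop-codon changes mentioned in this record can be illustrated with a small Python sketch. It builds the standard genetic code table and labels a single codon substitution by comparing the encoded amino acids. The function name, the category labels, and the example codon pairs are assumptions for illustration; this is not the pipeline used by the authors, and it assumes the standard nuclear code applies.

```python
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_change(ref_codon, alt_codon):
    """Label a codon substitution by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "nonsynonymous"


if __name__ == "__main__":
    # Hypothetical (reference, alternate) codon pairs.
    changes = [("GAA", "GAG"), ("GAA", "GAC"), ("TGG", "TGA"), ("TAA", "CAA")]
    tally = Counter(classify_change(ref, alt) for ref, alt in changes)
    print(dict(tally))
    # A synonymous : non-synonymous ratio follows directly from the tally.
    print(tally["synonymous"] / tally["nonsynonymous"])
```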
High levels of Y-chromosome nucleotide diversity in the genus Pan

PubMed Central

Stone, Anne C.; Griffiths, Robert C.; Zegura, Stephen L.; Hammer, Michael F.

2002-01-01

Although some mitochondrial, X chromosome, and autosomal sequence diversity data are available for our closest relatives, Pan troglodytes and Pan paniscus, data from the nonrecombining portion of the Y chromosome (NRY) are more limited. We examined ≈3 kb of NRY DNA from 101 chimpanzees, seven bonobos, and 42 humans to investigate: (i) relative levels of intraspecific diversity; (ii) the degree of paternal lineage sorting among species and subspecies of the genus Pan; and (iii) the date of the chimpanzee/bonobo divergence. We identified 10 informative sequence-tagged sites associated with 23 polymorphisms on the NRY from the genus Pan. Nucleotide diversity was significantly higher on the NRY of chimpanzees and bonobos than on the human NRY. Similar to mtDNA, but unlike X-linked and autosomal loci, lineages defined by mutations on the NRY were not shared among subspecies of P. troglodytes. Comparisons with mtDNA ND2 sequences from some of the same individuals revealed a larger female versus male effective population size for chimpanzees. The NRY-based divergence time between chimpanzees and bonobos was estimated at ≈1.8 million years ago. In contrast to human populations who appear to have had a low effective size and a recent origin with subsequent population growth, some taxa within the genus Pan may be characterized by large populations of relatively constant size, more ancient origins, and high levels of subdivision. PMID:11756656
Low level of sequence diversity at merozoite surface protein-1 locus of Plasmodium ovale curtisi and P. ovale wallikeri from Thai isolates.

PubMed

Putaporntip, Chaturong; Hughes, Austin L; Jongwutiwes, Somchai

2013-01-01

The merozoite surface protein-1 (MSP-1) is a candidate target for the development of blood stage vaccines against malaria. Polymorphism in MSP-1 can be useful as a genetic marker for strain differentiation in malarial parasites. Although sequence diversity in the MSP-1 locus has been extensively analyzed in field isolates of Plasmodium falciparum and P. vivax, the extent of variation in its homologues in P. ovale curtisi and P. ovale wallikeri remains unknown. Analysis of the mitochondrial cytochrome b sequences of 10 P. ovale isolates from symptomatic malaria patients from diverse endemic areas of Thailand revealed co-existence of P. ovale curtisi (n = 5) and P. ovale wallikeri (n = 5). Direct sequencing of the PCR-amplified products encompassing the entire coding region of MSP-1 of P. ovale curtisi (PocMSP-1) and P. ovale wallikeri (PowMSP-1) has identified 3 imperfect repeated segments in the former and one in the latter. Most amino acid differences between these proteins were located in the interspecies variable domains of malarial MSP-1. Synonymous nucleotide diversity (πS) exceeded nonsynonymous nucleotide diversity (πN) for both PocMSP-1 and PowMSP-1, albeit at a non-significant level. However, when MSP-1 of both these species was considered together, πS was significantly greater than πN (p<0.0001), suggesting that purifying selection has shaped diversity at this locus prior to speciation. Phylogenetic analysis based on conserved domains has placed PocMSP-1 and PowMSP-1 in a distinct bifurcating branch that probably diverged from each other around 4.5 million years ago. The MSP-1 sequences support the view that P. ovale curtisi and P. ovale wallikeri are distinct species. Both species are sympatric in Thailand. The low level of sequence diversity in PocMSP-1 and PowMSP-1 among Thai isolates could stem from persistent low prevalence of these species, limiting the chance of outcrossing at this locus.
Low Level of Sequence Diversity at Merozoite Surface Protein-1 Locus of Plasmodium ovale curtisi and P. ovale wallikeri from Thai Isolates

PubMed Central

Putaporntip, Chaturong; Hughes, Austin L.; Jongwutiwes, Somchai

2013-01-01

Background The merozoite surface protein-1 (MSP-1) is a candidate target for the development of blood stage vaccines against malaria. Polymorphism in MSP-1 can be useful as a genetic marker for strain differentiation in malarial parasites. Although sequence diversity in the MSP-1 locus has been extensively analyzed in field isolates of Plasmodium falciparum and P. vivax, the extent of variation in its homologues in P. ovale curtisi and P. ovale wallikeri remains unknown. Methodology/Principal Findings Analysis of the mitochondrial cytochrome b sequences of 10 P. ovale isolates from symptomatic malaria patients from diverse endemic areas of Thailand revealed co-existence of P. ovale curtisi (n = 5) and P. ovale wallikeri (n = 5). Direct sequencing of the PCR-amplified products encompassing the entire coding region of MSP-1 of P. ovale curtisi (PocMSP-1) and P. ovale wallikeri (PowMSP-1) has identified 3 imperfect repeated segments in the former and one in the latter. Most amino acid differences between these proteins were located in the interspecies variable domains of malarial MSP-1. Synonymous nucleotide diversity (πS) exceeded nonsynonymous nucleotide diversity (πN) for both PocMSP-1 and PowMSP-1, albeit at a non-significant level. However, when MSP-1 of both these species was considered together, πS was significantly greater than πN (p<0.0001), suggesting that purifying selection has shaped diversity at this locus prior to speciation. Phylogenetic analysis based on conserved domains has placed PocMSP-1 and PowMSP-1 in a distinct bifurcating branch that probably diverged from each other around 4.5 million years ago. Conclusion/Significance The MSP-1 sequences support the view that P. ovale curtisi and P. ovale wallikeri are distinct species. Both species are sympatric in Thailand. The low level of sequence diversity in PocMSP-1 and PowMSP-1 among Thai isolates could stem from persistent low prevalence of these species, limiting the chance of outcrossing at this locus. PMID:23536840
Variation in the number of nucleoli and incomplete homogenization of 18S ribosomal DNA sequences in leaf cells of the cultivated Oriental ginseng (Panax ginseng Meyer).

PubMed

Chelomina, Galina N; Rozhkovan, Konstantin V; Voronova, Anastasia N; Burundukova, Olga L; Muzarok, Tamara I; Zhuravlev, Yuri N

2016-04-01

Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440-640 bp, and distributed in variable regions, expansion segments, and conservative elements of core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine.
Variation in the number of nucleoli and incomplete homogenization of 18S ribosomal DNA sequences in leaf cells of the cultivated Oriental ginseng (Panax ginseng Meyer)

PubMed Central

Chelomina, Galina N.; Rozhkovan, Konstantin V.; Voronova, Anastasia N.; Burundukova, Olga L.; Muzarok, Tamara I.; Zhuravlev, Yuri N.

2015-01-01

Background Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. Methods The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. Results In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed in one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440–640 bp, and distributed in variable regions, expansion segments, and conservative elements of core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. Conclusion This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine. PMID:27158239
Genetic analyses reveal unusually high diversity of infectious haematopoietic necrosis virus in rainbow trout aquaculture

USGS Publications Warehouse

Troyer, Ryan M.; LaPatra, Scott E.; Kurath, Gael

2000-01-01

Infectious haematopoietic necrosis virus (IHNV) is the most significant virus pathogen of salmon and trout in North America. Previous studies have shown relatively low genetic diversity of IHNV within large geographical regions. In this study, the genetic heterogeneity of 84 IHNV isolates sampled from rainbow trout (Oncorhynchus mykiss) over a 20 year period at four aquaculture facilities within a 12 mile stretch of the Snake River in Idaho, USA was investigated. The virus isolates were characterized using an RNase protection assay (RPA) and nucleotide sequence analyses. Among the 84 isolates analysed, 46 RPA haplotypes were found and analyses revealed a high level of genetic heterogeneity relative to that detected in other regions. Sequence analyses revealed up to 7·6% nucleotide divergence, which is the highest level of diversity reported for IHNV to date. Phylogenetic analyses identified four distinct monophyletic clades representing four virus lineages. These lineages were distributed across facilities, and individual facilities contained multiple lineages. These results suggest that co-circulating IHNV lineages of relatively high genetic diversity are present in the IHNV populations in this rainbow trout culture study site. Three of the four lineages exhibited temporal trends consistent with rapid evolution.
Nucleotide diversity and linkage disequilibrium in wild avocado (Persea americana Mill.).

PubMed

Chen, Haofeng; Morrell, Peter L; de la Cruz, Marlene; Clegg, Michael T

2008-01-01

Resequencing studies provide the ultimate resolution of genetic diversity because they identify all mutations in a gene that are present within the sampled individuals. We report a resequencing study of Persea americana, a subtropical tree species native to Meso- and Central America and the progenitor of cultivated avocado. The sample includes 21 wild accessions from Mexico, Costa Rica, Ecuador, and the Dominican Republic. Estimated levels of nucleotide polymorphism and linkage disequilibrium (LD) are obtained from fully resolved haplotype data from 4 nuclear loci that span 5960 nucleotide sites. Results show that, although avocado is a subtropical tree crop and a predominantly outcrossing plant, the overall level of genetic variation is not exceptionally high (nucleotide diversity at silent sites, pi(sil) = 0.0102) compared with available estimates from temperate plant species. Intralocus LD decays rapidly to half the initial value within about 1 kb. Estimates of recombination rate (based on the sequence data) show that the rate is not exceptionally high when compared with annual plants such as wild barley or maize. Interlocus LD is significant owing to substantial population structure induced by mixing of the 3 botanical races of avocado.
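Linkage disequilibrium between two biallelic sites is often summarized by r², which can be computed directly from phased haplotypes such as the "fully resolved haplotype data" this record describes. The Python sketch below is a minimal version assuming alleles are coded 0/1 and each haplotype is a (site 1, site 2) pair; the function name and the toy haplotypes are hypothetical. LD decay curves are then typically obtained by plotting r² against the physical distance between site pairs, a step not shown here.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic sites from phased haplotypes.

    haplotypes: list of (allele_site1, allele_site2) tuples coded 0 or 1.
    Returns None if either site is monomorphic (r^2 undefined).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at site 1
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at site 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


if __name__ == "__main__":
    # Toy examples: perfectly associated sites, then independent sites.
    print(r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))
    print(r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```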
Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.

PubMed

Sakai, Ryo; Aerts, Jan

2014-01-01

The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
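As background on what a sequence logo encodes, the information content of one alignment column is commonly taken as the maximum entropy of the alphabet minus the observed Shannon entropy. The following is a minimal sketch of that calculation only; it omits the small-sample correction used by some logo tools, and the function name is hypothetical rather than part of the software described above.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Information content in bits of one alignment column (no small-sample correction)."""
    counts = Counter(column)
    n = len(column)
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


# Columns are read down an alignment: one character per sequence.
for col in ("AAAA", "AACC", "ACGT"):
    print(col, round(column_information(col), 3))
```

For protein alignments the same function can be called with `alphabet_size=20`.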
Multilocus sequence analysis of Thermoanaerobacter isolates reveals recombining, but differentiated, populations from geothermal springs of the Uzon Caldera, Kamchatka, Russia

PubMed Central

Wagner, Isaac D.; Varghese, Litty B.; Hemme, Christopher L.; Wiegel, Juergen

2013-01-01

Thermal environments have island-like characteristics and provide a unique opportunity to study population structure and diversity patterns of microbial taxa inhabiting these sites. Strains having ≥98% 16S rRNA gene sequence similarity to the obligately anaerobic Firmicutes Thermoanaerobacter uzonensis were isolated from seven geothermal springs, separated by up to 1600 m, within the Uzon Caldera (Kamchatka, Russian Far East). The intraspecies variation and spatial patterns of diversity for this taxon were assessed by multilocus sequence analysis (MLSA) of 106 strains. Analysis of eight protein-coding loci (gyrB, lepA, leuS, pyrG, recA, recG, rplB, and rpoB) revealed that all loci were polymorphic and that nucleotide substitutions were mostly synonymous. There were 148 variable nucleotide sites across 8003 bp concatenates of the protein-coding loci. While pairwise FST values indicated a small but significant level of genetic differentiation between most subpopulations, there was a negligible relationship between genetic divergence and spatial separation. Strains with the same allelic profile were only isolated from the same hot spring, occasionally from consecutive years, and single locus variant (SLV) sequence types were usually derived from the same spring. While recombination occurred, there was an “epidemic” population structure in which a particular T. uzonensis sequence type rose in frequency relative to the rest of the population. These results demonstrate spatial diversity patterns for an anaerobic bacterial species in a relatively small geographic location and reinforce the view that terrestrial geothermal springs are excellent places to look for biogeographic diversity patterns regardless of the involved distances. PMID:23801987
Adaptive microclimatic structural and expressional dehydrin 1 evolution in wild barley, Hordeum spontaneum, at 'Evolution Canyon', Mount Carmel, Israel.

PubMed

Yang, Zujun; Zhang, Tao; Bolshoy, Alexander; Beharav, Alexander; Nevo, Eviatar

2009-05-01

'Evolution Canyon' (ECI) at Lower Nahal Oren, Mount Carmel, Israel, is an optimal natural microscale model for unravelling evolution in action highlighting the twin evolutionary processes of adaptation and speciation. A major model organism in ECI is wild barley, Hordeum spontaneum, the progenitor of cultivated barley, which displays dramatic interslope adaptive and speciational divergence on the 'African' dry slope (AS) and the 'European' humid slope (ES), separated on average by 200 m. Here we examined interslope single nucleotide polymorphism (SNP) sequences and the expression diversity of the drought resistant dehydrin 1 gene (Dhn1) between the opposite slopes. We analysed 47 plants (genotypes), 4-10 individuals in each of seven stations (populations) in an area of 7000 m², for Dhn1 sequence diversity located in the 5' upstream flanking region of the gene. We found significant levels of Dhn1 genic diversity represented by 29 haplotypes, derived from 45 SNPs in a total of 708 bp sites. Most of the haplotypes, 25 out of 29 (86.2%), were represented by one genotype; hence, unique to one population. Only a single haplotype was common to both slopes. Genetic divergence of sequence and haplotype diversity was generally and significantly different among the populations and slopes. Nucleotide diversity was higher on the AS, whereas haplotype diversity was higher on the ES. Interslope divergence was significantly higher than intraslope divergence. The applied Tajima D rejected neutrality of the SNP diversity. The Dhn1 expression under dehydration indicated interslope divergent expression between AS and ES genotypes, reinforcing Dhn1 associated with drought resistance of wild barley at 'Evolution Canyon'. These results are inexplicable by mutation, gene flow, or chance effects, and support adaptive natural microclimatic selection as the major evolutionary divergent driving force.
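For readers unfamiliar with the neutrality test mentioned above, Tajima's D compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic factor. The sketch below follows the standard 1989 formulation; the function name and the example inputs are illustrative assumptions and are not values from this study.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences k."""
    if n < 2 or seg_sites < 1:
        raise ValueError("need n >= 2 and at least one segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    numerator = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return numerator / math.sqrt(variance)


# Illustrative inputs only: 10 sequences, 16 segregating sites, k = 3.888889.
print(round(tajimas_d(10, 16, 3.888889), 3))
```

Negative values indicate an excess of rare variants relative to neutral expectation, and positive values an excess of intermediate-frequency variants; significance is usually judged against tabulated or simulated critical values.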
Adaptive microclimatic evolution of the dehydrin 6 gene in wild barley at "Evolution Canyon", Israel.

PubMed

Yang, Zujun; Zhang, Tao; Li, Guangrong; Nevo, Eviatar

2011-12-01

Dehydrins are one of the major stress-induced gene families, and the expression of dehydrin 6 (Dhn6) is strictly related to drought in barley. In order to investigate how the evolution of the Dhn6 gene is associated with adaptation to environmental changes, we examined 48 genotypes of wild barley, Hordeum spontaneum, from "Evolution Canyon" at Mount Carmel, Israel. The Dhn6 sequences of the 48 genotypes were identified, and a recent insertion of 342 bp at 5'UTR was found in the sequences of 11 genotypes. Both nucleotide and haplotype diversity of single nucleotide polymorphism in Dhn6 coding regions were higher on the AS ("African" slope or dry slope) than on the ES ("European" slope or humid slope), and the applied Tajima D and Fu-Li test rejected neutrality of SNP diversity. Expression analysis indicated that the 342 bp insertion at 5'UTR was associated with the earlier up-regulation of Dhn6 after dehydration. The genetic divergence of amino acids sequences indicated significant positive selection of Dhn6 among the wild barley populations. The diversity of Dhn6 in microclimatic divergence slopes suggested that Dhn6 has been subjected to natural selection and adaptively associated with drought resistance of wild barley at "Evolution Canyon".
Genetic diversity and potential vectors and reservoirs of Cucurbit aphid-borne yellows virus in southeastern Spain.

PubMed

Kassem, Mona A; Juarez, Miguel; Gómez, Pedro; Mengual, Carmen M; Sempere, Raquel N; Plaza, María; Elena, Santiago F; Moreno, Aranzazu; Fereres, Alberto; Aranda, Miguel A

2013-11-01

The genetic variability of a Cucurbit aphid-borne yellows virus (CABYV) (genus Polerovirus, family Luteoviridae) population was evaluated by determining the nucleotide sequences of two genomic regions of CABYV isolates collected in open-field melon and squash crops during three consecutive years in Murcia (southeastern Spain). A phylogenetic analysis showed the existence of two major clades. The sequences did not cluster according to host, year, or locality of collection, and nucleotide similarities among isolates were 97 to 100 and 94 to 97% within and between clades, respectively. The ratio of nonsynonymous to synonymous nucleotide substitutions reflected that all open reading frames have been under purifying selection. Estimates of the population's genetic diversity were of the same magnitude as those previously reported for other plant virus populations sampled at larger spatial and temporal scales, suggesting either the presence of CABYV in the surveyed area long before it was first described, multiple introductions, or a particularly rapid diversification. We also determined the full-length sequences of three isolates, identifying the occurrence and location of recombination events along the CABYV genome. Furthermore, our field surveys indicated that Aphis gossypii was the major vector species of CABYV and the most abundant aphid species colonizing melon fields in the Murcia (Spain) region. Our surveys also suggested the importance of the weed species Ecballium elaterium as an alternative host and potential virus reservoir.
Phylogeography of infectious haematopoietic necrosis virus in North America

USGS Publications Warehouse

Kurath, Gael; Garver, Kyle A.; Troyer, Ryan M.; Emmenegger, Eveline J.; Einer-Jensen, Katja; Anderson, Eric D.

2003-01-01

Infectious hematopoietic necrosis virus (IHNV) is a rhabdoviral pathogen that infects wild and cultured salmonid fish throughout the Pacific Northwest of North America. IHNV causes severe epidemics in young fish and can cause disease or occur asymptomatically in adults. In a broad survey of 323 IHNV field isolates, sequence analysis of a 303 nucleotide variable region within the glycoprotein gene revealed a maximum nucleotide diversity of 8.6%, indicating low genetic diversity overall for this virus. Phylogenetic analysis revealed three major virus genogroups, designated U, M and L, which varied in topography and geographical range. Intragenogroup genetic diversity measures indicated that the M genogroup had three- to fourfold more diversity than the other genogroups and suggested relatively rapid evolution of the M genogroup and stasis within the U genogroup. We speculate that factors influencing IHNV evolution may have included ocean migration ranges of their salmonid host populations and anthropogenic effects associated with fish culture.
Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

PubMed Central

Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

2005-01-01

Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
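The idea of identifying sequences by presence or absence of shared sub-sequences can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions, not the method developed in the paper: it picks, at each step, the k-mer that separates the most still-indistinguishable pairs, so the resulting set is small but not guaranteed minimal. All names and the toy sequences are hypothetical.

```python
from itertools import combinations


def kmers(seq, k):
    """Set of all length-k sub-sequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_distinguishing_kmers(seqs, k):
    """Greedily choose k-mers whose presence/absence pattern separates the sequences."""
    presence = [kmers(s, k) for s in seqs]
    candidates = sorted(set().union(*presence))
    unresolved = set(combinations(range(len(seqs)), 2))
    chosen = []
    while unresolved:
        best, best_split = None, set()
        for kmer in candidates:
            # Pairs separated by this k-mer: present in one member, absent in the other.
            split = {(i, j) for i, j in unresolved
                     if (kmer in presence[i]) != (kmer in presence[j])}
            if len(split) > len(best_split):
                best, best_split = kmer, split
        if best is None:
            break  # remaining pairs have identical k-mer content
        chosen.append(best)
        unresolved -= best_split
    return chosen, unresolved


toy = ["AAAC", "AACG", "ACGT", "CGTT"]
probes, leftover = greedy_distinguishing_kmers(toy, 2)
print(probes, leftover)
```

Each chosen k-mer plays the role of one question in a dichotomous key; a probe that splits the remaining candidates roughly in half is the most informative, which is why the greedy score counts separated pairs.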
Genome Survey Sequencing of Luffa Cylindrica L. and Microsatellite High Resolution Melting (SSR-HRM) Analysis for Genetic Relationship of Luffa Genotypes

PubMed Central

An, Jianyu; Yin, Mengqi; Zhang, Qin; Gong, Dongting; Jia, Xiaowen; Guan, Yajing; Hu, Jin

2017-01-01

Luffa cylindrica (L.) Roem. is an economically important vegetable crop in China. However, the genomic information on this species is currently unknown. In this study, for the first time, a genome survey of L. cylindrica was carried out using next-generation sequencing (NGS) technology. In total, 43.40 Gb sequence data of L. cylindrica, about 54.94× coverage of the estimated genome size of 789.97 Mb, were obtained from HiSeq 2500 sequencing, in which the guanine plus cytosine (GC) content was calculated to be 37.90%. The heterozygosity of genome sequences was only 0.24%. In total, 1,913,731 contigs (>200 bp) with 525 bp N50 length and 1,410,117 scaffolds (>200 bp) with 885.01 Mb total length were obtained. From the initial assembled L. cylindrica genome, 431,234 microsatellites (SSRs) (≥5 repeats) were identified. The motif types of SSR repeats included 62.88% di-nucleotide, 31.03% tri-nucleotide, 4.59% tetra-nucleotide, 0.96% penta-nucleotide and 0.54% hexa-nucleotide. Eighty genomic SSR markers were developed, and 51/80 primers could be used in both “Zheda 23” and “Zheda 83”. Nineteen SSRs were used to investigate the genetic diversity among 32 accessions through SSR-HRM analysis. The unweighted pair group method analysis (UPGMA) dendrogram tree was built by calculating the SSR-HRM raw data. SSR-HRM could be effectively used for genotype relationship analysis of Luffa species. PMID:28891982
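Two of the assembly statistics quoted above follow from simple arithmetic. The sketch below recomputes sequencing depth from the reported data volume and estimated genome size, and shows the usual definition of N50 on a toy list of contig lengths; the function names and the toy lengths are illustrative assumptions.

```python
def sequencing_depth(total_bases, genome_size):
    """Average coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of the total bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")


# Values from the abstract: 43.40 Gb of data, 789.97 Mb estimated genome size.
depth = sequencing_depth(43.40e9, 789.97e6)
print(f"depth = {depth:.2f}x")

# Toy contig lengths (hypothetical), total 20, so half is 10.
print("N50 =", n50([2, 3, 4, 5, 6]))
```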
Genome-wide diversity and differentiation in New World populations of the human malaria parasite Plasmodium vivax

PubMed Central

de Oliveira, Thais C.; Rodrigues, Priscila T.; Menezes, Maria José; Gonçalves-Lopes, Raquel M.; Bastos, Melissa S.; Lima, Nathália F.; Barbosa, Susana; Gerber, Alexandra L.; Loss de Morais, Guilherme; Berná, Luisa; Phelan, Jody; Robello, Carlos; de Vasconcelos, Ana Tereza R.

2017-01-01

Background The Americas were the last continent colonized by humans carrying malaria parasites. Plasmodium falciparum from the New World shows very little genetic diversity and greater linkage disequilibrium, compared with its African counterparts, and is clearly subdivided into local, highly divergent populations. However, limited available data have revealed extensive genetic diversity in American populations of another major human malaria parasite, P. vivax. Methods We used an improved sample preparation strategy and next-generation sequencing to characterize 9 high-quality P. vivax genome sequences from northwestern Brazil. These new data were compared with publicly available sequences from recently sampled clinical P. vivax isolates from Brazil (BRA, total n = 11 sequences), Peru (PER, n = 23), Colombia (COL, n = 31), and Mexico (MEX, n = 19). Principal findings/Conclusions We found that New World populations of P. vivax are as diverse (nucleotide diversity π between 5.2 × 10⁻⁴ and 6.2 × 10⁻⁴) as P. vivax populations from Southeast Asia, where malaria transmission is substantially more intense. They display several non-synonymous nucleotide substitutions (some of them previously undescribed) in genes known or suspected to be involved in antimalarial drug resistance, such as dhfr, dhps, mdr1, mrp1, and mrp-2, but not in the chloroquine resistance transporter ortholog (crt-o) gene. Moreover, P. vivax in the Americas is much less geographically substructured than local P. falciparum populations, with relatively little between-population genome-wide differentiation (pairwise FST values ranging between 0.025 and 0.092). Finally, P. vivax populations show a rapid decline in linkage disequilibrium with increasing distance between pairs of polymorphic sites, consistent with very frequent outcrossing. We hypothesize that the high diversity of present-day P. vivax lineages in the Americas originated from successive migratory waves and subsequent admixture between parasite lineages from geographically diverse sites. Further genome-wide analyses are required to test the demographic scenario suggested by our data. PMID:28759591
Helicobacter pylori Heat Shock Protein A: Serologic Responses and Genetic Diversity

PubMed Central

Ng, Enders K. W.; Thompson, Stuart A.; Pérez-Pérez, Guillermo I.; Kansau, Imad; van der Ende, Arie; Labigne, Agnès; Sung, Joseph J. Y.; Chung, S. C. Sydney; Blaser, Martin J.

1999-01-01

Helicobacter pylori synthesizes an unusual GroES homolog, heat shock protein A (HspA). The present study was aimed at an assessment of the serological response to HspA in a group of Chinese patients with defined gastroduodenal pathologies and determination of whether diversity is present in the nucleotide sequences encoding HspA in isolates from these patients. Serum samples collected from 154 patients who had an upper gastrointestinal pathology and the presence of H. pylori defined by biopsy were tested for an immunoglobulin G (IgG) serologic response to H. pylori HspA by an enzyme-linked immunosorbent assay. HspA-encoding nucleotide sequences in H. pylori isolates from 14 patients (7 seropositive and 7 seronegative for HspA) were analyzed by PCR and direct sequencing of the PCR products. The sequencing results were compared to those of 48 isolates from other parts of the world. Of the 154 known H. pylori-positive patients, 54 (35.1%) were seropositive for HspA. The A domain (GroES homology) of HspA was highly conserved in the 14 isolates tested. Although the B domain (metal-binding site unique to H. pylori) resembled that in the known major variant, particular amino acid substitutions allowed definition of an HspA variant associated with isolates from East Asia. There were no associations between patient characteristics and HspA seropositivity or amino acid sequences. We confirmed in this study that the clinical outcomes of H. pylori infection are not related to HspA antigenicity or to sequence variation. However, B-domain sequence variation may be a marker for the study of the genetic diversity of H. pylori strains of different geographic origins. PMID:10225839
Within-Host Variations of Human Papillomavirus Reveal APOBEC Signature Mutagenesis in the Viral Genome.

PubMed

Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

2018-06-15

Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations into the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here, we explored within-host genetic diversity of HPV by performing deep-sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52, and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) and were deep sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with >0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with various numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. 
Here, by conducting deep-sequencing analyses, we show for the first time a comprehensive snapshot of the within-host genetic diversity of high-risk HPVs during cervical carcinogenesis. Quasispecies harboring minor nucleotide variations in viral whole-genome sequences were extensively observed across different grades of CIN and cervical cancer. Among the within-host variations, C-to-T transitions, a characteristic change mediated by cellular APOBEC cytosine deaminases, were predominantly detected throughout the whole viral genome, most strikingly in low-grade CIN lesions. The results strongly suggest that within-host variations of the HPV genome are primarily generated through the interaction with host cell DNA-editing enzymes and that such within-host variability is an evolutionary source of the genetic diversity of HPVs. Copyright © 2018 American Society for Microbiology.
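The trinucleotide-context analysis described above can be sketched as follows. This is an illustration only, assuming variants are given as 0-based positions on the reference strand; it does not handle the complementary G-to-A changes that represent C-to-T on the opposite strand, and the function names and toy reference are hypothetical.

```python
def c_to_t_contexts(reference, substitutions):
    """Return the reference trinucleotide around each C-to-T substitution."""
    contexts = []
    for pos, alt in substitutions:
        if reference[pos] != "C" or alt != "T":
            continue  # keep only C-to-T changes on this strand
        if pos == 0 or pos == len(reference) - 1:
            continue  # no complete trinucleotide at the sequence ends
        contexts.append(reference[pos - 1:pos + 2])
    return contexts


def tpc_fraction(contexts):
    """Fraction of contexts of the form TpCpN (T immediately 5' of the mutated C)."""
    if not contexts:
        return 0.0
    return sum(1 for c in contexts if c.startswith("TC")) / len(contexts)


# Toy reference and variant calls (hypothetical data).
ref = "ATCGTTCAACCG"
calls = [(2, "T"), (6, "T"), (9, "T"), (3, "A")]
ctx = c_to_t_contexts(ref, calls)
print(ctx, round(tpc_fraction(ctx), 3))
```

An enrichment claim would compare this fraction with the background frequency of TpC among all cytosines in the reference, which the sketch does not compute.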
Transcript-specific, single-nucleotide polymorphism discovery and linkage analysis in hexaploid bread wheat (Triticum aestivum L.).

PubMed

Allen, Alexandra M; Barker, Gary L A; Berry, Simon T; Coghill, Jane A; Gwilliam, Rhian; Kirby, Susan; Robinson, Phil; Brenchley, Rachel C; D'Amore, Rosalinda; McKenzie, Neil; Waite, Darren; Hall, Anthony; Bevan, Michael; Hall, Neil; Edwards, Keith J

2011-12-01

Food security is a global concern and substantial yield increases in cereal crops are required to feed the growing world population. Wheat is one of the three most important crops for human and livestock feed. However, the complexity of the genome coupled with a decline in genetic diversity within modern elite cultivars has hindered the application of marker-assisted selection (MAS) in breeding programmes. A crucial step in the successful application of MAS in breeding programmes is the development of cheap and easy to use molecular markers, such as single-nucleotide polymorphisms. To mine selected elite wheat germplasm for intervarietal single-nucleotide polymorphisms, we have used expressed sequence tags derived from public sequencing programmes and next-generation sequencing of normalized wheat complementary DNA libraries, in combination with a novel sequence alignment and assembly approach. Here, we describe the development and validation of a panel of 1114 single-nucleotide polymorphisms in hexaploid bread wheat using competitive allele-specific polymerase chain reaction genotyping technology. We report the genotyping results of these markers on 23 wheat varieties, selected to represent a broad cross-section of wheat germplasm including a number of elite UK varieties. Finally, we show that, using relatively simple technology, it is possible to rapidly generate a linkage map containing several hundred single-nucleotide polymorphism markers in the doubled haploid mapping population of Avalon × Cadenza. © 2011 The Authors. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Location analysis for the estrogen receptor-α reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements

PubMed Central

Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.

2010-01-01

Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10–20% nucleotide deviation from the canonical ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966
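To make the notion of percent deviation from the canonical ERE concrete, the sketch below scores candidate 13-bp windows against the textbook palindrome GGTCAnnnTGACC. How the authors computed deviation is not stated in the abstract, so the scoring here is an assumption: mismatches are counted over the 10 specified half-site positions, the 3-bp spacer is ignored, and only the given strand is scanned. Function names are hypothetical.

```python
CANONICAL_ERE = "GGTCANNNTGACC"  # N marks the unscored 3-bp spacer


def ere_deviation(site):
    """Fraction of the 10 half-site positions that differ from the canonical ERE."""
    if len(site) != len(CANONICAL_ERE):
        raise ValueError("site must be 13 nucleotides long")
    scored = [(c, s) for c, s in zip(CANONICAL_ERE, site.upper()) if c != "N"]
    mismatches = sum(1 for c, s in scored if c != s)
    return mismatches / len(scored)


def scan_for_eres(sequence, max_deviation):
    """Return (start, window, deviation) for windows at or below the deviation cutoff."""
    width = len(CANONICAL_ERE)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        deviation = ere_deviation(window)
        if deviation <= max_deviation:
            hits.append((start, window, deviation))
    return hits


print(ere_deviation("GGTCACAGTGACA"))
print(scan_for_eres("TTGGTCACTGTGACCAA", 0.0))
```

Under this scoring, each mismatch contributes 10% deviation, so a cutoff between 0.1 and 0.2 admits sites with one or two half-site mismatches.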
Location analysis for the estrogen receptor-alpha reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements.

PubMed

Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B

2010-04-01

Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10-20% nucleotide deviation from the canonical ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.
Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

PubMed

Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

2014-07-01

Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Crimean-Congo Hemorrhagic Fever

DTIC Science & Technology

2004-01-01

aminocaproic acid were also indicated. Much emphasis was also placed on preventing reinfection, including the necessity of removing blood crusts from...The sequence is approximately 60% identical both at the nucleotide and amino acid levels to the L segment of Dugbe virus, the only other Nairovirus...However, more recent data based on nucleic acid sequence analysis have revealed extensive genetic diversity. The first published CCHFV sequence

Polymorphic amplified typing sequences (PATS) and pulsed-field gel electrophoresis (PFGE) yield comparable results in the strain typing of a diverse set of bovine Escherichia coli O157 isolates

USDA-ARS's Scientific Manuscript database

The PCR-based Escherichia coli O157 (O157) strain typing system, Polymorphic Amplified Typing Sequences (PATS), targets insertions-deletions (Indels) and single nucleotide polymorphisms (SNPs) at the XbaI and AvrII(BlnI) restriction enzyme sites, respectively, besides amplifying four known virulenc...
How Much Do rRNA Gene Surveys Underestimate Extant Bacterial Diversity?

PubMed

Rodriguez-R, Luis M; Castro, Juan C; Kyrpides, Nikos C; Cole, James R; Tiedje, James M; Konstantinidis, Konstantinos T

2018-03-15

The most common practice in studying and cataloguing prokaryotic diversity involves the grouping of sequences into operational taxonomic units (OTUs) at the 97% 16S rRNA gene sequence identity level, often using partial gene sequences, such as PCR-generated amplicons. Due to the high sequence conservation of rRNA genes, organisms belonging to closely related yet distinct species may be grouped under the same OTU. However, it remains unclear how much diversity has been underestimated by this practice. To address this question, we compared the OTUs of genomes defined at the 97% or 98.5% 16S rRNA gene identity level against OTUs of the same genomes defined at the 95% whole-genome average nucleotide identity (ANI), which is a much more accurate proxy for species. Our results show that OTUs resulting from a 98.5% 16S rRNA gene identity cutoff are more accurate than 97% compared to 95% ANI (90.5% versus 89.9% accuracy) but indistinguishable from any other threshold in the 98.29 to 98.78% range. Even with the more stringent thresholds, however, the 16S rRNA gene-based approach commonly underestimates the number of OTUs by ∼12%, on average, compared to the ANI-based approach (∼14% underestimation when using the 97% identity threshold). More importantly, the degree of underestimation can become 50% or more for certain taxa, such as the genera Pseudomonas, Burkholderia, Escherichia, Campylobacter, and Citrobacter. These results provide a quantitative view of the degree of underestimation of extant prokaryotic diversity by 16S rRNA gene-defined OTUs and suggest that genomic resolution is often necessary. IMPORTANCE Species diversity is one of the most fundamental pieces of information for community ecology and conservational biology. Therefore, accurate proxies for what constitutes a species or unit of diversity are cornerstones of a large set of microbial ecology and diversity studies.
The most common proxies currently used rely on the clustering of 16S rRNA gene sequences at some threshold of nucleotide identity, typically 97% or 98.5%. Here, we explore how well this strategy reflects the more accurate whole-genome-based proxies and determine the frequency with which the high conservation of 16S rRNA sequences masks substantial species-level diversity. Copyright © 2018 American Society for Microbiology.
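To illustrate why the identity cutoff changes the OTU count, the following minimal Python sketch clusters pre-aligned, equal-length sequences with a simple greedy centroid scheme. It is an illustration only: the function names `identity` and `greedy_otus` are hypothetical, the greedy scheme is a swapped-in simplification, and nothing here is taken from the authors' pipeline.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(seqs, threshold):
    """Assign each sequence to the first centroid it matches at >= threshold.

    A sequence that matches no existing centroid founds a new OTU.
    This is a simplified greedy scheme chosen for illustration (an assumption).
    """
    centroids = []
    clusters = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            # No centroid was close enough: start a new OTU.
            centroids.append(s)
            clusters.append([s])
    return clusters


# Toy data (hypothetical): three 100-nt sequences at 98% and 90% identity to the first.
s1 = "A" * 100
s2 = "C" * 2 + "A" * 98
s3 = "C" * 10 + "A" * 90
otus_97 = greedy_otus([s1, s2, s3], 0.97)
otus_985 = greedy_otus([s1, s2, s3], 0.985)
```

With these toy sequences, the 97% cutoff merges the first two sequences into one OTU, whereas the 98.5% cutoff keeps all three separate, which mirrors the direction of the effect described in the abstract.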
PCR Primers for Metazoan Nuclear 18S and 28S Ribosomal DNA Sequences

PubMed Central

Machida, Ryuji J.; Knowlton, Nancy

2012-01-01

Background Metagenetic analyses, which amplify and sequence target marker DNA regions from environmental samples, are increasingly employed to assess the biodiversity of communities of small organisms. Using this approach, our understanding of microbial diversity has expanded greatly. In contrast, only a few studies using this approach to characterize metazoan diversity have been reported, despite the fact that many metazoan species are small and difficult to identify or are undescribed. One of the reasons for this discrepancy is the availability of universal primers for the target taxa. In microbial studies, analysis of the 16S ribosomal DNA is standard. In contrast, the best gene for metazoan metagenetics is less clear. In the present study, we have designed primers that amplify the nuclear 18S and 28S ribosomal DNA sequences of most metazoan species with the goal of providing effective approaches for metagenetic analyses of metazoan diversity in environmental samples, with a particular emphasis on marine biodiversity. Methodology/Principal Findings Conserved regions suitable for designing PCR primers were identified using 14,503 and 1,072 metazoan sequences of the nuclear 18S and 28S rDNA regions, respectively. The sequence similarity of both these newly designed and the previously reported primers to the target regions of these primers were compared for each phylum to determine the expected amplification efficacy. The nucleotide diversity of the flanking regions of the primers was also estimated for genera or higher taxonomic groups of 11 phyla to determine the variable regions within the genes. Conclusions/Significance The identified nuclear ribosomal DNA primers (five primer pairs for 18S and eleven for 28S) and the results of the nucleotide diversity analyses provide options for primer combinations for metazoan metagenetic analyses. 
Additionally, advantages and disadvantages of not only the 18S and 28S ribosomal DNA, but also other marker regions as targets for metazoan metagenetic analyses, are discussed. PMID:23049971
A-to-I RNA Editing Contributes to Proteomic Diversity in Cancer. | Office of Cancer Genomics

Cancer.gov

Adenosine (A) to inosine (I) RNA editing introduces many nucleotide changes in cancer transcriptomes. However, due to the complexity of post-transcriptional regulation, the contribution of RNA editing to proteomic diversity in human cancers remains unclear. Here, we performed an integrated analysis of TCGA genomic data and CPTAC proteomic data. Despite limited site diversity, we demonstrate that A-to-I RNA editing contributes to proteomic diversity in breast cancer through changes in amino acid sequences. We validate the presence of editing events at both RNA and protein levels.
Patterns of nucleotide diversity and phenotypes of two domestication related genes (OsC1 and Wx) in indigenous rice varieties in Northeast India

PubMed Central

2014-01-01

Background During the domestication of crops, individual plants with traits desirable for human needs have been selected from their wild progenitors. Consequently, genetic and nucleotide diversity of genes associated with these selected traits in crop plants are expected to be lower than in their wild progenitors. In the present study, we surveyed the pattern of nucleotide diversity of two selected trait-specific genes, Wx and OsC1, which regulate amylose content and apiculus coloration respectively in cultivated rice varieties. The analyzed samples were collected from a wide geographic area in Northeast (NE) India, and included contrasting phenotypes considered to be associated with selected genes, namely glutinous and nonglutinous grains and colored and colorless apiculus. Results No statistically significant selection signatures were detected in either Wx or OsC1 gene sequences. However, a low level of selection that varied across the length of each gene was evident. The glutinous type varieties showed higher levels of nucleotide diversity at the Wx locus (πtot = 0.0053) than nonglutinous type varieties (πtot = 0.0043). The OsC1 gene revealed low levels of selection among the colorless apiculus varieties with lower nucleotide diversity (πtot = 0.0010) than in the colored apiculus varieties (πtot = 0.0023). Conclusions The results revealed that functional mutations at Wx and OsC1 genes considered to be associated with specific phenotypes do not necessarily correspond to the phenotypes in indigenous rice varieties in NE India. This suggests that genomic regions other than those previously reported may also be involved in determination of these phenotypes. PMID:24935343
A new family of satellite DNA sequences as a major component of centromeric heterochromatin in owls (Strigiformes).

PubMed

Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi

2004-03-01

We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl (Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
Construction of nested genetic core collections to optimize the exploitation of natural diversity in Vitis vinifera L. subsp. sativa

PubMed Central

Le Cunff, Loïc; Fournier-Level, Alexandre; Laucou, Valérie; Vezzulli, Silvia; Lacombe, Thierry; Adam-Blondon, Anne-Françoise; Boursiquot, Jean-Michel; This, Patrice

2008-01-01

Background The first high quality draft of the grape genome sequence has just been published. This is a critical step in accessing all the genes of this species and increases the chances of exploiting the natural genetic diversity through association genetics. However, our basic knowledge of the extent of allelic variation within the species is still not sufficient. Towards this goal, we constructed nested genetic core collections (G-cores) to capture the simple sequence repeat (SSR) diversity of the grape cultivated compartment (Vitis vinifera L. subsp. sativa) from the world's largest germplasm collection (Domaine de Vassal, INRA Hérault, France), containing 2262 unique genotypes. Results Sub-samples of 12, 24, 48 and 92 varieties of V. vinifera L. were selected based on their genotypes for 20 SSR markers using the M-strategy. They represent respectively 58%, 73%, 83% and 100% of total SSR diversity. The capture of allelic diversity was analyzed by sequencing three genes scattered throughout the genome on 233 individuals: 41 single nucleotide polymorphisms (SNPs) were identified using the G-92 core (one SNP for every 49 nucleotides) while only 25 were observed using a larger sample of 141 individuals selected on the basis of 50 morphological traits, thus demonstrating the reliability of the approach. Conclusion The G-12 and G-24 core collections displayed 78% and 88% of the SNPs, respectively, and are therefore of great interest for SNP discovery studies. Furthermore, the nested genetic core collections satisfactorily reflected the geographic and the genetic diversity of grape, which are also of great interest for the study of gene evolution in this species. PMID:18384667
Sequence analysis of porcine kobuvirus VP1 region detected in pigs in Japan and Thailand.

PubMed

Okitsu, Shoko; Khamrin, Pattara; Thongprachum, Aksara; Hidaka, Satoshi; Kongkaew, Sompreeya; Kongkaew, Apisek; Maneekarn, Niwat; Mizuguchi, Masashi; Hayakawa, Satoshi; Ushijima, Hiroshi

2012-04-01

Porcine kobuvirus is a new candidate species of the genus Kobuvirus in the family Picornaviridae, and information is still limited. The identification of porcine kobuvirus has been performed by sequence analyses of the 3D region of the viruses. Therefore, the purpose of this study was to characterize the molecular properties of VP1 nucleotide sequences of the porcine kobuviruses isolated from porcine stool samples in Japan during 2009 and Thailand between 2006 and 2008. In addition, a previously identified unique porcine kobuvirus, the Japanese strain H023/2009/JP, which is a bovine kobuvirus-like strain based on sequence analysis of the 3D region, was also included in this study. All of the strains were amplified by the VP1-specific primer pair; the amplicons were subjected to direct sequencing and compared with the VP1 nucleotide sequences of reference strains. The VP1 sequences of strains from the GenBank database revealed high nucleotide sequence identity at 84.3-100%. On the other hand, the nucleotide identities among the 15 porcine kobuvirus strains analyzed in this study ranged from 78.8 to 99.8%. The results revealed that the diversity of the strains in this study was higher than that of the strains in previous studies. Furthermore, it was found that the VP1 region of the bovine kobuvirus-like strain, H023/2009/JP, clustered with nine porcine kobuvirus strains that were isolated in Thailand and Japan. Since this strain was previously found to be closely related to bovine kobuviruses in the 3D gene region, it may be a natural recombinant.
New chloroplast microsatellite markers suitable for assessing genetic diversity of Lolium perenne and other related grass species

PubMed Central

Diekmann, Kerstin; Hodkinson, Trevor R.; Barth, Susanne

2012-01-01

Background and Aims Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne ‘Cashel’. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species. Methods Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species. Key Results All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A8 mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively. 
Conclusions The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to their ability to monitor genetic diversity within breeding pools, to trace maternal inheritance and to distinguish closely related species. PMID:22419761
New chloroplast microsatellite markers suitable for assessing genetic diversity of Lolium perenne and other related grass species.

PubMed

Diekmann, Kerstin; Hodkinson, Trevor R; Barth, Susanne

2012-11-01

Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne 'Cashel'. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species. Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species. All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A(8) mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively. 
The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to their ability to monitor genetic diversity within breeding pools, to trace maternal inheritance and to distinguish closely related species.
Microsatellite genotyping and genome-wide single nucleotide polymorphism-based indices of Plasmodium falciparum diversity within clinical infections.

PubMed

Murray, Lee; Mobegi, Victor A; Duffy, Craig W; Assefa, Samuel A; Kwiatkowski, Dominic P; Laman, Eugene; Loua, Kovana M; Conway, David J

2016-05-12

In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples. PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (Fws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs. As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide Fws fixation index (r = -0.88, P < 0.001). However, the microsatellite analysis revealed that most isolates contained mixed genotypes, even those that had no detectable genome sequence heterogeneity. Random sampling of different numbers of SNPs showed that an Fws index derived from ten or more SNPs with minor allele frequencies of >10% had high correlation (r > 0.90) with the index derived using all SNPs. Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (Fws).
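As a rough illustration of how a within-isolate fixation index behaves as an inverse measure of diversity, the Python sketch below uses the simplified form Fws = 1 - Hw/Hs, where Hw is the mean expected heterozygosity within one isolate and Hs is the mean in the local population. The formula in this ratio-of-means form, the function names, and the toy allele frequencies are all assumptions for illustration; the abstract does not give the calculation, and published implementations may differ (for example, by grouping SNPs into minor allele frequency bins).

```python
def expected_heterozygosity(allele_freqs):
    """H = 1 - sum(p_i^2) for the allele frequencies at a single SNP."""
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_heterozygosity(per_snp_freqs):
    """Average expected heterozygosity across a list of SNPs."""
    if not per_snp_freqs:
        raise ValueError("at least one SNP is required")
    return sum(expected_heterozygosity(f) for f in per_snp_freqs) / len(per_snp_freqs)


def fws(within_isolate_freqs, population_freqs):
    """Simplified within-isolate fixation index: 1 - Hw / Hs (assumed form)."""
    hs = mean_heterozygosity(population_freqs)
    if hs == 0:
        raise ValueError("population heterozygosity is zero; index is undefined")
    hw = mean_heterozygosity(within_isolate_freqs)
    return 1.0 - hw / hs


# Toy data (hypothetical): two biallelic SNPs, each at 50/50 in the population.
population = [(0.5, 0.5), (0.5, 0.5)]
clonal_isolate = [(1.0, 0.0), (0.0, 1.0)]   # one allele per SNP, no within-isolate diversity
mixed_isolate = [(0.5, 0.5), (0.5, 0.5)]    # as diverse as the population itself

fws_clonal = fws(clonal_isolate, population)
fws_mixed = fws(mixed_isolate, population)
```

Under these assumptions a clonal infection gives an index of 1 and an infection as diverse as the local population gives 0, consistent with the abstract's description of the index as inversely related to within-isolate diversity.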
Molecular characterization of the 17D-204 yellow fever vaccine.

PubMed

Salmona, Maud; Gazaignes, Sandrine; Mercier-Delarue, Severine; Garnier, Fabienne; Korimbocus, Jehanara; Colin de Verdière, Nathalie; LeGoff, Jerome; Roques, Pierre; Simon, François

2015-10-05

The worldwide use of yellow fever (YF) live attenuated vaccines came recently under close scrutiny as rare but serious adverse events have been reported. The populations identified as being at major risk for these safety issues were those at the extremes of age and immunocompromised subjects. Study NCT01426243 conducted by the French National Agency for AIDS research is an ongoing interventional study to evaluate the safety of the vaccine and the specific immune responses in HIV-infected patients following 17D-204 vaccination. As a preliminary study, we characterized the molecular diversity from E gene of the single 17D-204 vaccine batch used in this clinical study. Eight vials of lyophilized 17D-204 vaccine (Stamaril, Sanofi-Pasteur, Lyon, France) of the E5499 batch were reconstituted for viral quantification, cloning and sequencing of C/prM/E region. The average rate of virions per vial was 8.68 ± 0.07 log₁₀ genome equivalents with a low coefficient of variation (0.81%). 246 sequences of the C/prM/E region (29-33 per vial) were generated and analyzed for the eight vials, 25 (10%) being defective and excluded from analyses. 95% of sequences had at least one nucleotide mutation. The mutations were observed on 662 variant sites distributed throughout the 1995-nucleotide sequence and were mainly non-synonymous (66%). Genome variability between vaccine vials was highly homogeneous with a nucleotide distance ranging from 0.29% to 0.41%. Average p-distances observed for each vial were also homogeneous, ranging from 0.15% to 0.31%. This study showed a homogeneous YF virus RNA quantity in vaccine vials within a single lot and a low clonal diversity between and within vaccine vials. These results are consistent with a recent study showing that the main mechanism of attenuation resulted in the loss of diversity in the YF virus quasi-species. Copyright © 2015 Elsevier Ltd. All rights reserved.
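The p-distance mentioned above is the proportion of compared sites at which two aligned sequences differ. The Python sketch below shows one way an average p-distance for a set of clones could be computed; the function names, the choice to skip gap and N positions, and the toy sequences are assumptions made for illustration and are not taken from the study's methods.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences.

    Positions where either sequence has a gap ('-') or 'N' are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def mean_pairwise_p_distance(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical): three 4-nt clones.
clones = ["AAAA", "AAAT", "AATT"]
single = p_distance("ACGT", "ACGA")
average = mean_pairwise_p_distance(clones)
```

In the toy alignment the three pairwise distances are 0.25, 0.5 and 0.25, so the average is one third; real clone sets would be compared over the full 1995-nucleotide region.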
Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

PubMed Central

Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

2015-01-01

Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559
Diversity and phylogenetic relationships among Bartonella strains from Thai bats.

PubMed

McKee, Clifton D; Kosoy, Michael Y; Bai, Ying; Osikowicz, Lynn M; Franka, Richard; Gilbert, Amy T; Boonmar, Sumalee; Rupprecht, Charles E; Peruski, Leonard F

2017-01-01

Bartonellae are phylogenetically diverse, intracellular bacteria commonly found in mammals. Previous studies have demonstrated that bats have a high prevalence and diversity of Bartonella infections globally. Isolates (n = 42) were obtained from five bat species in four provinces of Thailand and analyzed using sequences of the citrate synthase gene (gltA). Sequences clustered into seven distinct genogroups; four of these genogroups displayed similarity with Bartonella spp. sequences from other bats in Southeast Asia, Africa, and Eastern Europe. Thirty of the isolates representing these seven genogroups were further characterized by sequencing four additional loci (ftsZ, nuoG, rpoB, and ITS) to clarify their evolutionary relationships with other Bartonella species and to assess patterns of diversity among strains. Among the seven genogroups, there were differences in the number of sequence variants, ranging from 1-5, and the amount of nucleotide divergence, ranging from 0.035-3.9%. Overall, these seven genogroups meet the criteria for distinction as novel Bartonella species, with sequence divergence among genogroups ranging from 6.4-15.8%. Evidence of intra- and intercontinental phylogenetic relationships and instances of homologous recombination among Bartonella genogroups in related bat species were found in Thai bats.
Genetic diversity and epidemiology of infectious hematopoietic necrosis virus in Alaska

USGS Publications Warehouse

Emmenegger, E.G.; Meyers, T.R.; Burton, T.O.; Kurath, G.

2000-01-01

Forty-two infectious hematopoietic necrosis virus (IHNV) isolates from Alaska were analyzed using the ribonuclease protection assay (RPA) and nucleotide sequencing. RPA analyses, utilizing 4 probes, N5, N3 (N gene), GF (G gene), and NV (NV gene), determined that the haplotypes of all 3 genes demonstrated a consistent spatial pattern. Virus isolates belonging to the most common haplotype groups were distributed throughout Alaska, whereas isolates in small haplotype groups were obtained from only 1 site (hatchery, lake, etc.). The temporal pattern of the GF haplotypes suggested a 'genetic acclimation' of the G gene, possibly due to positive selection on the glycoprotein. A pairwise comparison of the sequence data determined that the maximum nucleotide diversity of the isolates was 2.75% (10 mismatches) for the NV gene, and 1.99% (6 mismatches) for a 301 base pair region of the G gene, indicating that the genetic diversity of IHNV within Alaska is notably lower than in the more southern portions of the IHNV North American range. Phylogenetic analysis of representative Alaskan sequences and sequences of 12 previously characterized IHNV strains from Washington, Oregon, Idaho, California (USA) and British Columbia (Canada) distinguished the isolates into clusters that correlated with geographic origin and indicated that the Alaskan and British Columbia isolates may have a common viral ancestral lineage. Comparisons of multiple isolates from the same site provided epidemiological insights into viral transmission patterns and indicated that viral evolution, viral introduction, and genetic stasis were the mechanisms involved with IHN virus population dynamics in Alaska. The examples of genetic stasis and the overall low sequence heterogeneity of the Alaskan isolates suggested that they are evolutionarily constrained. This study establishes a baseline of genetic fingerprint patterns and sequence groups representing the genetic diversity of Alaskan IHNV isolates. 
This information could be used to determine the source of an IHN outbreak and to facilitate decisions in fisheries management of Alaskan salmonid stocks.
High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

PubMed Central

Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

2015-01-01

The leachate generated by the decomposition of animal carcasses has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1-, 2-, 3-, 4-, 6- and 14-week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442
Phylogenetic Diversity of NTT Nucleotide Transport Proteins in Free-Living and Parasitic Bacteria and Eukaryotes

PubMed Central

Major, Peter; Embley, T. Martin

2017-01-01

Plasma membrane-located nucleotide transport proteins (NTTs) underpin the lifestyle of important obligate intracellular bacterial and eukaryotic pathogens by importing energy and nucleotides from infected host cells that the pathogens can no longer make for themselves. As such their presence is often seen as a hallmark of an intracellular lifestyle associated with reductive genome evolution and loss of primary biosynthetic pathways. Here, we investigate the phylogenetic distribution of NTT sequences across the domains of cellular life. Our analysis reveals an unexpectedly broad distribution of NTT genes in both host-associated and free-living prokaryotes and eukaryotes. We also identify cases of within-bacteria and bacteria-to-eukaryote horizontal NTT transfer, including into the base of the oomycetes, a major clade of parasitic eukaryotes. In addition to identifying sequences that retain the canonical NTT structure, we detected NTT gene fusions with HEAT-repeat and cyclic nucleotide binding domains in Cyanobacteria, pathogenic Chlamydiae and Oomycetes. Our results suggest that NTTs are versatile functional modules with a much wider distribution and a broader range of potential roles than has previously been appreciated. PMID:28164241
Diversity of virus-host systems in hypersaline Lake Retba, Senegal.

PubMed

Sime-Ngando, Télesphore; Lucas, Soizick; Robin, Agnès; Tucker, Kimberly Pause; Colombet, Jonathan; Bettarel, Yvan; Desmond, Elie; Gribaldo, Simonetta; Forterre, Patrick; Breitbart, Mya; Prangishvili, David

2011-08-01

Remarkable morphological diversity of virus-like particles was observed by transmission electron microscopy in a hypersaline water sample from Lake Retba, Senegal. The majority of particles morphologically resembled hyperthermophilic archaeal DNA viruses isolated from extreme geothermal environments. Some hypersaline viral morphotypes have not been previously observed in nature, and less than 1% of observed particles had a head-and-tail morphology, which is typical for bacterial DNA viruses. Culture-independent analysis of the microbial diversity in the sample suggested the dominance of extremely halophilic archaea. Few of the 16S sequences corresponded to known archaeal genera (Haloquadratum, Halorubrum and Natronomonas), whereas the majority represented novel archaeal clades. Three sequences corresponded to a new basal lineage of the haloarchaea. Bacteria belonged to four major phyla, consistent with the known diversity in saline environments. Metagenomic sequencing of DNA from the purified virus-like particles revealed very few similarities to the NCBI non-redundant database at either the nucleotide or amino acid level. Some of the identifiable virus sequences were most similar to previously described haloarchaeal viruses, but no sequence similarities were found to archaeal viruses from extreme geothermal environments. A large proportion of the sequences had similarity to previously sequenced viral metagenomes from solar salterns. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.
Sequence Variability and Geographic Distribution of Lassa Virus, Sierra Leone

PubMed Central

Stockelman, Michael G.; Moses, Lina M.; Park, Matthew; Stenger, David A.; Ansumana, Rashid; Bausch, Daniel G.; Lin, Baochuan

2015-01-01

Lassa virus (LASV) is endemic to parts of West Africa and causes highly fatal hemorrhagic fever. The multimammate rat (Mastomys natalensis) is the only known reservoir of LASV. Most human infections result from zoonotic transmission. The very diverse LASV genome has 4 major lineages associated with different geographic locations. We used reverse transcription PCR and resequencing microarrays to detect LASV in 41 of 214 samples from rodents captured at 8 locations in Sierra Leone. Phylogenetic analysis of partial sequences of nucleoprotein (NP), glycoprotein precursor (GPC), and polymerase (L) genes showed 5 separate clades within lineage IV of LASV in this country. The sequence diversity was higher than previously observed; mean diversity was 7.01% for nucleoprotein gene at the nucleotide level. These results may have major implications for designing diagnostic tests and therapeutic agents for LASV infections in Sierra Leone. PMID:25811712
Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity

PubMed Central

Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen

2015-01-01

The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952

Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

PubMed

Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

2015-04-28

The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.
Miniprimer PCR, a New Lens for Viewing the Microbial World

PubMed Central

Isenbarger, Thomas A.; Finney, Michael; Ríos-Velázquez, Carlos; Handelsman, Jo; Ruvkun, Gary

2008-01-01

Molecular methods based on the 16S rRNA gene sequence are used widely in microbial ecology to reveal the diversity of microbial populations in environmental samples. Here we show that a new PCR method using an engineered polymerase and 10-nucleotide “miniprimers” expands the scope of detectable sequences beyond those detected by standard methods using longer primers and Taq polymerase. After testing the method in silico to identify divergent ribosomal genes in previously cloned environmental sequences, we applied the method to soil and microbial mat samples, which revealed novel 16S rRNA gene sequences that would not have been detected with standard primers. Deeply divergent sequences were discovered with high frequency and included representatives that define two new division-level taxa, designated CR1 and CR2, suggesting that miniprimer PCR may reveal new dimensions of microbial diversity. PMID:18083877
Characterization, genetic diversity, and evolutionary link of Cucumber mosaic virus strain New Delhi from India.

PubMed

Koundal, Vikas; Haq, Qazi Mohd Rizwanul; Praveen, Shelly

2011-02-01

The genome of Cucumber mosaic virus New Delhi strain (CMV-ND) from India, obtained from tomato, was completely sequenced and compared with full genome sequences of 14 known CMV strains from subgroups I and II, for their genetic diversity. Sequence analysis suggests CMV-ND shares maximum sequence identity at the nucleotide level with a CMV strain from Taiwan. Among all 15 strains of CMV, the encoded protein 2b is least conserved, whereas the coat protein (CP) is most conserved. Sequence identity values and phylogram results indicate that CMV-ND belongs to subgroup I. Based on the recombination detection program result, it appears that CMV is prone to recombination, and different RNA components of CMV-ND have evolved differently. Recombinational analysis of all 15 CMV strains detected maximum recombination breakpoints in RNA2; CP showed the least recombination sites.
CPm gene diversity in field isolates of Citrus tristeza virus from Colombia.

PubMed

Oliveros-Garay, Oscar Arturo; Martinez-Salazar, Natalhie; Torres-Ruiz, Yanneth; Acosta, Orlando

2009-01-01

The nucleotide sequence diversity of the CPm gene from 28 field isolates of Citrus tristeza virus (CTV) was assessed by SSCP and sequence analyses. These isolates showed two major shared haplotypes, which differed in distribution: A1 was the major haplotype in 23 isolates from different geographic regions, whereas R1 was found in isolates from a discrete region. Phylogenetic reconstruction clustered A1 within an independent group, while R1 was grouped with mild isolates T30 from Florida and T385 from Spain. Some isolates contained several minor haplotypes, which were very similar to, and associated with, the major haplotype.
Genetic diversity analysis of Leuconostoc mesenteroides from Korean vegetables and food products by multilocus sequence typing.

PubMed

Sharma, Anshul; Kaur, Jasmine; Lee, Sulhee; Park, Young-Seo

2018-06-01

In the present study, 35 Leuconostoc mesenteroides strains isolated from vegetables and food products from South Korea were studied by multilocus sequence typing (MLST) of seven housekeeping genes (atpA, groEL, gyrB, pheS, pyrG, rpoA, and uvrC). The fragments amplified from the seven housekeeping genes ranged in length from 366 to 1414 bp. Sequence analysis indicated 27 different sequence types (STs); 25 of them were each represented by a single strain, indicating high genetic diversity, whereas the remaining 2 were each represented by five strains. In total, 220 polymorphic nucleotide sites were detected among the seven housekeeping genes. The phylogenetic analysis based on the STs of the seven loci indicated that the 35 strains belonged to two major groups, A (28 strains) and B (7 strains). Split decomposition analysis showed that intraspecies recombination played a role in generating diversity among strains. The minimum spanning tree showed that the evolution of the STs was not correlated with food source. This study shows that multilocus sequence typing is a valuable tool to assess the genetic diversity among L. mesenteroides strains from South Korea and can be used further to monitor evolutionary changes.
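As an illustration of how sequence types arise from allele profiles in an MLST scheme, the following minimal Python sketch numbers the distinct alleles at each locus and gives every distinct combination of allele numbers its own ST. The function name, the toy sequences, and the locally assigned numbering are assumptions made for illustration; they are not the authors' pipeline and do not correspond to any public MLST database.

```python
def assign_sequence_types(strains):
    """Assign a sequence type (ST) to each strain.

    strains: dict mapping strain name -> dict mapping locus -> allele sequence.
    Allele numbers and ST numbers are assigned locally, in order of first
    appearance (hypothetical numbering, not a curated MLST database).
    """
    loci = sorted({locus for alleles in strains.values() for locus in alleles})
    allele_ids = {locus: {} for locus in loci}   # locus -> {sequence: allele number}
    profile_to_st = {}                           # allele profile -> ST number
    result = {}
    for name, alleles in strains.items():
        profile = []
        for locus in loci:
            seq = alleles[locus]
            ids = allele_ids[locus]
            if seq not in ids:
                ids[seq] = len(ids) + 1          # new allele at this locus
            profile.append(ids[seq])
        profile = tuple(profile)
        if profile not in profile_to_st:
            profile_to_st[profile] = len(profile_to_st) + 1  # new ST
        result[name] = profile_to_st[profile]
    return result


# Toy data: two loci only, made-up sequences.
toy = {
    "s1": {"atpA": "AAA", "gyrB": "CCC"},
    "s2": {"atpA": "AAA", "gyrB": "CCC"},
    "s3": {"atpA": "AAT", "gyrB": "CCC"},
}
sts = assign_sequence_types(toy)
```

Strains sharing identical alleles at every locus receive the same ST, so a collection in which most STs contain a single strain is one with many distinct allele combinations.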
Genetic diversity assessment of anoxygenic photosynthetic bacteria by distance-based grouping analysis of pufM sequences.

PubMed

Zeng, Y H; Chen, X H; Jiao, N Z

2007-12-01

This study aimed to assess how completely the diversity of anoxygenic phototrophic bacteria (APB) has been sampled in natural environments. All nucleotide sequences of the APB marker gene pufM from cultures and environmental clones were retrieved from the GenBank database. A set of cutoff values (sequence distances 0.06, 0.15 and 0.48 for species, genus, and (sub)phylum levels, respectively) was established using a distance-based grouping program. Analysis of the environmental clones revealed that current efforts on APB isolation and sampling in natural environments are largely inadequate. Analysis of the average distance between each identified genus and an uncultured environmental pufM sequence indicated that the majority of cultured APB genera lack environmental representatives. The distance-based grouping method is fast and efficient for bulk analysis of functional gene sequences. The results clearly show that we are at a relatively early stage in sampling the global richness of APB species. Periodic assessment will undoubtedly facilitate in-depth analysis of the potential biogeographical distribution patterns of APB. This is the first attempt to assess the present understanding of APB diversity in natural environments. The method used is also useful for assessing the diversity of other functional genes.
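The cutoff values quoted above can be read as a simple decision rule on a pairwise sequence distance. The sketch below is only an illustration: the abstract does not state which distance measure the grouping program used or how boundary values were treated, so the uncorrected p-distance and the "less than or equal to" comparison are assumptions, as are the function names.

```python
# Cutoffs quoted in the abstract, from the narrowest to the broadest level.
CUTOFFS = [(0.06, "species"), (0.15, "genus"), (0.48, "(sub)phylum")]


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences.

    Assumption: an uncorrected p-distance stands in for whatever distance
    the original grouping program computed.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def shared_level(distance):
    """Return the narrowest level whose cutoff the distance does not exceed."""
    for cutoff, level in CUTOFFS:
        if distance <= cutoff:   # boundary handling is an assumption
            return level
    return None                  # more distant than the (sub)phylum cutoff
```

For example, two sequences at distance 0.10 would fall within the same genus but not the same species under this rule.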
A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection

PubMed Central

Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike

2018-01-01

ABSTRACT Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. 
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection. PMID:29564396
A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

PubMed

Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

2018-01-01

Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. 
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection.
Generation of diversity in Streptococcus mutans genes demonstrated by MLST.

PubMed

Do, Thuy; Gilbert, Steven C; Clark, Douglas; Ali, Farida; Fatturi Parolo, Clarissa C; Maltz, Marisa; Russell, Roy R; Holbrook, Peter; Wade, William G; Beighton, David

2010-02-05

Streptococcus mutans, consisting of serotypes c, e, f and k, is an oral aciduric organism associated with the initiation and progression of dental caries. A total of 135 independent Streptococcus mutans strains from caries-free and caries-active subjects isolated from various geographical locations were examined in two versions of an MLST scheme consisting of either 6 housekeeping genes [accC (acetyl-CoA carboxylase biotin carboxylase subunit), gki (glucokinase), lepA (GTP-binding protein), recP (transketolase), sodA (superoxide dismutase), and tyrS (tyrosyl-tRNA synthetase)] or the housekeeping genes supplemented with 2 extracellular putative virulence genes [gtfB (glucosyltransferase B) and spaP (surface protein antigen I/II)] to increase sequence type diversity. The number of alleles found varied between 20 (lepA) and 37 (spaP). Overall, 121 sequence types (STs) were defined using the housekeeping genes alone and 122 with all genes. However, pi, the nucleotide diversity per site, was low for all loci, ranging from 0.007 to 0.019. The virulence genes exhibited the greatest nucleotide diversity, and the recombination/mutation ratio was 0.67 [95% confidence interval 0.3-1.15] compared to 8.3 [95% confidence interval 5.0-14.5] for the 6 concatenated housekeeping genes alone. The ML trees generated for individual MLST loci were significantly incongruent and not significantly different from random trees. Analysis using ClonalFrame indicated that the majority of isolates were singletons, and there was no apparent evidence for a clonal structure or to support serotype c strains as the ancestral S. mutans strain. There was also no evidence of a geographical distribution of individual isolates or that particular isolate clusters were associated with caries. The overall low sequence diversity suggests that S. mutans is a newly emerged species which has not accumulated large numbers of mutations, but those that have occurred have been shuffled as a consequence of intra-species recombination, generating genotypes which can be readily distinguished by sequence analysis.
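For readers unfamiliar with pi, nucleotide diversity per site is commonly defined as the average number of differences per site between all pairs of sequences in a sample. The sketch below implements that textbook definition for aligned, gap-free sequences of equal length; the abstract does not say which estimator or software was used, so this is an assumed, simplified form rather than the authors' calculation.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumptions: all sequences have the same length, contain no gaps or
    ambiguous bases, and every sequence in the sample is weighted equally.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment: 3 sequences, 4 sites, 4 pairwise differences in total.
pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

In the toy alignment the three pairs differ at 1, 2 and 1 sites, so pi is 4 differences over 3 pairs and 4 sites, or one third.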
CRISPR regulation of intraspecies diversification by limiting IS transposition and intercellular recombination.

PubMed

Watanabe, Takayasu; Nozawa, Takashi; Aikawa, Chihiro; Amano, Atsuo; Maruyama, Fumito; Nakagawa, Ichiro

2013-01-01

Mobile genetic elements (MGEs) and genetic rearrangement are considered as major driving forces of bacterial diversification. Previous comparative genome analysis of Porphyromonas gingivalis, a pathogen related to periodontitis, implied such an important relationship. As a counterpart system to MGEs, clustered regularly interspaced short palindromic repeats (CRISPRs) in bacteria may be useful for genetic typing. We found that CRISPR typing could be a reasonable alternative to conventional methods for characterizing phylogenetic relationships among 60 highly diverse P. gingivalis isolates. Examination of genetic recombination along with multilocus sequence typing suggests the importance of such events between different isolates. MGEs appear to be strategically located at the breakpoint gaps of complicated genome rearrangements. Of these MGEs, insertion sequences (ISs) were found most frequently. CRISPR analysis identified 2,150 spacers that were clustered into 1,187 unique ones. Most of these spacers exhibited no significant nucleotide similarity to known sequences (97.6%: 1,158/1,187). Surprisingly, CRISPR spacers exhibiting high nucleotide similarity to regions of P. gingivalis genomes including ISs were predominant. The proportion of such spacers to all the unique spacers (1.6%: 19/1,187) was the highest among previous studies, suggesting novel functions for these CRISPRs. These results indicate that P. gingivalis is a bacterium with high intraspecies diversity caused by frequent insertion sequence (IS) transposition, whereas both the introduction of foreign DNA, primarily from other P. gingivalis cells, and IS transposition are limited by CRISPR interference. It is suggested that P. gingivalis CRISPRs could be an important source for understanding the role of CRISPRs in the development of bacterial diversity.
Complete sequence and diversity of a maize-associated Polerovirus in East Africa.

PubMed

Massawe, Deogracious P; Stewart, Lucy R; Kamatenesi, Jovia; Asiimwe, Theodore; Redinbaugh, Margaret G

2018-06-01

Since 2011-2012, Maize lethal necrosis (MLN) has emerged in East Africa, causing massive yield loss and propelling research to identify viruses and virus populations present in maize. As expected, next generation sequencing (NGS) has revealed diverse and abundant viruses from the family Potyviridae, primarily sugarcane mosaic virus (SCMV), and maize chlorotic mottle virus (MCMV) (Tombusviridae), which are known to cause MLN by synergistic co-infection. In addition to these expected viruses, we identified a virus in the genus Polerovirus (family Luteoviridae) in 104/172 samples selected for MLN or other potential virus symptoms from Kenya, Uganda, Rwanda, and Tanzania. This polerovirus (MF974579) nucleotide sequence is 97% identical to maize-associated viruses recently reported in China, termed 'maize yellow mosaic virus' (MaYMV) and maize yellow dwarf virus (MaYMV; KU291101, KU291107, MYDV-RMV2; KT992824); and 99% identical to MaYMV (KY684356) infecting sugarcane and itch grass in Nigeria; 83% identical to a barley-associated polerovirus recently identified in Korea (BVG; KT962089); and 79% identical to the U.S. maize-infecting polerovirus maize yellow dwarf virus (MYDV-RMV; KT992824). Nucleotide sequences from ORF0 of 20 individual East African isolates collected from Kenya, Uganda, Rwanda, and Tanzania shared 98% or higher identity, and were detected in 104/172 (60.5%) of samples collected for virus-like symptoms, indicating extensive prevalence but limited diversity of this virus in East Africa. We refer to this virus as "MYDV-like polerovirus" until symptoms of the virus in maize are known.
DNA Sequence Polymorphism of the Lactate Dehydrogenase Gene from Iranian Plasmodium vivax and Plasmodium falciparum Isolates.

PubMed

Getacher Feleke, Daniel; Nateghpour, Mehdi; Motevalli Haghi, Afsaneh; Hajjaran, Homa; Farivar, Leila; Mohebali, Mehdi; Raoofian, Reza

2015-01-01

Parasite lactate dehydrogenase (pLDH) is extensively employed in malaria rapid diagnostic tests (RDTs). Moreover, it is a well-known drug target candidate. However, the genetic diversity of this gene might influence the performance of RDT kits and its drug target candidacy. This study aimed to determine the polymorphism of the pLDH gene from Iranian isolates of P. vivax and P. falciparum. Genomic DNA was extracted from whole blood of microscopically confirmed P. vivax and P. falciparum infected patients. The pLDH gene of P. falciparum and P. vivax was amplified using conventional PCR from 43 symptomatic malaria patients from Sistan and Baluchistan Province, Southeast Iran, from 2012 to 2013. Sequence analysis of 15 P. vivax LDH sequences showed that fourteen had 100% identity with P. vivax Sal-1 and Belem strains. Two nucleotide substitutions were detected, only one of which resulted in an amino acid change. Analysis of P. falciparum LDH sequences showed that six of the seven sequences had 100% homology with P. falciparum 3D7 and Mzr-1. Moreover, PfLDH displayed three nucleotide changes that resulted in changing only one amino acid. PvLDH and PfLDH showed 75%-76% nucleotide and 90.4%-90.76% amino acid homology. The pLDH gene from Iranian P. falciparum and P. vivax isolates displayed 98.8-100% homology with 1-3 nucleotide substitutions, indicating that this gene was relatively conserved. Additional studies are needed to determine whether this genetic variation can influence the performance of pLDH-based RDTs.
Combined hairpin-antisense compositions and methods for modulating expression

DOEpatents

Shanklin, John; Nguyen, Tam

2014-08-05

A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.
Combined hairpin-antisense compositions and methods for modulating expression

DOEpatents

Shanklin, John; Nguyen, Tam Huu

2015-11-24

A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.
Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana.

PubMed

Aguadé, M

2001-01-01

The FAH1 and F3H genes encode ferulate-5-hydroxylase and flavanone-3-hydroxylase, which are enzymes in the pathways leading to the synthesis of sinapic acid esters and flavonoids, respectively. Nucleotide variation at these genes was surveyed by sequencing a sample of 20 worldwide Arabidopsis thaliana ecotypes and one Arabidopsis lyrata spp. petraea stock. In contrast with most previously studied genes, the percentage of singletons was rather low in both the FAH1 and the F3H gene regions. There was, therefore, no footprint of a recent species expansion in the pattern of nucleotide variation in these regions. In both FAH1 and F3H, nucleotide variation was structured into two major highly differentiated haplotypes. In both genes, there was a peak of silent polymorphism in the 5' part of the coding region without a parallel increase in silent divergence. In FAH1, the peak was centered at the beginning of the second exon. In F3H, nucleotide diversity was highest at the beginning of the gene. The observed pattern of variation in both FAH1 and F3H, although suggestive of balancing selection, was compatible with a neutral model with no recombination.
An Outbreak of Respiratory Tularemia Caused by Diverse Clones of Francisella tularensis

PubMed Central

Johansson, Anders; Lärkeryd, Adrian; Widerström, Micael; Mörtberg, Sara; Myrtännäs, Kerstin; Öhrman, Caroline; Birdsell, Dawn; Keim, Paul; Wagner, David M.; Forsman, Mats; Larsson, Pär

2014-01-01

Background. The bacterium Francisella tularensis is recognized for its virulence, infectivity, genetic homogeneity, and potential as a bioterrorism agent. Outbreaks of respiratory tularemia, caused by inhalation of this bacterium, are poorly understood. Such outbreaks are exceedingly rare, and F. tularensis is seldom recovered from clinical specimens. Methods. A localized outbreak of tularemia in Sweden was investigated. Sixty-seven humans contracted laboratory-verified respiratory tularemia. F. tularensis subspecies holarctica was isolated from the blood or pleural fluid of 10 individuals from July to September 2010. Using whole-genome sequencing and analysis of single-nucleotide polymorphisms (SNPs), outbreak isolates were compared with 110 archived global isolates. Results. There were 757 SNPs among the genomes of the 10 outbreak isolates and the 25 most closely related archival isolates (all from Sweden/Finland). Whole genomes of outbreak isolates were >99.9% similar at the nucleotide level and clustered into 3 distinct genetic clades. Unexpectedly, high-sequence similarity grouped some outbreak and archival isolates that originated from patients from different geographic regions and up to 10 years apart. Outbreak and archival genomes frequently differed by only 1–3 of 1 585 229 examined nucleotides. Conclusions. The outbreak was caused by diverse clones of F. tularensis that occurred concomitantly, were widespread, and apparently persisted in the environment. Multiple independent acquisitions of F. tularensis from the environment over a short time period suggest that natural outbreaks of respiratory tularemia are triggered by environmental cues. The findings additionally caution against interpreting genome sequence identity for this pathogen as proof of a direct epidemiological link. PMID:25097081
Sequence diversity of wheat mosaic virus isolates.

PubMed

Stewart, Lucy R

2016-02-02

Wheat mosaic virus (WMoV), transmitted by eriophyid wheat curl mites (Aceria tosichella) is the causal agent of High Plains disease in wheat and maize. WMoV and other members of the genus Emaravirus evaded thorough molecular characterization for many years due to the experimental challenges of mite transmission and manipulating multisegmented negative sense RNA genomes. Recently, the complete genome sequence of a Nebraska isolate of WMoV revealed eight segments, plus a variant sequence of the nucleocapsid protein-encoding segment. Here, near-complete and partial consensus sequences of five more WMoV isolates are reported and compared to the Nebraska isolate: an Ohio maize isolate (GG1), a Kansas barley isolate (KS7), and three Ohio wheat isolates (H1, K1, W1). Results show two distinct groups of WMoV isolates: Ohio wheat isolate RNA segments had 84% or lower nucleotide sequence identity to the NE isolate, whereas GG1 and KS7 had 98% or higher nucleotide sequence identity to the NE isolate. Knowledge of the sequence variability of WMoV isolates is a step toward understanding virus biology, and potentially explaining observed biological variation. Published by Elsevier B.V.
Templated sequence insertion polymorphisms in the human genome

NASA Astrophysics Data System (ADS)

Onozawa, Masahiro; Aplan, Peter

2016-11-01

Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.
Draft genome sequence of Cicer reticulatum L., the wild progenitor of chickpea provides a resource for agronomic trait improvement.

PubMed

Gupta, Sonal; Nawaz, Kashif; Parween, Sabiha; Roy, Riti; Sahu, Kamlesh; Kumar Pole, Anil; Khandal, Hitaishi; Srivastava, Rishi; Kumar Parida, Swarup; Chattopadhyay, Debasis

2017-02-01

Cicer reticulatum L. is the wild progenitor of the fourth most important legume crop, chickpea (C. arietinum L.). We assembled short-read sequences into a 416 Mb draft genome of C. reticulatum and anchored 78% (327 Mb) of this assembly to eight linkage groups. Genome annotation predicted 25,680 protein-coding genes covering more than 90% of predicted gene space. The genome assembly shared substantial synteny and conservation of gene order with the genome of the model legume Medicago truncatula. Resistance gene homologs of wild and domesticated chickpeas showed high sequence homology and conserved synteny. Comparison of gene sequences and nucleotide diversity using 66 wild and domesticated chickpea accessions suggested that the desi type chickpea was genetically closer to the wild species than the kabuli type. Comparative analyses predicted gene flow between the wild and the cultivated species during domestication. Molecular diversity and population genetic structure determination using 15,096 genome-wide single nucleotide polymorphisms revealed an admixed domestication pattern among cultivated (desi and kabuli) and wild chickpea accessions belonging to three population groups, reflecting a significant influence of parentage or geographical origin on their cultivar-specific population classification. The assembly and the polymorphic sequence resources presented here would facilitate the study of chickpea domestication and targeted use of wild Cicer germplasms for agronomic trait improvement in chickpea. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Population genetic implications from sequence variation in four Y chromosome genes.

PubMed

Shen, P; Wang, F; Underhill, P A; Franco, C; Yang, W H; Roxas, A; Sung, R; Lin, A A; Hyman, R W; Vollrath, D; Davis, R W; Cavalli-Sforza, L L; Oefner, P J

2000-06-20

Some insight into human evolution has been gained from the sequencing of four Y chromosome genes. Primary genomic sequencing determined gene SMCY to be composed of 27 exons that comprise 4,620 bp of coding sequence. The unfinished sequencing of the 5' portion of gene UTY1 was completed by primer walking, and a total of 20 exons were found. By using denaturing HPLC, these two genes, as well as DBY and DFFRY, were screened for polymorphic sites in 53-72 representatives of the five continents. A total of 98 variants were found, yielding nucleotide diversity estimates of 2.45 x 10(-5), 5.07 x 10(-5), and 8.54 x 10(-5) for the coding regions of SMCY, DFFRY, and UTY1, respectively, with no variant having been observed in DBY. In agreement with most autosomal genes, diversity estimates for the noncoding regions were about 2- to 3-fold higher and ranged from 9.16 x 10(-5) to 14.2 x 10(-5) for the four genes. Analysis of the frequencies of derived alleles for all four genes showed that they more closely fit the expectation of a Luria-Delbrück distribution than a distribution expected under a constant population size model, providing evidence for exponential population growth. Pairwise nucleotide mismatch distributions date the occurrence of population expansion to approximately 28,000 years ago. This estimate is in accord with the spread of Aurignacian technology and the disappearance of the Neanderthals.
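A pairwise mismatch distribution is the histogram of the number of differing sites over all pairs of sequences in a sample. The short Python sketch below shows only how such a histogram is tabulated from aligned sequences; the toy data are invented, and dating an expansion from the distribution additionally requires a mutation rate and a demographic model that are not shown here.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Count how many sequence pairs differ at 0, 1, 2, ... sites.

    Assumes aligned, equal-length sequences without gaps.
    Returns a dict {number of differences: number of pairs}, sorted by key.
    """
    counts = Counter(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    return dict(sorted(counts.items()))


# Toy alignment (made-up sequences): pairs differ at 1, 2 and 1 sites.
dist = mismatch_distribution(["AAAA", "AAAT", "AATT"])
```

A smooth, single-peaked distribution is the pattern usually associated with past population growth, whereas a ragged one is expected under long-term constant size.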

Identification and characterization of long non-coding RNAs in rainbow trout eggs

USDA-ARS's Scientific Manuscript database

Long non-coding RNAs (lncRNAs) are generally considered a diverse class of transcripts longer than 200 nucleotides that structurally resemble mRNAs but do not encode proteins. Recent advances in RNA sequencing (RNA-Seq) and bioinformatics methods have provided an opportunity to identify and ana...
Rate of de novo mutations and the importance of father's age to disease risk.

PubMed

Kong, Augustine; Frigge, Michael L; Masson, Gisli; Besenbacher, Soren; Sulem, Patrick; Magnusson, Gisli; Gudjonsson, Sigurjon A; Sigurdsson, Asgeir; Jonasdottir, Aslaug; Jonasdottir, Adalbjorg; Wong, Wendy S W; Sigurdsson, Gunnar; Walters, G Bragi; Steinberg, Stacy; Helgason, Hannes; Thorleifsson, Gudmar; Gudbjartsson, Daniel F; Helgason, Agnar; Magnusson, Olafur Th; Thorsteinsdottir, Unnur; Stefansson, Kari

2012-08-23

Mutations generate sequence diversity and provide a substrate for selection. The rate of de novo mutations is therefore of major importance to evolution. Here we conduct a study of genome-wide mutation rates by sequencing the entire genomes of 78 Icelandic parent-offspring trios at high coverage. We show that in our samples, with an average father's age of 29.7, the average de novo mutation rate is 1.20 × 10(-8) per nucleotide per generation. Most notably, the diversity in mutation rate of single nucleotide polymorphisms is dominated by the age of the father at conception of the child. The effect is an increase of about two mutations per year. An exponential model estimates paternal mutations doubling every 16.5 years. After accounting for random Poisson variation, father's age is estimated to explain nearly all of the remaining variation in the de novo mutation counts. These observations shed light on the importance of the father's age on the risk of diseases such as schizophrenia and autism.
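The two paternal-age effects quoted in this abstract can be compared with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it uses just the two figures quoted above (about two additional mutations per year of father's age, and paternal mutations doubling every 16.5 years), not the authors' fitted model.

```python
# Figures taken from the abstract; everything else is an illustrative assumption.
DOUBLING_TIME_YEARS = 16.5   # exponential model: paternal mutations double every 16.5 years
LINEAR_SLOPE_PER_YEAR = 2.0  # linear effect: about two additional mutations per year


def exponential_fold_change(age_from: float, age_to: float) -> float:
    """Fold change in paternal mutations between two paternal ages under doubling."""
    return 2.0 ** ((age_to - age_from) / DOUBLING_TIME_YEARS)


def linear_extra_mutations(age_from: float, age_to: float) -> float:
    """Additional de novo mutations expected under the linear effect."""
    return LINEAR_SLOPE_PER_YEAR * (age_to - age_from)


if __name__ == "__main__":
    print(exponential_fold_change(20.0, 40.0))  # fold change over 20 years
    print(linear_extra_mutations(20.0, 40.0))   # extra mutations over 20 years
```

Under these assumptions, a 16.5-year difference in father's age corresponds to a two-fold change in paternal mutations by construction, while the linear effect adds roughly 33 mutations over the same interval.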
Genetic Diversity among Clostridium botulinum Strains Harboring bont/A2 and bont/A3 Genes

PubMed Central

Raphael, Brian H.; Joseph, Lavin A.; Meno, Sarah R.; Fernández, Rafael A.; Maslanka, Susan E.

2012-01-01

Clostridium botulinum type A strains are known to be genetically diverse and widespread throughout the world. Genetic diversity studies have focused mainly on strains harboring one type A botulinum toxin gene, bont/A1, although all reported bont/A gene variants have been associated with botulism cases. Our study provides insight into the genetic diversity of C. botulinum type A strains, which contain bont/A2 (n = 42) and bont/A3 (n = 4) genes, isolated from diverse samples and geographic origins. Genetic diversity was assessed by using bont nucleotide sequencing, content analysis of the bont gene clusters, multilocus sequence typing (MLST), and pulsed-field gel electrophoresis (PFGE). Sequences of bont genes obtained in this study showed 99.9 to 100% identity with other bont/A2 or bont/A3 gene sequences available in public databases. The neurotoxin gene clusters of the subtype A2 and A3 strains analyzed in this study were similar in gene content. C. botulinum strains harboring bont/A2 and bont/A3 genes were divided into six and two MLST profiles, respectively. Four groups of strains shared a similarity of at least 95% by PFGE; the largest group included 21 out of 46 strains. The strains analyzed in this study showed relatively limited genetic diversity using either MLST or PFGE. PMID:23042179
Theileria parva antigens recognized by CD8+ T cells show varying degrees of diversity in buffalo-derived infected cell lines.

PubMed

Sitt, Tatjana; Pelle, Roger; Chepkwony, Maurine; Morrison, W Ivan; Toye, Philip

2018-05-06

The extent of sequence diversity among the genes encoding 10 antigens (Tp1-10) known to be recognized by CD8+ T lymphocytes from cattle immune to Theileria parva was analysed. The sequences were derived from parasites in 23 buffalo-derived cell lines, three cattle-derived isolates and one cloned cell line obtained from a buffalo-derived stabilate. The results revealed substantial variation among the antigens through sequence diversity. The greatest nucleotide and amino acid diversity were observed in Tp1, Tp2 and Tp9. Tp5 and Tp7 showed the least amount of allelic diversity, and Tp5, Tp6 and Tp7 had the lowest levels of protein diversity. Tp6 was the most conserved protein; only a single non-synonymous substitution was found in all obtained sequences. The ratio of non-synonymous: synonymous substitutions varied from 0.84 (Tp1) to 0.04 (Tp6). Apart from Tp2 and Tp9, we observed no variation in the other defined CD8+ T cell epitopes (Tp4, 5, 7 and 8), indicating that epitope variation is not a universal feature of T. parva antigens. In addition to providing markers that can be used to examine the diversity in T. parva populations, the results highlight the potential for using conserved antigens to develop vaccines that provide broad protection against T. parva.
Elaeis oleifera Genomic-SSR Markers: Exploitation in Oil Palm Germplasm Diversity and Cross-Amplification in Arecaceae

PubMed Central

Zaki, Noorhariza Mohd; Singh, Rajinder; Rosli, Rozana; Ismail, Ismanizan

2012-01-01

Species-specific simple sequence repeat (SSR) markers are favored for genetic studies and marker-assisted selection (MAS) breeding for oil palm genetic improvement. This report characterizes 20 SSR markers from an Elaeis oleifera genomic library (gSSR). Characterization of the repeat type in 2000 sequences revealed a high percentage of di-nucleotides (63.6%), followed by tri-nucleotides (24.2%). Primer pairs were successfully designed for 394 of the E. oleifera gSSRs. Subsequent analysis showed the ability of the 20 selected E. oleifera gSSR markers to reveal genetic diversity in the genus Elaeis. The average Polymorphism Information Content (PIC) value for the SSRs was 0.402, with the tri-repeats showing the highest average PIC (0.626). Low values of observed heterozygosity (Ho) (0.164) and highly positive fixation indices (Fis) in the E. oleifera germplasm collection, compared to E. guineensis, indicated an excess of homozygosity in E. oleifera. The transferability of the markers to closely related palms, Elaeis guineensis, Cocos nucifera and ornamental palms, is also reported. Sequencing the amplicons of three selected E. oleifera gSSRs across both species and palm taxa revealed variations in the repeat units. The study showed the potential of E. oleifera gSSR markers to reveal genetic diversity in the genus Elaeis. The markers are also a valuable genetic resource for studying E. oleifera and other genera in the Arecaceae family. PMID:22605966
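The marker statistics named in this abstract (PIC, heterozygosity and Fis) can be illustrated with their common textbook definitions. This is a minimal sketch under that assumption: the abstract does not state which estimators or software the authors used, the function names are hypothetical, and the allele frequencies in the usage lines are made up.

```python
from itertools import combinations


def expected_heterozygosity(freqs: list[float]) -> float:
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs: list[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def fixation_index(observed_het: float, expected_het: float) -> float:
    """Fis = 1 - Ho / He; positive values indicate an excess of homozygotes."""
    return 1.0 - observed_het / expected_het


if __name__ == "__main__":
    freqs = [0.5, 0.5]  # hypothetical allele frequencies, not data from the study
    print(polymorphism_information_content(freqs))
    print(fixation_index(0.25, expected_heterozygosity(freqs)))
```

With these definitions, an observed heterozygosity well below the expected value gives a positive Fis, which is the pattern the abstract reports for the E. oleifera collection.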
Characterization of MHC-I in the blue tit (Cyanistes caeruleus) reveals low levels of genetic diversity and trans-population evolution across European populations.

PubMed

Schut, Elske; Aguilar, Juan Rivero-de; Merino, Santiago; Magrath, Michael J L; Komdeur, Jan; Westerdahl, Helena

2011-08-01

The major histocompatibility complex (MHC) is a vital component of the adaptive immune system in all vertebrates. This study is the first to characterize MHC class I (MHC-I) in blue tits (Cyanistes caeruleus), and we use MHC-I exon 3 sequence data from individuals originating from three locations across Europe: Spain, the Netherlands and Sweden. Our phylogeny of the 17 blue tit MHC-I alleles contains one allele cluster with low nucleotide diversity compared to the remaining more diverse alleles. We found significant evidence for balancing selection in the peptide-binding region in the diverse allele group only. No separation according to geographic location was found in the phylogeny of alleles. Although the number of MHC-I loci of the blue tit is comparable to that of other passerine species, the nucleotide diversity of MHC-I appears to be much lower than that of other passerine species, including the closely related great tit (Parus major) and the severely inbred Seychelles warbler (Acrocephalus sechellensis). We believe that this initial MHC-I characterization in blue tits provides an important step towards understanding the mechanisms shaping MHC-I diversity in natural populations.
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server

PubMed Central

Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles

2015-01-01

The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960
Population diversity of Diaphorina citri (Hemiptera: Liviidae) in China based on whole mitochondrial genome sequences.

PubMed

Wu, Fengnian; Jiang, Hongyan; Beattie, G Andrew C; Holford, Paul; Chen, Jianchi; Wallis, Christopher M; Zheng, Zheng; Deng, Xiaoling; Cen, Yijing

2018-04-24

Diaphorina citri (Asian citrus psyllid; ACP) transmits 'Candidatus Liberibacter asiaticus' associated with citrus Huanglongbing (HLB). ACP has been reported in 11 provinces/regions in China, yet its population diversity remains unclear. In this study, we evaluated ACP population diversity in China using representative whole mitochondrial genome (mitogenome) sequences. Additional mitogenome sequences outside China were also acquired and evaluated. The sizes of the 27 ACP mitogenome sequences ranged from 14 986 to 15 030 bp. Along with three previously published mitogenome sequences, the 30 sequences formed three major mitochondrial groups (MGs): MG1, present in southwestern China and occurring at elevations above 1000 m; MG2, present in southeastern China and Southeast Asia (Cambodia, Indonesia, Malaysia, and Vietnam) and occurring at elevations below 180 m; and MG3, present in the USA and Pakistan. Single nucleotide polymorphisms in five genes (cox2, atp8, nad3, nad1 and rrnL) contributed most to ACP diversity. Among these genes, rrnL had the most variation. Mitogenome sequence analyses revealed two major phylogenetic groups of ACP present in China as well as a possible unique group currently present in Pakistan and the USA. The information could have significant implications for current ACP control and HLB management. © 2018 Society of Chemical Industry.
Genome sequence, comparative analysis and haplotype structure of the domestic dog.

PubMed

Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

2005-12-08

Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Distribution and molecular diversity of three cucurbit-infecting poleroviruses in China.

PubMed

Shang, Qiao-xia; Xiang, Hai-ying; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

2009-11-01

Cucurbit aphid-borne yellows virus (CABYV) and Melon aphid-borne yellows virus (MABYV) have been found to be associated with cucurbit yellowing disease in China. Our report identifies for the first time a third distinct polerovirus, tentatively named Suakwa aphid-borne yellows virus (SABYV), infecting Suakwa vegetable sponge. To better understand the distribution and molecular diversity of these three poleroviruses infecting cucurbits, a total of 214 cucurbitaceous crop samples were collected from 25 provinces in China, and were investigated by RT-PCR and sequencing. Of these, 108 samples tested positive for CABYV, while 40 samples from five provinces were positive for MABYV, and SABYV was detected in only 4 samples which were collected in the southern part of China. Forty-one PCR-amplified fragments containing a portion of the RdRp gene, intergenic NCR and CP gene were cloned and sequenced. Sequence comparisons showed that CABYV isolates shared 78.0-79.2% nucleotide sequence identity with MABYV isolates, and 69.7-70.8% with SABYV. Sequence identity between MABYV and SABYV was 73.3-76.5%. In contrast, the nucleotide identities within each species were 93.2-98.7% (CABYV), 98.1-99.9% (MABYV), and 96.1-98.6% (SABYV). Phylogenetic analyses revealed that the polerovirus isolates fit into three distinct groups, corresponding to the three species. The CABYV group could be further divided into two subgroups: the Asia subgroup and the Mediterranean subgroup, based on CP gene and partial RdRp gene sequences. Recombination analysis suggested that MABYV may be a recombinant virus.
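The percent nucleotide identities reported above are pairwise comparisons of aligned sequences. The following is a minimal sketch of one common way to compute such a value. It assumes the two sequences are already aligned to equal length and that gap columns are excluded from the comparison, which is only one of several conventions; the abstract does not say which tool or convention the authors used, and the function name is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: columns with a gap in either sequence are skipped
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences for illustration, not polerovirus data.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Counting gap columns as mismatches instead would lower the reported identity, so the convention should be stated whenever values such as these are compared across studies.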
Phylogeographic reconstruction of a bacterial species with high levels of lateral gene transfer

USGS Publications Warehouse

Pearson, T.; Giffard, P.; Beckstrom-Sternberg, S.; Auerbach, R.; Hornstra, H.; Tuanyok, A.; Price, E.P.; Glass, M.B.; Leadem, B.; Beckstrom-Sternberg, J. S.; Allan, G.J.; Foster, J.T.; Wagner, D.M.; Okinaka, R.T.; Sim, S.H.; Pearson, O.; Wu, Z.; Chang, J.; Kaul, R.; Hoffmaster, A.R.; Brettin, T.S.; Robison, R.A.; Mayo, M.; Gee, J.E.; Tan, P.; Currie, B.J.; Keim, P.

2009-01-01

Background: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer shows that the relative contributions of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction. Results: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to biogeographic patterns found in many plant and animal species. That is, separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia. 
Conclusion: We describe an Australian origin for B. pseudomallei, characterized by a single introduction event into Southeast Asia during a recent glacial period, and variable levels of lateral gene transfer within populations. These patterns provide insights into mechanisms of genetic diversification in B. pseudomallei and its closest relatives, and provide a framework for integrating the traditionally separate fields of population genetics and phylogenetics for other bacterial species with high levels of lateral gene transfer. © 2009 Pearson et al; licensee BioMed Central Ltd.
Proteogenomic Investigation of Strain Variation in Clinical Mycobacterium tuberculosis Isolates.

PubMed

Heunis, Tiaan; Dippenaar, Anzaan; Warren, Robin M; van Helden, Paul D; van der Merwe, Ruben G; Gey van Pittius, Nicolaas C; Pain, Arnab; Sampson, Samantha L; Tabb, David L

2017-10-06

Mycobacterium tuberculosis consists of a large number of different strains that display unique virulence characteristics. Whole-genome sequencing has revealed substantial genetic diversity among clinical M. tuberculosis isolates, and elucidating the phenotypic variation encoded by this genetic diversity will be of the utmost importance to fully understand M. tuberculosis biology and pathogenicity. In this study, we integrated whole-genome sequencing and mass spectrometry (GeLC-MS/MS) to reveal strain-specific characteristics in the proteomes of two clinical M. tuberculosis Latin American-Mediterranean isolates. Using this approach, we identified 59 peptides containing single amino acid variants, which covered ∼9% of all coding nonsynonymous single nucleotide variants detected by whole-genome sequencing. Furthermore, we identified 29 distinct peptides that mapped to a hypothetical protein not present in the M. tuberculosis H37Rv reference proteome. Here, we provide evidence for the expression of this protein in the clinical M. tuberculosis SAWC3651 isolate. The strain-specific databases enabled confirmation of genomic differences (i.e., large genomic regions of difference and nonsynonymous single nucleotide variants) in these two clinical M. tuberculosis isolates and allowed strain differentiation at the proteome level. Our results contribute to the growing field of clinical microbial proteogenomics and can improve our understanding of phenotypic variation in clinical M. tuberculosis isolates.
[Genetic characterization of different populations of Rhopilema esculentum based on the mitochondrial COI sequence].

PubMed

Li, Yu Long; Dong, Jing; Wang, Bin; Li, Yi Ping; Yu, Xu Guang; Fu, Jie; Wang, Wen Bo

2016-07-01

To investigate the genetic characteristics and population genetic structure of Rhopilema esculentum, we sequenced the mtDNA COI gene (624 bp) in 56 individuals collected from Liaodong Bay and Ganghwado Island in the estuarine waters of the Han River. In addition, homologous sequences from 15 other individuals sampled from the Bohai and Yellow seas and the Sea of Japan were analyzed. A total of 28 polymorphic nucleotide sites were detected among the 71 individuals, which defined 32 haplotypes. Haplotype diversity levels were high (0.91±0.06-0.94±0.01) in R. esculentum populations, whereas those of nucleotide diversity were moderate to low [(0.60±0.34)%-(0.68±0.40)%]. Compared with several other giant jellyfish species, the variation level of R. esculentum was high. Phylogeographic analysis of the COI region revealed two lineages. Pairwise FST comparisons and hierarchical analysis of molecular variance (AMOVA) showed that significant population structure existed throughout the range of R. esculentum. Life-cycle characteristics, together with possible anthropogenic introduction such as stock enhancement and the prevailing ocean currents in this region, are proposed as the main factors that determined the genetic patterns of R. esculentum.
The History of Bordetella pertussis Genome Evolution Includes Structural Rearrangement

PubMed Central

Peng, Yanhui; Loparev, Vladimir; Batra, Dhwani; Bowden, Katherine E.; Burroughs, Mark; Cassiday, Pamela K.; Davis, Jamie K.; Johnson, Taccara; Juieng, Phalasy; Knipe, Kristen; Mathis, Marsenia H.; Pruitt, Andrea M.; Rowe, Lori; Sheth, Mili; Tondella, M. Lucia; Williams, Margaret M.

2017-01-01

ABSTRACT Despite high pertussis vaccine coverage, reported cases of whooping cough (pertussis) have increased over the last decade in the United States and other developed countries. Although Bordetella pertussis is well known for its limited gene sequence variation, recent advances in long-read sequencing technology have begun to reveal genomic structural heterogeneity among otherwise indistinguishable isolates, even within geographically or temporally defined epidemics. We have compared rearrangements among complete genome assemblies from 257 B. pertussis isolates to examine the potential evolution of the chromosomal structure in a pathogen with minimal gene nucleotide sequence diversity. Discrete changes in gene order were identified that differentiated genomes from vaccine reference strains and clinical isolates of various genotypes, frequently along phylogenetic boundaries defined by single nucleotide polymorphisms. The observed rearrangements were primarily large inversions centered on the replication origin or terminus and flanked by IS481, a mobile genetic element with >240 copies per genome and previously suspected to mediate rearrangements and deletions by homologous recombination. These data illustrate that structural genome evolution in B. pertussis is not limited to reduction but also includes rearrangement. Therefore, although genomes of clinical isolates are structurally diverse, specific changes in gene order are conserved, perhaps due to positive selection, providing novel information for investigating disease resurgence and molecular epidemiology. IMPORTANCE Whooping cough, primarily caused by Bordetella pertussis, has resurged in the United States even though the coverage with pertussis-containing vaccines remains high. The rise in reported cases has included increased disease rates among all vaccinated age groups, provoking questions about the pathogen's evolution. The chromosome of B. pertussis includes a large number of repetitive mobile genetic elements that obstruct genome analysis. However, these mobile elements facilitate large rearrangements that alter the order and orientation of essential protein-encoding genes, which otherwise exhibit little nucleotide sequence diversity. By comparing the complete genome assemblies from 257 isolates, we show that specific rearrangements have been conserved throughout recent evolutionary history, perhaps by eliciting changes in gene expression, which may also provide useful information for molecular epidemiology. PMID:28167525
Genetic diversity of mtDNA D-loop sequences in four native Chinese chicken breeds.

PubMed

Guo, H W; Li, C; Wang, X N; Li, Z J; Sun, G R; Li, G X; Liu, X J; Kang, X T; Han, R L

2017-10-01

1. To explore the genetic diversity of Chinese indigenous chicken breeds, a 585 bp fragment of the mitochondrial DNA (mtDNA) region was sequenced in 102 birds from the Xichuan black-bone chicken, Yunyang black-bone chicken and Lushi chicken. In addition, 30 mtDNA D-loop sequences of Silkie fowls were downloaded from NCBI. The mtDNA D-loop sequence polymorphism and maternal origin of 4 chicken breeds were analysed in this study. 2. The results showed that a total of 33 mutation sites and 28 haplotypes were detected in the 4 chicken breeds. The haplotype diversity and nucleotide diversity of these 4 native breeds were 0.916 ± 0.014 and 0.012 ± 0.002, respectively. Three clusters were formed in 4 Chinese native chickens and 12 reference breeds. Both the Xichuan black-bone chicken and Yunyang black-bone chicken were grouped into one cluster. Four haplogroups (A, B, C and E) emerged in the median-joining network in these breeds. 3. It was concluded that these 4 Chinese chicken breeds had high genetic diversity. The phylogenetic tree and median network profiles showed that Chinese native chickens and its neighbouring countries had at least two maternal origins, one from Yunnan, China and another from Southeast Asia or its surrounding area.
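Haplotype diversity values such as the one reported above are commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The sketch below assumes that estimator; the abstract does not state which formula or software the authors used, the function name is hypothetical, and the haplotype labels in the usage lines are made up.

```python
from collections import Counter


def haplotype_diversity(haplotypes: list[str]) -> float:
    """Nei's unbiased haplotype diversity for a sample of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


if __name__ == "__main__":
    # Toy sample: one label per individual, not data from the study.
    sample = ["A", "A", "B", "B", "C", "E"]
    print(haplotype_diversity(sample))
```

The estimator is 0 when every individual carries the same haplotype and 1 when every individual carries a different one, so values near 0.9 indicate that most randomly chosen pairs of birds differ in haplotype.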
Single nucleotide polymorphism (SNP) discovery in duplicated genomes: intron-primed exon-crossing (IPEC) as a strategy for avoiding amplification of duplicated loci in Atlantic salmon (Salmo salar) and other salmonid fishes

PubMed Central

Ryynänen, Heikki J; Primmer, Craig R

2006-01-01

Background: Single nucleotide polymorphisms (SNPs) represent the most abundant type of DNA variation in the vertebrate genome, and their applications as genetic markers in numerous studies of molecular ecology and conservation of natural populations are emerging. Recent large-scale sequencing projects in several fish species have provided a vast amount of data in public databases, which can be utilized in novel SNP discovery in salmonids. However, the suggested duplicated nature of the salmonid genome may hamper SNP characterization if the primers designed in conserved gene regions amplify multiple loci. Results: Here we introduce a new intron-primed exon-crossing (IPEC) method in an attempt to overcome this duplication problem, and also evaluate different priming methods for SNP discovery in Atlantic salmon (Salmo salar) and other salmonids. A total of 69 loci with differing priming strategies were screened in S. salar, and 27 of these produced ~13 kb of high-quality sequence data consisting of 19 SNPs or indels (one per 680 bp). The SNP frequency and the overall nucleotide diversity (3.99 × 10(-4)) in S. salar were lower than reported in a majority of other organisms, which may suggest a relatively young population history for Atlantic salmon. A subset of primers used in cross-species analyses revealed considerable variation in the SNP frequencies and nucleotide diversities in other salmonids. Conclusion: Sequencing success was significantly higher with the new IPEC primers; thus the total number of loci to screen in order to identify one potential polymorphic site was six times less with this new strategy. Given that duplication may hamper SNP discovery in some species, the IPEC method reported here is an alternative way of identifying novel polymorphisms in such cases. PMID:16872523
Evolution and Diversity in Human Herpes Simplex Virus Genomes

PubMed Central

Gatherer, Derek; Ochoa, Alejandro; Greenbaum, Benjamin; Dolan, Aidan; Bowden, Rory J.; Enquist, Lynn W.; Legendre, Matthieu; Davison, Andrew J.

2014-01-01

Herpes simplex virus 1 (HSV-1) causes a chronic, lifelong infection in >60% of adults. Multiple recent vaccine trials have failed, with viral diversity likely contributing to these failures. To understand HSV-1 diversity better, we comprehensively compared 20 newly sequenced viral genomes from China, Japan, Kenya, and South Korea with six previously sequenced genomes from the United States, Europe, and Japan. In this diverse collection of passaged strains, we found that one-fifth of the newly sequenced members share a gene deletion and one-third exhibit homopolymeric frameshift mutations (HFMs). Individual strains exhibit genotypic and potential phenotypic variation via HFMs, deletions, short sequence repeats, and single-nucleotide polymorphisms, although the protein sequence identity between strains exceeds 90% on average. In the first genome-scale analysis of positive selection in HSV-1, we found signs of selection in specific proteins and residues, including the fusion protein glycoprotein H. We also confirmed previous results suggesting that recombination has occurred with high frequency throughout the HSV-1 genome. Despite this, the HSV-1 strains analyzed clustered by geographic origin during whole-genome distance analysis. These data shed light on likely routes of HSV-1 adaptation to changing environments and will aid in the selection of vaccine antigens that are invariant worldwide. PMID:24227835
Phylogenetic analysis of porcine reproductive and respiratory syndrome virus isolates from Northern Ireland.

PubMed

Smith, Natalie; Power, Ultan F; McKillen, John

2018-05-29

To investigate the genetic diversity of porcine reproductive and respiratory syndrome virus (PRRSV) in Northern Ireland, the ORF5 gene from nine field isolates was sequenced and phylogenetically analysed. The results revealed relatively high diversity amongst isolates, with 87.6-92.2% identity between farms at the nucleotide level and 84.1-93.5% identity at the protein level. Phylogenetic analysis confirmed that all nine isolates belonged to the European (type 1) genotype and formed a cluster within the subtype 1 subgroup. This study provides the first report on PRRSV isolate diversity in Northern Ireland.
Two distinct phylogenetic clades of infectious hematopoietic necrosis virus overlap within the Columbia River basin

USGS Publications Warehouse

Garver, K.A.; Troyer, R.M.; Kurath, G.

2003-01-01

Infectious hematopoietic necrosis virus (IHNV), an aquatic rhabdovirus, causes a highly lethal disease of salmonid fish in North America. To evaluate the genetic diversity of IHNV from throughout the Columbia River basin, excluding the Hagerman Valley, Idaho, the sequences of a 303 nt region of the glycoprotein gene (mid-G) of 120 virus isolates were determined. Sequence comparisons revealed 30 different sequence types, with a maximum nucleotide diversity of 7.3% (22 mismatches) and an intrapopulational nucleotide diversity of 0.018. This indicates that the genetic diversity of IHNV within the Columbia River basin is 3-fold higher than in Alaska, but 2-fold lower than in the Hagerman Valley, Idaho. Phylogenetic analyses separated the Columbia River basin IHNV isolates into 2 major clades, designated U and M. The 2 clades geographically overlapped within the lower Columbia River basin and in the lower Snake River and tributaries, while the upper Columbia River basin had only U clade and the upper Snake River basin had only M clade virus types. These results suggest that there are co-circulating lineages of IHNV present within specific areas of the Columbia River basin. The epidemiological significance of these findings provided insight into viral traffic patterns exhibited by IHNV in the Columbia River basin, with specific relevance to how the Columbia River basin IHNV types were related to those in the Hagerman Valley. These analyses indicate that there have likely been 2 historical events in which Hagerman Valley IHNV types were introduced and became established in the lower Columbia River basin. However, the data also clearly indicate that the Hagerman Valley is not a continuous source of waterborne virus infecting salmonid stocks downstream.
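The maximum divergence quoted in this record can be checked directly from the two numbers given: 22 mismatches across the 303 nt mid-G region. The short sketch below does that arithmetic; the helper name `p_distance` is a hypothetical label for the uncorrected proportion of differing sites, and nothing here reflects the software the authors used.

```python
def p_distance(mismatches, aligned_length):
    """Uncorrected proportion of sites that differ between two aligned sequences."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return mismatches / aligned_length


# Figures quoted in the abstract: 22 mismatches in a 303 nt region.
max_divergence_pct = 100 * p_distance(22, 303)
```

The result is about 7.26%, which rounds to the 7.3% reported.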
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.

A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Genetic diversity and natural selection of Plasmodium knowlesi merozoite surface protein 1 paralog gene in Malaysia.

PubMed

Ahmed, Md Atique; Fauzi, Muh; Han, Eun-Taek

2018-03-14

Human infections due to the monkey malaria parasite Plasmodium knowlesi are on the rise in most Southeast Asian countries, specifically Malaysia. The C-terminal 19 kDa domain of PvMSP1P is a potential vaccine candidate; however, no study has been conducted on the orthologous gene of P. knowlesi. This study investigates the level of polymorphisms, haplotypes and natural selection of full-length pkmsp1p in clinical samples from Malaysia. A total of 36 full-length pkmsp1p sequences along with the reference H-strain and 40 C-terminal pkmsp1p sequences from clinical isolates of Malaysia were downloaded from published genomes. Genetic diversity, polymorphism, haplotype and natural selection were determined using DnaSP 5.10 and MEGA 5.0 software. Genealogical relationships were determined using haplotype network tree in NETWORK software v5.0. Population genetic differentiation index (FST) and population structure of the parasite were determined using Arlequin v3.5 and STRUCTURE v2.3.4 software. Comparison of 36 full-length pkmsp1p sequences along with the H-strain identified 339 SNPs (175 non-synonymous and 164 synonymous substitutions). The nucleotide diversity across the full-length gene was low compared to its ortholog pvmsp1p. The nucleotide diversity was higher toward the N-terminal domains (pkmsp1p-83 and 30) compared to the C-terminal domains (pkmsp1p-38, 33 and 19). Phylogenetic analysis of full-length genes identified 2 distinct clusters of P. knowlesi from Malaysian Borneo. The 40 pkmsp1p-19 sequences showed low polymorphisms with 16 polymorphisms leading to 18 haplotypes. In total there were 10 synonymous and 6 non-synonymous substitutions and 12 cysteine residues were intact within the two EGF domains. Evidence of strong purifying selection was observed within the full-length sequences as well as in all the domains. Shared haplotypes of 40 pkmsp1p-19 were identified within Malaysian Borneo haplotypes.
This study is the first to report on the genetic diversity and natural selection of pkmsp1p. A low level of genetic diversity and strong evidence of negative selection were detected in all the domains of pkmsp1p of P. knowlesi, indicating functional constraints. Shared haplotypes were identified within pkmsp1p-19, highlighting the need for further evaluation using a larger number of clinical samples from Malaysia.
Genetic epidemiology of pharmacogenetic variants in South East Asian Malays using whole-genome sequences.

PubMed

Sivadas, A; Salleh, M Z; Teh, L K; Scaria, V

2017-10-01

Expanding the scope of pharmacogenomic research by including multiple global populations is integral to building robust evidence for its clinical translation. Deep whole-genome sequencing of diverse ethnic populations provides a unique opportunity to study rare and common pharmacogenomic markers that often vary in frequency across populations. In this study, we aim to build a diverse map of pharmacogenetic variants in the South East Asian (SEA) Malay population using deep whole-genome sequences of 100 healthy SEA Malay individuals. We investigated the allelic diversity of potentially deleterious pharmacogenomic variants in the SEA Malay population. Our analysis revealed 227 common and 466 rare potentially functional single nucleotide variants (SNVs) in 437 pharmacogenomic genes involved in drug metabolism, transport and target genes, including 74 novel variants. This study has created one of the most comprehensive maps of pharmacogenetic markers in any population from whole genomes and will hugely benefit pharmacogenomic investigations and drug dosage recommendations in SEA Malays.
Diversity of partial RNA-dependent RNA polymerase gene sequences of soybean blotchy mosaic virus isolates from different host-, geographical- and temporal origins.

PubMed

Strydom, Elrea; Pietersen, Gerhard

2018-05-01

Infection of soybean by the plant cytorhabdovirus soybean blotchy mosaic virus (SbBMV) results in significant yield losses in the temperate, lower-lying soybean production regions of South Africa. A 277 bp portion of the RNA-dependent RNA polymerase gene of 66 SbBMV isolates from different hosts, geographical locations in South Africa, and times of collection (spanning 16 years) was amplified by RT-PCR and sequenced to investigate the genetic diversity of isolates. Phylogenetic reconstruction revealed three main lineages, designated Groups A, B and C, with isolates grouping primarily according to geographic origin. Pairwise nucleotide identities ranged between 85.7% and 100% among all isolates, with isolates in Group A exhibiting the highest degree of sequence identity, and isolates of Groups A and B being more closely related to each other than to those in Group C. This is the first study investigating the genetic diversity of SbBMV.
New Mycobacterium tuberculosis Complex Sublineage, Brazzaville, Congo

PubMed Central

Malm, Sven; Linguissi, Laure S. Ghoma; Tekwu, Emmanuel M.; Vouvoungui, Jeannhey C.; Kohl, Thomas A.; Beckert, Patrick; Sidibe, Anissa; Rüsch-Gerdes, Sabine; Madzou-Laboum, Igor K.; Kwedi, Sylvie; Penlap Beng, Véronique; Frank, Matthias; Ntoumi, Francine

2017-01-01

Tuberculosis is a leading cause of illness and death in Congo. No data are available about the population structure and transmission dynamics of the Mycobacterium tuberculosis complex strains prevalent in this central Africa country. On the basis of single-nucleotide polymorphisms detected by whole-genome sequencing, we phylogenetically characterized 74 MTBC isolates from Brazzaville, the capital of Congo. The diversity of the study population was high; most strains belonged to the Euro-American lineage, which split into Latin American Mediterranean, Uganda I, Uganda II, Haarlem, X type, and a new dominant sublineage named Congo type (n = 26). Thirty strains were grouped in 5 clusters (each within 12 single-nucleotide polymorphisms), from which 23 belonged to the Congo type. High cluster rates and low genomic diversity indicate recent emergence and transmission of the Congo type, a new Euro-American sublineage of MTBC. PMID:28221129
New Mycobacterium tuberculosis Complex Sublineage, Brazzaville, Congo.

PubMed

Malm, Sven; Linguissi, Laure S Ghoma; Tekwu, Emmanuel M; Vouvoungui, Jeannhey C; Kohl, Thomas A; Beckert, Patrick; Sidibe, Anissa; Rüsch-Gerdes, Sabine; Madzou-Laboum, Igor K; Kwedi, Sylvie; Penlap Beng, Véronique; Frank, Matthias; Ntoumi, Francine; Niemann, Stefan

2017-03-01

Tuberculosis is a leading cause of illness and death in Congo. No data are available about the population structure and transmission dynamics of the Mycobacterium tuberculosis complex strains prevalent in this central Africa country. On the basis of single-nucleotide polymorphisms detected by whole-genome sequencing, we phylogenetically characterized 74 MTBC isolates from Brazzaville, the capital of Congo. The diversity of the study population was high; most strains belonged to the Euro-American lineage, which split into Latin American Mediterranean, Uganda I, Uganda II, Haarlem, X type, and a new dominant sublineage named Congo type (n = 26). Thirty strains were grouped in 5 clusters (each within 12 single-nucleotide polymorphisms), from which 23 belonged to the Congo type. High cluster rates and low genomic diversity indicate recent emergence and transmission of the Congo type, a new Euro-American sublineage of MTBC.
Proteopedia: 3D Visualization and Annotation of Transcription Factor-DNA Readout Modes

ERIC Educational Resources Information Center

Dantas Machado, Ana Carolina; Saleebyan, Skyler B.; Holmes, Bailey T.; Karelina, Maria; Tam, Julia; Kim, Sharon Y.; Kim, Keziah H.; Dror, Iris; Hodis, Eran; Martz, Eric; Compeau, Patricia A.; Rohs, Remo

2012-01-01

3D visualization assists in identifying diverse mechanisms of protein-DNA recognition that can be observed for transcription factors and other DNA binding proteins. We used Proteopedia to illustrate transcription factor-DNA readout modes with a focus on DNA shape, which can be a function of either nucleotide sequence (Hox proteins) or base pairing…
LSGermOPA, a custom OPA of 384 EST-derived SNPs for high-throughput lettuce (Lactuca sativa L.) germplasm fingerprinting

USDA-ARS's Scientific Manuscript database

We assessed the genetic diversity and population structure among 148 cultivated lettuce (Lactuca sativa L.) accessions using the high-throughput GoldenGate assay and 384 EST (Expressed Sequence Tag)-derived SNP (single nucleotide polymorphism) markers. A custom OPA (Oligo Pool All), LSGermOPA was fo...
Evolution and Diversity of the Human Hepatitis D Virus Genome

PubMed Central

Huang, Chi-Ruei; Lo, Szecheng J.

2010-01-01

Human hepatitis delta virus (HDV) is the smallest RNA virus in terms of genome size. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources; the HDV genome was eventually constituted through RNA recombination. The genome subsequently diversified through accumulation of mutations selected by interactions between the mutated RNA and proteins with host factors to successfully form the infectious virions. Therefore, we propose that the conservation of the HDV nucleotide sequence is highly related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of large delta antigen (LDAg) have the highest diversity among the regions of the protein-coding sequences, but variants that still retain the biological functionality to interact with the heavy chain of clathrin can be selected and maintained. Since viruses interact with many host factors, including in escaping the host immune response, designing a program to predict RNA genome evolution is a very challenging task. PMID:20204073
Genetic diversity of the captive Asian tapir population in Thailand, based on mitochondrial control region sequence data and the comparison of its nucleotide structure with Brazilian tapir.

PubMed

Muangkram, Yuttamol; Amano, Akira; Wajjwalku, Worawidh; Pinyopummintr, Tanu; Thongtip, Nikorn; Kaolim, Nongnid; Sukmak, Manakorn; Kamolnorranath, Sumate; Siriaroonrat, Boripat; Tipkantha, Wanlaya; Maikaew, Umaporn; Thomas, Warisara; Polsrila, Kanda; Dongsaard, Kwanreaun; Sanannu, Saowaphang; Wattananorrasate, Anuwat

2017-07-01

The Asian tapir (Tapirus indicus) has been classified as Endangered on the IUCN Red List of Threatened Species (2008). Genetic diversity data provide important information for the management of captive breeding and conservation of this species. We analyzed mitochondrial control region (CR) sequences from 37 captive Asian tapirs in Thailand. Multiple alignment showed that the full-length CR sequences (1268 bp) comprised three domains, as described in other mammal species. Analysis of 16 parsimony-informative variable sites revealed 11 haplotypes. Furthermore, the phylogenetic analysis using a median-joining network clearly showed three clades that correlated with our earlier cytochrome b gene study in this endangered species. The repetitive motif is located between the first and second conserved sequence blocks, similar to the Brazilian tapir. The most polymorphic site was located in the extended termination associated sequences domain. The results could be applied to future genetic management, both in captivity and in the wild, aimed at maintaining stable populations.
DNA sequence variation of wild barley Hordeum spontaneum (L.) across environmental gradients in Israel

PubMed Central

Bedada, G; Westerbergh, A; Nevo, E; Korol, A; Schmid, K J

2014-01-01

Wild barley Hordeum spontaneum (L.) shows a wide geographic distribution and ecological diversity. A key question concerns the spatial scale at which genetic differentiation occurs and to what extent it is driven by natural selection. The Levant region exhibits a strong ecological gradient along the North–South axis, with numerous small canyons in an East–West direction and with small-scale environmental gradients on the opposing North- and South-facing slopes. We sequenced 34 short genomic regions in 54 accessions of wild barley collected throughout Israel and from the opposing slopes of two canyons. The nucleotide diversity of the total sample is 0.0042, which is about two-thirds of a sample from the whole species range (0.0060). Thirty accessions collected at ‘Evolution Canyon’ (EC) at Nahal Oren, close to Haifa, have a nucleotide diversity of 0.0036, and therefore harbor a large proportion of the genetic diversity. There is a high level of genetic clustering throughout Israel and within EC, which roughly differentiates the slopes. Accessions from the hot and dry South-facing slope have significantly reduced genetic diversity and are genetically more distinct from accessions from the North-facing slope, which are more similar to accessions from other regions in Northern Israel. Statistical population models indicate that wild barley within the EC consists of three separate genetic clusters with substantial gene flow. The data indicate a high level of population structure at large and small geographic scales that shows isolation-by-distance, and is also consistent with ongoing natural selection contributing to genetic differentiation at a small geographic scale. PMID:24619177
Genomic Epidemiology of Salmonella enterica Serotype Enteritidis based on Population Structure of Prevalent Lineages

PubMed Central

Desai, Prerak T.; den Bakker, Henk C.; Mikoleit, Matthew; Tolar, Beth; Trees, Eija; Hendriksen, Rene S.; Frye, Jonathan G.; Porwollik, Steffen; Weimer, Bart C.; Wiedmann, Martin; Weinstock, George M.; Fields, Patricia I.; McClelland, Michael

2014-01-01

Salmonella enterica serotype Enteritidis is one of the most commonly reported causes of human salmonellosis. Its low genetic diversity, measured by fingerprinting methods, has made subtyping a challenge. We used whole-genome sequencing to characterize 125 S. enterica Enteritidis and 3 S. enterica serotype Nitra strains. Single-nucleotide polymorphisms were filtered to identify 4,887 reliable loci that distinguished all isolates from each other. Our whole-genome single-nucleotide polymorphism typing approach was robust for S. enterica Enteritidis subtyping with combined data for different strains from 2 different sequencing platforms. Five major genetic lineages were recognized, which revealed possible patterns of geographic and epidemiologic distribution. Analyses on the population dynamics and evolutionary history estimated that major lineages emerged during the 17th–18th centuries and diversified during the 1920s and 1950s. PMID:25147968
Analysis of Facultative Lithotroph Distribution and Diversity on Volcanic Deposits by Use of the Large Subunit of Ribulose 1,5-Bisphosphate Carboxylase/Oxygenase†

PubMed Central

Nanba, K.; King, G. M.; Dunfield, K.

2004-01-01

A 492- to 495-bp fragment of the gene coding for the large subunit of the form I ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) (rbcL) was amplified by PCR from facultatively lithotrophic aerobic CO-oxidizing bacteria, colorless and purple sulfide-oxidizing microbial mats, and genomic DNA extracts from tephra and ash deposits from Kilauea volcano, for which atmospheric CO and hydrogen have been previously documented as important substrates. PCR products from the mats and volcanic sites were used to construct rbcL clone libraries. Phylogenetic analyses showed that the rbcL sequences from all isolates clustered with form IC rbcL sequences derived from facultative lithotrophs. In contrast, the microbial mat clone sequences clustered with sequences from obligate lithotrophs representative of form IA rbcL. Clone sequences from volcanic sites fell within the form IC clade, suggesting that these sites were dominated by facultative lithotrophs, an observation consistent with biogeochemical patterns at the sites. Based on phylogenetic and statistical analyses, clone libraries differed significantly among volcanic sites, indicating that they support distinct lithotrophic assemblages. Although some of the clone sequences were similar to known rbcL sequences, most were novel. Based on nucleotide diversity and average pairwise difference, a forested site and an 1894 lava flow were found to support the most diverse and least diverse lithotrophic populations, respectively. These indices of diversity were not correlated with rates of atmospheric CO and hydrogen uptake but were correlated with estimates of respiration and microbial biomass. PMID:15066819
Analysis of facultative lithotroph distribution and diversity on volcanic deposits by use of the large subunit of ribulose 1,5-bisphosphate carboxylase/oxygenase.

PubMed

Nanba, K; King, G M; Dunfield, K

2004-04-01

A 492- to 495-bp fragment of the gene coding for the large subunit of the form I ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) (rbcL) was amplified by PCR from facultatively lithotrophic aerobic CO-oxidizing bacteria, colorless and purple sulfide-oxidizing microbial mats, and genomic DNA extracts from tephra and ash deposits from Kilauea volcano, for which atmospheric CO and hydrogen have been previously documented as important substrates. PCR products from the mats and volcanic sites were used to construct rbcL clone libraries. Phylogenetic analyses showed that the rbcL sequences from all isolates clustered with form IC rbcL sequences derived from facultative lithotrophs. In contrast, the microbial mat clone sequences clustered with sequences from obligate lithotrophs representative of form IA rbcL. Clone sequences from volcanic sites fell within the form IC clade, suggesting that these sites were dominated by facultative lithotrophs, an observation consistent with biogeochemical patterns at the sites. Based on phylogenetic and statistical analyses, clone libraries differed significantly among volcanic sites, indicating that they support distinct lithotrophic assemblages. Although some of the clone sequences were similar to known rbcL sequences, most were novel. Based on nucleotide diversity and average pairwise difference, a forested site and an 1894 lava flow were found to support the most diverse and least diverse lithotrophic populations, respectively. These indices of diversity were not correlated with rates of atmospheric CO and hydrogen uptake but were correlated with estimates of respiration and microbial biomass.
Genetic diversity of tyrosine hydroxylase (TH) and dopamine β-hydroxylase (DBH) genes in cattle breeds

PubMed Central

Lourenco-Jaramillo, Diana Lelidett; Sifuentes-Rincón, Ana María; Parra-Bracamonte, Gaspar Manuel; de la Rosa-Reyna, Xochitl Fabiola; Segura-Cabrera, Aldo; Arellano-Vera, Williams

2012-01-01

DNA from four cattle breeds was used to re-sequence all of the exons and 56% of the introns of the bovine tyrosine hydroxylase (TH) gene and 97% and 13% of the bovine dopamine β-hydroxylase (DBH) coding and non-coding sequences, respectively. Two novel single nucleotide polymorphisms (SNPs) and a microsatellite motif were found in the TH sequences. The DBH sequences contained 62 nucleotide changes, including eight non-synonymous SNPs (nsSNPs) that are of particular interest because they may alter protein function and therefore affect the phenotype. These DBH nsSNPs resulted in amino acid substitutions that were predicted to destabilize the protein structure. Six SNPs (one from TH and five from DBH non-synonymous SNPs) were genotyped in 140 animals; all of them were polymorphic and had a minor allele frequency of > 9%. There were significant differences in the intra- and inter-population haplotype distributions. The haplotype differences between Brahman cattle and the three B. t. taurus breeds (Charolais, Holstein and Lidia) were interesting from a behavioural point of view because of the differences in temperament between these breeds. PMID:22888292
Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

PubMed

Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

2012-01-01

The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.
Molecular Properties of Poliovirus Isolates: Nucleotide Sequence Analysis, Typing by PCR and Real-Time RT-PCR.

PubMed

Burns, Cara C; Kilpatrick, David R; Iber, Jane C; Chen, Qi; Kew, Olen M

2016-01-01

Virologic surveillance is essential to the success of the World Health Organization initiative to eradicate poliomyelitis. Molecular methods have been used to detect polioviruses in tissue culture isolates derived from stool samples obtained through surveillance for acute flaccid paralysis. This chapter describes the use of real-time PCR assays to identify and serotype polioviruses. In particular, a degenerate, inosine-containing, panpoliovirus (panPV) PCR primer set is used to distinguish polioviruses from nonpolio enteroviruses (NPEVs). The high degree of nucleotide sequence diversity among polioviruses presents a challenge to the systematic design of nucleic acid-based reagents. To accommodate the wide variability and rapid evolution of poliovirus genomes, degenerate codon positions on the template were matched to mixed-base or deoxyinosine residues on both the primers and the TaqMan™ probes. Additional assays distinguish between Sabin vaccine strains and non-Sabin strains. This chapter also describes the use of generic poliovirus-specific primers, along with degenerate and inosine-containing primers, for routine VP1 sequencing of poliovirus isolates. These primers, along with nondegenerate serotype-specific Sabin primers, can also be used to sequence individual polioviruses in mixtures.
Mitochondrial DNA sequence variation and phylogeography of the scarlet kingsnake (Lampropeltis elapsoides).

PubMed

Friedman, Michael; Schaffer, Les

2011-02-01

BACKGROUND AND AIMS. With the goal of assessing population structure and geographic distribution of haplotype lineages among Lampropeltis elapsoides, we sequenced the ND4 mitochondrial DNA locus from 96 specimens of this snake across its area of distribution. MATERIALS AND METHODS. We relied heavily on formalin-fixed museum specimens to accomplish this analysis. RESULTS. The sequence alignment consisted of 491 bp of the selected gene, with 28% missing data. A simulation used to assess the effect of missing data on population genetic and phylogenetic resolution indicated increased character conflict, but with minimal loss of phylogenetic structure. CONCLUSION. This limited dataset suggests that L. elapsoides constitutes a largely unstructured population, with both widespread haplotypes and a large number of private haplotypes, a moderate level of nucleotide diversity, and a low, but significant, degree of north-south population differentiation. Haplotype structure and frequency, nucleotide frequency, and values for Tajima's D and Fu's F(S) indicate a recent range or population expansion following a historic bottleneck.
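For readers unfamiliar with Tajima's D, which this record uses as evidence of expansion, the sketch below computes it from an alignment using the standard textbook formula (Tajima 1989). It is an illustration only, under stated assumptions: the function name `tajimas_d` and both toy alignments are hypothetical, gaps and missing data (28% in this study) are not handled, and it is not the software the authors ran.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences; None if it is undefined."""
    n = len(seqs)
    # Segregating sites: alignment columns showing more than one state.
    seg_sites = sum(len(set(col)) > 1 for col in zip(*seqs))
    if n < 4 or seg_sites == 0:
        return None  # the variance term is zero for n < 4
    # Mean number of pairwise differences (per sequence pair, not per site).
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from the standard formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Hypothetical toy alignments, not data from the study.
singleton_rich = ["AAAA", "TAAA", "ATAA", "AATA"]  # every variant is seen once
balanced = ["AAA", "AAA", "TTT", "TTT"]            # variants at intermediate frequency
d_singletons = tajimas_d(singleton_rich)
d_balanced = tajimas_d(balanced)
```

An excess of rare variants, as in the first toy alignment, gives a negative D, a pattern often read as consistent with population expansion; variants at intermediate frequency, as in the second, give a positive D.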
A comprehensive bioinformatic analysis of hepatitis D virus full-length genomes.

PubMed

Delfino, C M; Cerrudo, C S; Biglione, M; Oubiña, J R; Ghiringhelli, P D; Mathet, V L

2018-02-06

In association with hepatitis B virus (HBV), hepatitis delta virus (HDV) is a subviral agent that may promote severe acute and chronic forms of liver disease. Based on the percentage of nucleotide identity of the genome, HDV was initially classified into three genotypes. However, since 2006, the original classification has been further expanded into eight clades/genotypes. The intergenotype divergence may be as high as 35%-40% over the entire RNA genome, whereas sequence heterogeneity among the isolates of a given genotype is <20%; furthermore, HDV recombinants have been clearly demonstrated. The genetic diversity of HDV is related to the geographic origin of the isolates. This study shows the first comprehensive bioinformatic analysis of the complete available set of HDV sequences, using both nucleotide and protein phylogenies (based on an evolutionary model selection, gamma distribution estimation, tree inference and phylogenetic distance estimation), protein composition analysis and comparison (based on the presence of invariant residues, molecular signatures, amino acid frequencies and mono- and di-amino acid compositional distances), as well as amino acid changes in sequence evolution. Taking into account the congruent and consistent results of both nucleotide and amino acid analyses of GenBank available sequences (recorded as of January, 2017), we propose that the eight hepatitis D virus genotypes may be grouped into three large genogroups fully supported by their shared characteristics. © 2018 John Wiley & Sons Ltd.
An outbreak of respiratory tularemia caused by diverse clones of Francisella tularensis.

PubMed

Johansson, Anders; Lärkeryd, Adrian; Widerström, Micael; Mörtberg, Sara; Myrtännäs, Kerstin; Ohrman, Caroline; Birdsell, Dawn; Keim, Paul; Wagner, David M; Forsman, Mats; Larsson, Pär

2014-12-01

The bacterium Francisella tularensis is recognized for its virulence, infectivity, genetic homogeneity, and potential as a bioterrorism agent. Outbreaks of respiratory tularemia, caused by inhalation of this bacterium, are poorly understood. Such outbreaks are exceedingly rare, and F. tularensis is seldom recovered from clinical specimens. A localized outbreak of tularemia in Sweden was investigated. Sixty-seven humans contracted laboratory-verified respiratory tularemia. F. tularensis subspecies holarctica was isolated from the blood or pleural fluid of 10 individuals from July to September 2010. Using whole-genome sequencing and analysis of single-nucleotide polymorphisms (SNPs), outbreak isolates were compared with 110 archived global isolates. There were 757 SNPs among the genomes of the 10 outbreak isolates and the 25 most closely related archival isolates (all from Sweden/Finland). Whole genomes of outbreak isolates were >99.9% similar at the nucleotide level and clustered into 3 distinct genetic clades. Unexpectedly, high-sequence similarity grouped some outbreak and archival isolates that originated from patients from different geographic regions and up to 10 years apart. Outbreak and archival genomes frequently differed by only 1-3 of 1 585 229 examined nucleotides. The outbreak was caused by diverse clones of F. tularensis that occurred concomitantly, were widespread, and apparently persisted in the environment. Multiple independent acquisitions of F. tularensis from the environment over a short time period suggest that natural outbreaks of respiratory tularemia are triggered by environmental cues. The findings additionally caution against interpreting genome sequence identity for this pathogen as proof of a direct epidemiological link. © The Author 2014. Published by Oxford University Press on behalf of the Infectious Diseases Society of America. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Computed Energetics of Nucleotides in Spatial Ribozyme Structures: An Accurate Identification of Functional Regions from Structure

PubMed Central

Torshin, Ivan Y.

2004-01-01

Ribozymes are functionally diverse RNA molecules with intrinsic catalytic activity. Multiple structural and biochemical studies are required to establish which nucleotide bases are involved in the catalysis. The relative energetic properties of the nucleotide bases have been analyzed in a set of the known ribozyme structures. It was found that many of the known catalytic nucleotides can be identified using only the structure without any additional biochemical data. The results of the calculations compare well with the available biochemical data on RNA stability. Extensive in silico mutagenesis suggests that most of the nucleotides in ribozymes stabilize the RNA. The calculations show that relative contribution of the catalytic bases to RNA stability observably differs from contributions of the noncatalytic bases. Distinction between the concepts of “relative stability” and “mutational stability” is suggested. As results of prediction for several models of ribozymes appear to be in agreement with the published data on the potential active site regions, the method can potentially be used for prediction of functional nucleotides from nucleic sequence. PMID:15105962
Mitochondrial control-region sequence variation in aboriginal Australians.

PubMed Central

van Holst Pellekaan, S; Frommer, M; Sved, J; Boettcher, B

1998-01-01

The mitochondrial D-loop hypervariable segment 1 (mt HVS1) between nucleotides 15997 and 16377 has been examined in aboriginal Australian people from the Darling River region of New South Wales (riverine) and from Yuendumu in central Australia (desert). Forty-seven unique HVS1 types were identified, varying at 49 nucleotide positions. Pairwise analysis by calculation of BEPPI (between population proportion index) reveals statistically significant structure in the populations, although some identical HVS1 types are seen in the two contrasting regions. mt HVS1 types may reflect more-ancient distributions than do linguistic diversity and other culturally distinguishing attributes. Comparison with sequences from five published global studies reveals that these Australians demonstrate greatest divergence from some Africans, least from Papua New Guinea highlanders, and only slightly more from some Pacific groups (Indonesian, Asian, Samoan, and coastal Papua New Guinea), although the HVS1 types vary at different nucleotide sites. Construction of a median network, displaying three main groups, suggests that several hypervariable nucleotide sites within the HVS1 are likely to have undergone mutation independently, making phylogenetic comparison with global samples by conventional methods difficult. Specific nucleotide-site variants are major separators in median networks constructed from Australian HVS1 types alone and for one global selection. The distribution of these, requiring extended study, suggests that they may be signatures of different groups of prehistoric colonizers into Australia, for which the time of colonization remains elusive. PMID:9463317
Shallow Population Genetic Structures of Thread-sail Filefish (Stephanolepis cirrhifer) Populations from Korean Coastal Waters.

PubMed

Yoon, M; Park, W; Nam, Y K; Kim, D S

2012-02-01

Genetic diversities, population genetic structures and demographic histories of the thread-sail filefish Stephanolepis cirrhifer were investigated by nucleotide sequencing of 336 base pairs of the mitochondrial DNA (mtDNA) control region in 111 individuals collected from six populations in Korean coastal waters. A total of 70 haplotypes were defined by 58 variable nucleotide sites. The neighbor-joining tree of the 70 haplotypes was shallow and did not provide evidence of geographical associations. Expansion of S. cirrhifer populations began approximately 51,000 to 102,000 years before present, correlating with the period of sea level rise since the late Pleistocene glacial maximum. High levels of haplotype diversities (0.974±0.029 to 1.000±0.076) and nucleotide diversities (0.014 to 0.019), and low levels of genetic differentiation among populations inferred from pairwise population FST values (-0.007 to 0.107), support an expansion of the S. cirrhifer population. Hierarchical analysis of molecular variance (AMOVA) revealed weak but significant genetic structures among three groups (FCT = 0.028, p<0.05), and no genetic variation within groups (0.53%; FSC = 0.005, p = 0.23). These results may help establish appropriate fishery management strategies for stocks of S. cirrhifer and related species.
Shallow Population Genetic Structures of Thread-sail Filefish (Stephanolepis cirrhifer) Populations from Korean Coastal Waters

PubMed Central

Yoon, M.; Park, W.; Nam, Y. K.; Kim, D. S.

2012-01-01

Genetic diversities, population genetic structures and demographic histories of the thread-sail filefish Stephanolepis cirrhifer were investigated by nucleotide sequencing of 336 base pairs of the mitochondrial DNA (mtDNA) control region in 111 individuals collected from six populations in Korean coastal waters. A total of 70 haplotypes were defined by 58 variable nucleotide sites. The neighbor-joining tree of the 70 haplotypes was shallow and did not provide evidence of geographical associations. Expansion of S. cirrhifer populations began approximately 51,000 to 102,000 years before present, correlating with the period of sea level rise since the late Pleistocene glacial maximum. High levels of haplotype diversities (0.974±0.029 to 1.000±0.076) and nucleotide diversities (0.014 to 0.019), and low levels of genetic differentiation among populations inferred from pairwise population FST values (−0.007 to 0.107), support an expansion of the S. cirrhifer population. Hierarchical analysis of molecular variance (AMOVA) revealed weak but significant genetic structures among three groups (FCT = 0.028, p<0.05), and no genetic variation within groups (0.53%; FSC = 0.005, p = 0.23). These results may help establish appropriate fishery management strategies for stocks of S. cirrhifer and related species. PMID:25049547
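The haplotype diversities quoted in the record above (values close to 1) are commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The Python sketch below shows that estimator on hypothetical haplotype labels. It is an assumption that the study used this exact estimator, and the function name is invented for illustration.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype (gene) diversity for a list of haplotype labels.

    Hypothetical helper; each list element is the haplotype of one individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (probability two draws match).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)
```

When every sampled individual carries a distinct haplotype the estimator reaches 1, and when all individuals share one haplotype it is 0, which is why samples with many private haplotypes score near the top of the range.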
Analysis of nucleotide diversity among alleles of the major bacterial blight resistance gene Xa27 in cultivars of rice (Oryza sativa) and its wild relatives.

PubMed

Bimolata, Waikhom; Kumar, Anirudh; Sundaram, Raman Meenakshi; Laha, Gouri Shankar; Qureshi, Insaf Ahmed; Reddy, Gajjala Ashok; Ghazi, Irfan Ahmad

2013-08-01

Xa27 is one of the important R-genes, effective against bacterial blight disease of rice caused by Xanthomonas oryzae pv. oryzae (Xoo). Using natural population of Oryza, we analyzed the sequence variation in the functionally important domains of Xa27 across the Oryza species. DNA sequences of Xa27 alleles from 27 rice accessions revealed higher nucleotide diversity among the reported R-genes of rice. Sequence polymorphism analysis revealed synonymous and non-synonymous mutations in addition to a number of InDels in non-coding regions of the gene. High sequence variation was observed in the promoter region including the 5'UTR with 'π' value 0.00916 and 'θw' = 0.01785. Comparative analysis of the identified Xa27 alleles with that of IRBB27 and IR24 indicated the operation of both positive selection (Ka/Ks > 1) and neutral selection (Ka/Ks ≈ 0). The genetic distances of alleles of the gene from Oryza nivara were nearer to IRBB27 as compared to IR24. We also found the presence of conserved and null UPT (upregulated by transcriptional activator) box in the isolated alleles. Considerable amino acid polymorphism was localized in the trans-membrane domain for which the functional significance is yet to be elucidated. However, the absence of functional UPT box in all the alleles except IRBB27 suggests the maintenance of single resistant allele throughout the natural population.
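The π and θw values reported in the record above are the two standard per-site diversity estimators: π is the mean proportion of pairwise differences, and Watterson's θw is the number of segregating sites divided by the harmonic number a1 and the alignment length. The Python sketch below shows both on a toy alignment. The function name and the assumption of gap-free, equal-length sequences are illustrative choices, not details taken from the study.

```python
from itertools import combinations


def pi_and_theta_w(seqs):
    """Return (pi, theta_w) per site for aligned, equal-length sequences.

    Hypothetical helper; gaps and ambiguous bases are not handled.
    """
    n = len(seqs)
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    # Total nucleotide differences summed over all sequence pairs.
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    pi = diffs / len(pairs) / length
    # Segregating sites: alignment columns with more than one state.
    s = sum(1 for i in range(length) if len({q[i] for q in seqs}) > 1)
    a1 = sum(1 / i for i in range(1, n))
    theta_w = s / a1 / length
    return pi, theta_w
```

Under neutrality and constant population size the two estimators are expected to agree, so a gap between them is what statistics such as Tajima's D quantify.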
Isolation of a novel Orientia species (O. chuto sp. nov.) from a patient infected in Dubai.

PubMed

Izzard, Leonard; Fuller, Andrew; Blacksell, Stuart D; Paris, Daniel H; Richards, Allen L; Aukkanit, Nuntipa; Nguyen, Chelsea; Jiang, Ju; Fenwick, Stan; Day, Nicholas P J; Graves, Stephen; Stenos, John

2010-12-01

In July 2006, an Australian tourist returning from Dubai, in the United Arab Emirates (UAE), developed acute scrub typhus. Her signs and symptoms included fever, myalgia, headache, rash, and eschar. Orientia tsutsugamushi serology demonstrated a 4-fold rise in antibody titers in paired serum collections (1:512 to 1:8,192), with the sera reacting strongest against the Gilliam strain antigen. An Orientia species was isolated by the in vitro culture of the patient's acute blood taken prior to antibiotic treatment. The gene sequencing of the 16S rRNA gene (rrs), partial 56-kDa gene, and the full open reading frame 47-kDa gene was performed, and comparisons of this new Orientia sp. isolate to previously characterized strains demonstrated significant sequence diversity. The closest homology to the rrs sequence of the new Orientia sp. isolate was with three strains of O. tsutsugamushi (Ikeda, Kato, and Karp), with a nucleotide sequence similarity of 98.5%. The closest homology to the 47-kDa gene sequence was with O. tsutsugamushi strain Gilliam, with a nucleotide similarity of 82.3%, while the closest homology to the 56-kDa gene sequence was with O. tsutsugamushi strain TA686, with a nucleotide similarity of 53.1%. The molecular divergence and geographically unique origin lead us to believe that this organism should be considered a novel species. Therefore, we have proposed the name "Orientia chuto," and the prototype strain of this species is strain Dubai, named after the location in which the patient was infected.
Isolation of a Novel Orientia Species (O. chuto sp. nov.) from a Patient Infected in Dubai

PubMed Central

Izzard, Leonard; Fuller, Andrew; Blacksell, Stuart D.; Paris, Daniel H.; Richards, Allen L.; Aukkanit, Nuntipa; Nguyen, Chelsea; Jiang, Ju; Fenwick, Stan; Day, Nicholas P. J.; Graves, Stephen; Stenos, John

2010-01-01

In July 2006, an Australian tourist returning from Dubai, in the United Arab Emirates (UAE), developed acute scrub typhus. Her signs and symptoms included fever, myalgia, headache, rash, and eschar. Orientia tsutsugamushi serology demonstrated a 4-fold rise in antibody titers in paired serum collections (1:512 to 1:8,192), with the sera reacting strongest against the Gilliam strain antigen. An Orientia species was isolated by the in vitro culture of the patient's acute blood taken prior to antibiotic treatment. The gene sequencing of the 16S rRNA gene (rrs), partial 56-kDa gene, and the full open reading frame 47-kDa gene was performed, and comparisons of this new Orientia sp. isolate to previously characterized strains demonstrated significant sequence diversity. The closest homology to the rrs sequence of the new Orientia sp. isolate was with three strains of O. tsutsugamushi (Ikeda, Kato, and Karp), with a nucleotide sequence similarity of 98.5%. The closest homology to the 47-kDa gene sequence was with O. tsutsugamushi strain Gilliam, with a nucleotide similarity of 82.3%, while the closest homology to the 56-kDa gene sequence was with O. tsutsugamushi strain TA686, with a nucleotide similarity of 53.1%. The molecular divergence and geographically unique origin lead us to believe that this organism should be considered a novel species. Therefore, we have proposed the name “Orientia chuto,” and the prototype strain of this species is strain Dubai, named after the location in which the patient was infected. PMID:20926708
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

PubMed

Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

2015-07-01

The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Landscape of Insertion Polymorphisms in the Human Genome

PubMed Central

Onozawa, Masahiro; Goldberg, Liat; Aplan, Peter D.

2015-01-01

Nucleotide substitutions, small (<50 bp) insertions or deletions (indels), and large (>50 bp) deletions are well-known causes of genetic variation within the human genome. We recently reported a previously unrecognized form of polymorphic insertions, termed templated sequence insertion polymorphism (TSIP), in which the inserted sequence was templated from a distant genomic region, and was inserted in the genome through reverse transcription of an RNA intermediate. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; class 1 TSIPs show target site duplication, polyadenylation, and preference for insertion at a 5′-TTTT/A-3′ sequence, suggesting a LINE-1 based insertion mechanism, whereas class 2 TSIPs show features consistent with repair of a DNA double strand break by nonhomologous end joining. To gain a more complete picture of TSIPs throughout the human population, we evaluated whole-genome sequence from 52 individuals, and identified 171 TSIPs. Most individuals had 25–30 TSIPs, and common (present in >20% of individuals) TSIPs were found in individuals throughout the world, whereas rare TSIPs tended to cluster in specific geographic regions. The number of rare TSIPs was greater than the number of common TSIPs, suggesting that TSIP generation is an ongoing process. Intriguingly, mitochondrial sequences were a frequent template for class 2 insertions, used more commonly than any nuclear chromosome. Similar to single nucleotide polymorphisms and indels, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases, and can be useful in tracking historical migration of populations. PMID:25745018
Salmonella enterica Prophage Sequence Profiles Reflect Genome Diversity and Can Be Used for High Discrimination Subtyping.

PubMed

Mottawea, Walid; Duceppe, Marc-Olivier; Dupras, Andrée A; Usongo, Valentine; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Hamel, Jeremie; Kukavica-Ibrulj, Irena; Boyle, Brian; Gill, Alexander; Burnett, Elton; Franz, Eelco; Arya, Gitanjali; Weadge, Joel T; Gruenheid, Samantha; Wiedmann, Martin; Huang, Hongsheng; Daigle, France; Moineau, Sylvain; Bekal, Sadjia; Levesque, Roger C; Goodridge, Lawrence D; Ogunremi, Dele

2018-01-01

Non-typhoidal Salmonella is a leading cause of foodborne illness worldwide. Prompt and accurate identification of the sources of Salmonella responsible for disease outbreaks is crucial to minimize infections and eliminate ongoing sources of contamination. Current subtyping tools including single nucleotide polymorphism (SNP) typing may be inadequate, in some instances, to provide the required discrimination among epidemiologically unrelated Salmonella strains. Prophage genes represent the majority of the accessory genes in bacterial genomes and have potential to be used as high discrimination markers in Salmonella. In this study, the prophage sequence diversity in different Salmonella serovars and genetically related strains was investigated. Using whole genome sequences of 1,760 isolates of S. enterica representing 151 Salmonella serovars and 66 closely related bacteria, prophage sequences were identified from assembled contigs using PHASTER. We detected 154 different prophages in S. enterica genomes. Prophage sequences were highly variable among S. enterica serovars with a median ± interquartile range (IQR) of 5 ± 3 prophage regions per genome. While some prophage sequences were highly conserved among the strains of specific serovars, few regions were lineage specific. Therefore, strains belonging to each serovar could be clustered separately based on their prophage content. Analysis of S. Enteritidis isolates from seven outbreaks generated distinct prophage profiles for each outbreak. Taken altogether, the diversity of the prophage sequences correlates with genome diversity. Prophage repertoires provide an additional marker for differentiating S. enterica subtypes during foodborne outbreaks.
Similar genomic proportions of copy number variation within gray wolves and modern dog breeds inferred from whole genome sequencing.

PubMed

Serres-Armero, Aitor; Povolotskaya, Inna S; Quilez, Javier; Ramirez, Oscar; Santpere, Gabriel; Kuderna, Lukas F K; Hernandez-Rodriguez, Jessica; Fernandez-Callejo, Marcos; Gomez-Sanchez, Daniel; Freedman, Adam H; Fan, Zhenxin; Novembre, John; Navarro, Arcadi; Boyko, Adam; Wayne, Robert; Vilà, Carles; Lorente-Galdos, Belen; Marques-Bonet, Tomas

2017-12-19

Whole genome re-sequencing data from dogs and wolves are now commonly used to study how natural and artificial selection have shaped the patterns of genetic diversity. Single nucleotide polymorphisms, microsatellites and variants in mitochondrial DNA have been interrogated for links to specific phenotypes or signals of domestication. However, copy number variation (CNV), despite its increasingly recognized importance as a contributor to phenotypic diversity, has not been extensively explored in canids. Here, we develop a new accurate probabilistic framework to create fine-scale genomic maps of segmental duplications (SDs), compare patterns of CNV across groups and investigate their role in the evolution of the domestic dog by using information from 34 canine genomes. Our analyses show that duplicated regions are enriched in genes and hence likely possess functional importance. We identify 86 loci with large CNV differences between dogs and wolves, enriched in genes responsible for sensory perception, immune response, metabolic processes, etc. In striking contrast to the observed loss of nucleotide diversity in domestic dogs following the population bottlenecks that occurred during domestication and breed creation, we find a similar proportion of CNV loci in dogs and wolves, suggesting that other dynamics are acting to particularly select for CNVs with potentially functional impacts. This work is the first comparison of genome wide CNV patterns in domestic and wild canids using whole-genome sequencing data and our findings contribute to the study of the impact of novel kinds of genetic changes on the evolution of the domestic dog.
Comparative sequence analyses of sixteen reptilian paramyxoviruses

USGS Publications Warehouse

Ahne, W.; Batts, W.N.; Kurath, G.; Winton, J.R.

1999-01-01

Viral genomic RNA of Fer-de-Lance virus (FDLV), a paramyxovirus highly pathogenic for reptiles, was reverse transcribed and cloned. Plasmids with significant sequence similarities to the hemagglutinin-neuraminidase (HN) and polymerase (L) genes of mammalian paramyxoviruses were identified by BLAST search. Partial sequences of the FDLV genes were used to design primers for amplification by nested polymerase chain reaction (PCR) and sequencing of 518-bp L gene and 352-bp HN gene fragments from a collection of 15 previously uncharacterized reptilian paramyxoviruses. Phylogenetic analyses of the partial L and HN sequences produced similar trees in which there were two distinct subgroups of isolates that were supported with maximum bootstrap values, and several intermediate isolates. Within each subgroup the nucleotide divergence values were less than 2.5%, while the divergence between the two subgroups was 20-22%. This indicated that the two subgroups represent distinct virus species containing multiple virus strains. The five intermediate isolates had nucleotide divergence values of 11-20% and may represent additional distinct species. In addition to establishing diversity among reptilian paramyxoviruses, the phylogenetic groupings showed some correlation with geographic location, and clearly demonstrated a low level of host species-specificity within these viruses. Copyright (C) 1999 Elsevier Science B.V.
Oligonucleotide fingerprinting of rRNA genes for analysis of fungal community composition.

PubMed

Valinsky, Lea; Della Vedova, Gianluca; Jiang, Tao; Borneman, James

2002-12-01

Thorough assessments of fungal diversity are currently hindered by technological limitations. Here we describe a new method for identifying fungi, oligonucleotide fingerprinting of rRNA genes (OFRG). OFRG sorts arrayed rRNA gene (ribosomal DNA [rDNA]) clones into taxonomic clusters through a series of hybridization experiments, each using a single oligonucleotide probe. A simulated annealing algorithm was used to design an OFRG probe set for fungal rDNA. Analysis of 1,536 fungal rDNA clones derived from soil generated 455 clusters. A pairwise sequence analysis showed that clones with average sequence identities of 99.2% were grouped into the same cluster. To examine the accuracy of the taxonomic identities produced by this OFRG experiment, we determined the nucleotide sequences for 117 clones distributed throughout the tree. For all but two of these clones, the taxonomic identities generated by this OFRG experiment were consistent with those generated by a nucleotide sequence analysis. Eighty-eight percent of the clones were affiliated with Ascomycota, while 12% belonged to Basidiomycota. A large fraction of the clones were affiliated with the genera Fusarium (404 clones) and Raciborskiomyces (176 clones). Smaller assemblages of clones had high sequence identities to the Alternaria, Ascobolus, Chaetomium, Cryptococcus, and Rhizoctonia clades.
Genome-wide-analyses of Listeria monocytogenes from food-processing plants reveal clonal diversity and date the emergence of persisting sequence types.

PubMed

Knudsen, Gitte M; Nielsen, Jesper Boye; Marvig, Rasmus L; Ng, Yin; Worning, Peder; Westh, Henrik; Gram, Lone

2017-08-01

Whole genome sequencing is increasingly used in epidemiology, e.g. for tracing outbreaks of food-borne diseases. This requires in-depth understanding of pathogen emergence, persistence and genomic diversity along the food production chain including in food processing plants. We sequenced the genomes of 80 isolates of Listeria monocytogenes sampled from Danish food processing plants over a time-period of 20 years, and analysed the sequences together with 10 publicly available reference genomes to advance our understanding of interplant and intraplant genomic diversity of L. monocytogenes. Except for three persisting sequence types (ST) based on Multi Locus Sequence Typing being ST7, ST8 and ST121, long-term persistence of clonal groups was limited, and new clones were introduced continuously, potentially from raw materials. No particular gene could be linked to the persistence phenotype. Using time-based phylogenetic analyses of the persistent STs, we estimate the L. monocytogenes evolutionary rate to be 0.18-0.35 single nucleotide polymorphisms/year, suggesting that the persistent STs emerged approximately 100 years ago, which correlates with the onset of industrialization and globalization of the food market. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2011 CFR

2011-07-01

... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...
Evolutionary dynamics and genetic diversity from three genes of Anguillid rhabdovirus.

PubMed

Bellec, Laure; Cabon, Joelle; Bergmann, Sven; de Boisséson, Claire; Engelsma, Marc; Haenen, Olga; Morin, Thierry; Olesen, Niels Jørgen; Schuetze, Heike; Toffan, Anna; Way, Keith; Bigarré, Laurent

2014-11-01

Wild freshwater eel populations have dramatically declined in recent decades in Europe and America, partially through the impact of several factors including the wide spread of infectious diseases. The anguillid rhabdoviruses eel virus European X (EVEX) and eel virus American (EVA) potentially play a role in this decline, even if their real contribution is still unclear. In this study, we investigate the evolutionary dynamics and genetic diversity of anguillid rhabdoviruses by analysing sequences from the glycoprotein, nucleoprotein and phosphoprotein (P) genes of 57 viral strains collected from seven countries over 40 years using maximum-likelihood and Bayesian approaches. Phylogenetic trees from the three genes are congruent and allow two monophyletic groups, European and American, to be clearly distinguished. Results of nucleotide substitution rates per site per year indicate that the P gene is expected to evolve most rapidly. The nucleotide diversity observed is low (2-3 %) for the three genes, with a significantly higher variability within the P gene, which encodes multiple proteins from a single genomic RNA sequence, particularly a small C protein. This putative C protein is a potential molecular marker suitable for characterization of distinct genotypes within anguillid rhabdoviruses. This study provides, to our knowledge, the first molecular characterization of EVA, brings new insights to the evolutionary dynamics of two genotypes of Anguillid rhabdovirus, and is a baseline for further investigations on the tracking of its spread.
Molecular Comparison and Evolutionary Analyses of VP1 Nucleotide Sequences of New African Human Enterovirus 71 Isolates Reveal a Wide Genetic Diversity

PubMed Central

Nougairède, Antoine; Joffret, Marie-Line; Deshpande, Jagadish M.; Dubot-Pérès, Audrey; Héraud, Jean-Michel

2014-01-01

Most circulating strains of Human enterovirus 71 (EV-A71) have been classified primarily into three genogroups (A to C) on the basis of genetic divergence between the 1D gene, which encodes the VP1 capsid protein. The aim of the present study was to provide further insights into the diversity of the EV-A71 genogroups following the recent description of highly divergent isolates, in particular those from African countries, including Madagascar. We classified recent EV-A71 isolates by a large comparison of 3,346 VP1 nucleotide sequences collected from GenBank. Analysis of genetic distances and phylogenetic investigations indicated that some recently-reported isolates did not fall into the genogroups A-C and clustered into three additional genogroups, including one Indian genogroup (genogroup D) and 2 African ones (E and F). Our Bayesian phylogenetic analysis provided consistent data showing that the genogroup D isolates share a recent common ancestor with the members of genogroup E, while the isolates of genogroup F evolved from a recent common ancestor shared with the members of the genogroup B. Our results reveal the wide diversity that exists among EV-A71 isolates and suggest that the number of circulating genogroups is probably underestimated, particularly in developing countries where EV-A71 epidemiology has been poorly studied. PMID:24598878
Assessment of the Geographic Origins of Pinewood Nematode Isolates via Single Nucleotide Polymorphism in Effector Genes

PubMed Central

Figueiredo, Joana; Simões, Maria José; Gomes, Paula; Barroso, Cristina; Pinho, Diogo; Conceição, Luci; Fonseca, Luís; Abrantes, Isabel; Pinheiro, Miguel; Egas, Conceição

2013-01-01

The pinewood nematode, Bursaphelenchus xylophilus, is native to North America but it only causes damaging pine wilt disease in those regions of the world where it has been introduced. The accurate detection of the species and its dispersal routes are thus essential to define effective control measures. The main goals of this study were to analyse the genetic diversity among B. xylophilus isolates from different geographic locations and identify single nucleotide polymorphism (SNP) markers for geographic origin, through a comparative transcriptomic approach. The transcriptomes of seven B. xylophilus isolates, from Continental Portugal (4), China (1), Japan (1) and USA (1), were sequenced in the next generation platform Roche 454. Analysis of effector gene transcripts revealed inter-isolate nucleotide diversity that was validated by Sanger sequencing in the genomic DNA of the seven isolates and eight additional isolates from different geographic locations: Madeira Island (2), China (1), USA (1), Japan (2) and South Korea (2). The analysis identified 136 polymorphic positions in 10 effector transcripts. Pairwise comparison of the 136 SNPs through Neighbor-Joining and the Maximum Likelihood methods and 5-mer frequency analysis with the alignment-independent bilinear multivariate modelling approach correlated the SNPs with the isolates' geographic origin. Furthermore, the SNP analysis indicated a closer proximity of the Portuguese isolates to the Korean and Chinese isolates than to the Japanese or American isolates. Each geographic cluster carried exclusive alleles that can be used as SNP markers for B. xylophilus isolate identification. PMID:24391785
Genetic analysis of paramyxovirus isolates from pacific salmon reveals two independently co-circulating lineages

USGS Publications Warehouse

Batts, W.N.; Falk, K.; Winton, J.R.

2008-01-01

Viruses with the morphological and biochemical characteristics of the family Paramyxoviridae (paramyxoviruses) have been isolated from adult salmon returning to rivers along the Pacific coast of North America since 1982. These Pacific salmon paramyxoviruses (PSPV), which have mainly been isolated from Chinook salmon Oncorhynchus tshawytscha, grow slowly in established fish cell lines and have not been associated with disease. Genetic analysis of a 505-base-pair region of the polymerase gene from 47 PSPV isolates produced 17 nucleotide sequence types that could be grouped into two major sublineages, designated A and B. The two independently co-circulating sublineages differed by 12.1-13.9% at the nucleotide level but by only 1.2% at the amino acid level. Isolates of PSPV from adult Pacific salmon returning to rivers from Alaska to California over a 25-year period showed little evidence of geographic or temporal grouping. Phylogenetic analyses revealed that these paramyxoviruses of Pacific salmon were most closely related to the Atlantic salmon paramyxovirus (ASPV) from Norway, having a maximum nucleotide diversity of 26.1% and an amino acid diversity of 19.0%. When compared with homologous sequences of other paramyxoviruses, PSPV and ASPV were sufficiently distinct to suggest that they are not clearly members of any of the established genera in the family Paramyxoviridae. In the course of this study, a polymerase chain reaction assay was developed that can be used for confirmatory identification of PSPV. © Copyright by the American Fisheries Society 2008.
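The gap between nucleotide-level and amino-acid-level divergence reported here is what one expects when most substitutions are synonymous. A minimal sketch of the comparison, assuming two pre-aligned toy sequences and a deliberately partial codon table (both hypothetical, not data from the study):

```python
def percent_difference(a, b):
    """Percent of compared positions that differ between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * diffs / compared


# Partial codon table covering only the codons used below (illustration only).
TOY_CODONS = {"ATG": "M", "GCT": "A", "GCC": "A", "AAG": "K", "AAA": "K", "CTT": "L"}


def translate(seq):
    """Translate an in-frame sequence using the toy codon table."""
    return "".join(TOY_CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


nt_a = "ATGGCTAAGCTT"
nt_b = "ATGGCCAAACTT"  # two third-position (synonymous) changes

print(percent_difference(nt_a, nt_b))                        # nucleotide level
print(percent_difference(translate(nt_a), translate(nt_b)))  # amino acid level
```

In this toy pair both substitutions fall in third codon positions, so the nucleotide difference is non-zero while the translated proteins are identical.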
Sequence Analysis of IncA/C and IncI1 Plasmids Isolated from Multidrug-Resistant Salmonella Newport Using Single-Molecule Real-Time Sequencing.

PubMed

Cao, Guojie; Allard, Marc; Hoffmann, Maria; Muruvanda, Tim; Luo, Yan; Payne, Justin; Meng, Kevin; Zhao, Shaohua; McDermott, Patrick; Brown, Eric; Meng, Jianghong

2018-06-01

Multidrug-resistant (MDR) plasmids play an important role in disseminating antimicrobial resistance genes. To elucidate the antimicrobial resistance gene compositions in A/C incompatibility complex (IncA/C) plasmids carried by animal-derived MDR Salmonella Newport, and to investigate the spread mechanism of IncA/C plasmids, this study characterizes the complete nucleotide sequences of IncA/C plasmids by comparative analysis. Complete nucleotide sequencing of plasmids and chromosomes of six MDR Salmonella Newport strains was performed using PacBio RSII. Open reading frames were assigned using prokaryotic genome annotation pipeline (PGAP). To understand genomic diversity and evolutionary relationships among Salmonella Newport IncA/C plasmids, we included three complete IncA/C plasmid sequences with similar backbones from Salmonella Newport and Escherichia coli: pSN254, pAM04528, and peH4H, and additional 200 draft chromosomes. With the exception of canine isolate CVM22462, which contained an additional IncI1 plasmid, each of the six MDR Salmonella Newport strains contained only the IncA/C plasmid. These IncA/C plasmids (including references) ranged in size from 80.1 (pCVM21538) to 176.5 kb (pSN254) and carried various resistance genes. Resistance genes floR, tetA, tetR, strA, strB, sul, and mer were identified in all IncA/C plasmids. Additionally, bla CMY-2 and sugE were present in all IncA/C plasmids, excepting pCVM21538. Plasmid pCVM22462 was capable of being transferred by conjugation. The IncI1 plasmid pCVM22462b in CVM22462 carried bla CMY-2 and sugE. Our data showed that MDR Salmonella Newport strains carrying similar IncA/C plasmids clustered together in the phylogenetic tree using chromosome sequences and the IncA/C plasmids from animal-derived Salmonella Newport contained diverse resistance genes. 
In the current study, we analyzed genomic diversity and phylogenetic relationships among MDR Salmonella Newport using complete plasmid and chromosome sequences and provided a possible spread mechanism of IncA/C plasmids in Salmonella Newport Lineage II.

Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21.

PubMed

Patil, N; Berno, A J; Hinds, D A; Barrett, W A; Doshi, J M; Hacker, C R; Kautzer, C R; Lee, D H; Marjoribanks, C; McDonough, D P; Nguyen, B T; Norris, M C; Sheehan, J B; Shen, N; Stern, D; Stokowski, R P; Thomas, D J; Trulson, M O; Vyas, K R; Frazer, K A; Fodor, S P; Cox, D R

2001-11-23

Global patterns of human DNA sequence variation (haplotypes) defined by common single nucleotide polymorphisms (SNPs) have important implications for identifying disease associations and human traits. We have used high-density oligonucleotide arrays, in combination with somatic cell genetics, to identify a large fraction of all common human chromosome 21 SNPs and to directly observe the haplotype structure defined by these SNPs. This structure reveals blocks of limited haplotype diversity in which more than 80% of a global human sample can typically be characterized by only three common haplotypes.
The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level.

PubMed

Rodriguez-R, Luis M; Gunturu, Santosh; Harvey, William T; Rosselló-Mora, Ramon; Tiedje, James M; Cole, James R; Konstantinidis, Konstantinos T

2018-06-14

The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.
Molecular analysis of carbon monoxide-oxidizing bacteria associated with recent Hawaiian volcanic deposits.

PubMed

Dunfield, Kari E; King, Gary M

2004-07-01

Genomic DNA extracts from four sites at Kilauea Volcano were used as templates for PCR amplification of the large subunit (coxL) of aerobic carbon monoxide dehydrogenase. The sites included a 42-year-old tephra deposit, a 108-year-old lava flow, a 212-year-old partially vegetated ash-and-tephra deposit, and an approximately 300-year-old forest. PCR primers amplified coxL sequences from the OMP clade of CO oxidizers, which includes isolates such as Oligotropha carboxidovorans, Mycobacterium tuberculosis, and Pseudomonas thermocarboxydovorans. PCR products were used to create clone libraries that provide the first insights into the diversity and phylogenetic affiliations of CO oxidizers in situ. On the basis of phylogenetic and statistical analyses, clone libraries for each site were distinct. Although some clone sequences were similar to coxL sequences from known organisms, many sequences appeared to represent phylogenetic lineages not previously known to harbor CO oxidizers. On the basis of average nucleotide diversity and average pairwise difference, a forested site supported the most diverse CO-oxidizing populations, while an 1894 lava flow supported the least diverse populations. Neither parameter correlated with previous estimates of atmospheric CO uptake rates, but both parameters correlated positively with estimates of microbial biomass and respiration. Collectively, the results indicate that the CO oxidizer functional group associated with recent volcanic deposits of the remote Hawaiian Islands contains substantial and previously unsuspected diversity.
Molecular Analysis of Carbon Monoxide-Oxidizing Bacteria Associated with Recent Hawaiian Volcanic Deposits†

PubMed Central

Dunfield, Kari E.; King, Gary M.

2004-01-01

Genomic DNA extracts from four sites at Kilauea Volcano were used as templates for PCR amplification of the large subunit (coxL) of aerobic carbon monoxide dehydrogenase. The sites included a 42-year-old tephra deposit, a 108-year-old lava flow, a 212-year-old partially vegetated ash-and-tephra deposit, and an approximately 300-year-old forest. PCR primers amplified coxL sequences from the OMP clade of CO oxidizers, which includes isolates such as Oligotropha carboxidovorans, Mycobacterium tuberculosis, and Pseudomonas thermocarboxydovorans. PCR products were used to create clone libraries that provide the first insights into the diversity and phylogenetic affiliations of CO oxidizers in situ. On the basis of phylogenetic and statistical analyses, clone libraries for each site were distinct. Although some clone sequences were similar to coxL sequences from known organisms, many sequences appeared to represent phylogenetic lineages not previously known to harbor CO oxidizers. On the basis of average nucleotide diversity and average pairwise difference, a forested site supported the most diverse CO-oxidizing populations, while an 1894 lava flow supported the least diverse populations. Neither parameter correlated with previous estimates of atmospheric CO uptake rates, but both parameters correlated positively with estimates of microbial biomass and respiration. Collectively, the results indicate that the CO oxidizer functional group associated with recent volcanic deposits of the remote Hawaiian Islands contains substantial and previously unsuspected diversity. PMID:15240307
AmericaPlex26: A SNaPshot Multiplex System for Genotyping the Main Human Mitochondrial Founder Lineages of the Americas

PubMed Central

Coutinho, Alexandra; Valverde, Guido; Fehren-Schmitz, Lars; Cooper, Alan; Barreto Romero, Maria Inés; Espinoza, Isabel Flores; Llamas, Bastien; Haak, Wolfgang

2014-01-01

Phylogeographic studies have described a reduced genetic diversity in Native American populations, indicative of one or more bottleneck events during the peopling and prehistory of the Americas. Classical sequencing approaches targeting the mitochondrial diversity have reported the presence of five major haplogroups, namely A, B, C, D and X, whereas the advent of complete mitochondrial genome sequencing has recently refined the number of founder lineages within the given diversity to 15 sub-haplogroups. We developed and optimized a SNaPshot assay to study the mitochondrial diversity in pre-Columbian Native American populations by simultaneous typing of 26 single nucleotide polymorphisms (SNPs) characterising Native American sub-haplogroups. Our assay proved to be highly sensitive with respect to starting concentrations of target DNA and could be applied successfully to a range of ancient human skeletal material from South America from various time periods. The AmericaPlex26 is a powerful assay with enhanced phylogenetic resolution that allows time- and cost-efficient mitochondrial DNA sub-typing from valuable ancient specimens. It can be applied in addition or alternative to standard sequencing of the D-loop region in forensics, ancestry testing, and population studies, or where full-resolution mitochondrial genome sequencing is not feasible. PMID:24671218
AmericaPlex26: a SNaPshot multiplex system for genotyping the main human mitochondrial founder lineages of the Americas.

PubMed

Coutinho, Alexandra; Valverde, Guido; Fehren-Schmitz, Lars; Cooper, Alan; Barreto Romero, Maria Inés; Espinoza, Isabel Flores; Llamas, Bastien; Haak, Wolfgang

2014-01-01

Phylogeographic studies have described a reduced genetic diversity in Native American populations, indicative of one or more bottleneck events during the peopling and prehistory of the Americas. Classical sequencing approaches targeting the mitochondrial diversity have reported the presence of five major haplogroups, namely A, B, C, D and X, whereas the advent of complete mitochondrial genome sequencing has recently refined the number of founder lineages within the given diversity to 15 sub-haplogroups. We developed and optimized a SNaPshot assay to study the mitochondrial diversity in pre-Columbian Native American populations by simultaneous typing of 26 single nucleotide polymorphisms (SNPs) characterising Native American sub-haplogroups. Our assay proved to be highly sensitive with respect to starting concentrations of target DNA and could be applied successfully to a range of ancient human skeletal material from South America from various time periods. The AmericaPlex26 is a powerful assay with enhanced phylogenetic resolution that allows time- and cost-efficient mitochondrial DNA sub-typing from valuable ancient specimens. It can be applied in addition or alternative to standard sequencing of the D-loop region in forensics, ancestry testing, and population studies, or where full-resolution mitochondrial genome sequencing is not feasible.
Genetic diversity of Taenia hydatigena in the northern part of the West Bank, Palestine as determined by mitochondrial DNA sequences.

PubMed

Adwan, Kamel; Jayousi, Alaa; Abuseir, Sameh; Abbasi, Ibrahim; Adwan, Ghaleb; Jarrar, Naser

2018-06-26

Cysticercus tenuicollis is the metacestode of the canine tapeworm Taenia hydatigena, which has been reported in domestic and wild ruminants and causes veterinary and economic losses in the meat industry. This study was conducted to determine the sequence variation in the mitochondrial cytochrome c oxidase subunit 1 (cox1) gene in 20 isolates of T. hydatigena metacestodes (cysticercus tenuicollis) collected from the northern West Bank in Palestine. Nine haplotypes were detected, with one prevailing (55%). The total haplotype diversity (0.705) and the total nucleotide diversity (0.0045) indicated low genetic diversity among our isolates. Haplotype analysis showed a star-shaped network with a centrally positioned common haplotype. Tajima's D and Fu and Li's statistics for the cysticercus tenuicollis population of this region were negative, indicating deviations from neutrality, and both suggested recent population expansion. The findings of this study would greatly help to implement control and preventive measures for T. hydatigena larvae infection in Palestine.
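The reported haplotype diversity can be checked from the numbers in the abstract using Nei's unbiased estimator, h = n/(n-1) × (1 − Σ p_i²). The abstract gives 20 isolates, nine haplotypes, and one haplotype at 55% (11 isolates); the only split of the remaining nine isolates across eight haplotypes is one pair plus seven singletons. That split is an inference from the stated totals, not something the abstract lists, and the use of Nei's estimator is an assumption about how the figure was computed.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity from per-haplotype isolate counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


# 11 isolates share the dominant haplotype; the other 9 isolates must be
# spread over 8 haplotypes, i.e. one haplotype seen twice and seven seen once.
counts = [11, 2, 1, 1, 1, 1, 1, 1, 1]
h = haplotype_diversity(counts)
print(round(h, 3))
```

The result rounds to 0.705, which agrees with the total haplotype diversity stated in the abstract.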
The maize stripe virus major noncapsid protein messenger RNA transcripts contain heterogeneous leader sequences at their 5' termini.

PubMed

Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W

1993-12-01

Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.
Diversity and three-dimensional structures of the alpha Mcr of the methanogenic Archaea from the anoxic region of Tucuruí Lake, in Eastern Brazilian Amazonia

PubMed Central

Santana, Priscila Bessa; Junior, Rubens Ghilardi; Alves, Claudio Nahum; Silva, Jeronimo Lameira; McCulloch, John Anthony; Schneider, Maria Paula Cruz; da Costa da Silva, Artur

2012-01-01

Methanogenic archaeans are organisms of considerable ecological and biotechnological interest that produce methane through a restricted metabolic pathway, which culminates in the reaction catalyzed by the methyl-coenzyme M reductase (Mcr) enzyme and results in the release of methane. Using a metagenomic approach, the gene of the α subunit of mcr (mcrα) was isolated from a sediment sample from an anoxic zone, rich in decomposing organic material, obtained from the Tucuruí hydroelectric dam reservoir in eastern Brazilian Amazonia. The partial nucleotide sequences obtained were 83 to 95% similar to those available in databases, indicating a low diversity of archaeans in the reservoir. Two orders were identified: the Methanomicrobiales, and a unique Operational Taxonomic Unit (OTU) forming a clade with the Methanosarcinales according to low bootstrap values. Homology modeling was used to determine the three-dimensional (3D) structures; for this, the partial nucleotide sequences of mcrα were isolated and translated into their partial amino acid sequences. The 3D structures of the archaean Mcrα observed in the present study varied little, and presented approximately 70% identity in comparison with the Mcrα of Methanopyrus kandleri. The results demonstrated that the community of methanogenic archaeans of the anoxic C1 region of the Tucuruí reservoir is relatively homogeneous. PMID:22481885
Investigating intra-host and intra-herd sequence diversity of foot-and-mouth disease virus.

PubMed

King, David J; Freimanis, Graham L; Orton, Richard J; Waters, Ryan A; Haydon, Daniel T; King, Donald P

2016-10-01

Due to the poor fidelity of the enzymes involved in RNA genome replication, foot-and-mouth disease (FMD) virus samples comprise unique polymorphic populations. In this study, deep sequencing was utilised to characterise the diversity of FMD virus (FMDV) populations in 6 infected cattle present on a single farm during the series of outbreaks in the UK in 2007. A novel RT-PCR method was developed to amplify a 7.6 kb nucleotide fragment encompassing the polyprotein coding region of the FMDV genome. Illumina sequencing of each sample identified the fine polymorphic structures at each nucleotide position, from consensus-level changes to variants present at a 0.24% frequency. These data were used to investigate population dynamics of FMDV at both herd and host levels, to evaluate the impact of host on the viral swarm structure and to identify transmission links with viruses recovered from other farms in the same series of outbreaks. In 7 samples, from 6 different animals, a total of 5 consensus-level variants were identified, in addition to 104 sub-consensus variants of which 22 were shared between 2 or more animals. Further analysis revealed differences in swarm structures between samples derived from the same animal, suggesting the presence of distinct viral populations evolving independently at different lesion sites within the same infected animal. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Epstein-Barr Virus Latent Membrane Protein 1 Genetic Variability in Peripheral Blood B Cells and Oropharyngeal Fluids

PubMed Central

Renzette, Nicholas; Somasundaran, Mohan; Brewster, Frank; Coderre, James; Weiss, Eric R.; McManus, Margaret; Greenough, Thomas; Tabak, Barbara; Garber, Manuel; Kowalik, Timothy F.

2014-01-01

ABSTRACT We report the diversity of latent membrane protein 1 (LMP1) gene founder sequences and the level of Epstein-Barr virus (EBV) genome variability over time and across anatomic compartments by using virus genomes amplified directly from oropharyngeal wash specimens and peripheral blood B cells during acute infection and convalescence. The intrahost nucleotide variability of the founder virus was 0.02% across the region sequenced, and diversity increased significantly over time in the oropharyngeal compartment (P = 0.004). The LMP1 region showing the greatest level of variability in both compartments, and over time, was concentrated within the functional carboxyl-terminal activating regions 2 and 3 (CTAR2 and CTAR3). Interestingly, a deletion in a proline-rich repeat region (amino acids 274 to 289) of EBV commonly reported in EBV sequenced from cancer specimens was not observed in acute infectious mononucleosis (AIM) patients. Taken together, these data highlight the diversity in circulating EBV genomes and its potential importance in disease pathogenesis and vaccine design. IMPORTANCE This study is among the first to leverage an improved high-throughput deep-sequencing methodology to investigate directly from patient samples the degree of diversity in Epstein-Barr virus (EBV) populations and the extent to which viral genome diversity develops over time in the infected host. Significant variability of circulating EBV latent membrane protein 1 (LMP1) gene sequences was observed between cellular and oral wash samples, and this variability increased over time in oral wash samples. The significance of EBV genetic diversity in transmission and disease pathogenesis is discussed. PMID:24429365
Epstein-Barr virus latent membrane protein 1 genetic variability in peripheral blood B cells and oropharyngeal fluids.

PubMed

Renzette, Nicholas; Somasundaran, Mohan; Brewster, Frank; Coderre, James; Weiss, Eric R; McManus, Margaret; Greenough, Thomas; Tabak, Barbara; Garber, Manuel; Kowalik, Timothy F; Luzuriaga, Katherine

2014-04-01

We report the diversity of latent membrane protein 1 (LMP1) gene founder sequences and the level of Epstein-Barr virus (EBV) genome variability over time and across anatomic compartments by using virus genomes amplified directly from oropharyngeal wash specimens and peripheral blood B cells during acute infection and convalescence. The intrahost nucleotide variability of the founder virus was 0.02% across the region sequenced, and diversity increased significantly over time in the oropharyngeal compartment (P = 0.004). The LMP1 region showing the greatest level of variability in both compartments, and over time, was concentrated within the functional carboxyl-terminal activating regions 2 and 3 (CTAR2 and CTAR3). Interestingly, a deletion in a proline-rich repeat region (amino acids 274 to 289) of EBV commonly reported in EBV sequenced from cancer specimens was not observed in acute infectious mononucleosis (AIM) patients. Taken together, these data highlight the diversity in circulating EBV genomes and its potential importance in disease pathogenesis and vaccine design. This study is among the first to leverage an improved high-throughput deep-sequencing methodology to investigate directly from patient samples the degree of diversity in Epstein-Barr virus (EBV) populations and the extent to which viral genome diversity develops over time in the infected host. Significant variability of circulating EBV latent membrane protein 1 (LMP1) gene sequences was observed between cellular and oral wash samples, and this variability increased over time in oral wash samples. The significance of EBV genetic diversity in transmission and disease pathogenesis is discussed.
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
Limited Genetic Diversity Preceded Extinction of the Tasmanian Tiger

PubMed Central

Menzies, Brandon R.; Renfree, Marilyn B.; Heider, Thomas; Mayer, Frieder; Hildebrandt, Thomas B.; Pask, Andrew J.

2012-01-01

The Tasmanian tiger or thylacine was the largest carnivorous marsupial when Europeans first reached Australia. Sadly, the last known thylacine died in captivity in 1936. A recent analysis of the genome of the closely related and extant Tasmanian devil demonstrated limited genetic diversity between individuals. While a similar lack of diversity has been reported for the thylacine, this analysis was based on just two individuals. Here we report the sequencing of an additional 12 museum-archived specimens collected between 102 and 159 years ago. We examined a portion of the mitochondrial DNA hyper-variable control region and determined that all sequences were on average 99.5% identical at the nucleotide level. As a measure of accuracy we also sequenced mitochondrial DNA from a mother and two offspring. As expected, these samples were found to be 100% identical, validating our methods. We also used 454 sequencing to reconstruct 2.1 kilobases of the mitochondrial genome, which shared 99.91% identity with the two complete thylacine mitochondrial genomes published previously. Our thylacine genomic data also contained three highly divergent putative nuclear mitochondrial sequences, which grouped phylogenetically with the published thylacine mitochondrial homologs but contained 100-fold more polymorphisms than the conserved fragments. Together, our data suggest that the thylacine population in Tasmania had limited genetic diversity prior to its extinction, possibly as a result of their geographic isolation from mainland Australia approximately 10,000 years ago. PMID:22530022
Genome-wide nucleotide diversity of hatchery-reared Atlantic and Mediterranean strains of brown trout Salmo trutta compared to wild Mediterranean populations.

PubMed

Leitwein, M; Gagnaire, P-A; Desmarais, E; Guendouz, S; Rohmer, M; Berrebi, P; Guinand, B

2016-12-01

A genome-wide assessment of diversity is provided for wild Mediterranean brown trout Salmo trutta populations from headwater tributaries of the Orb River and from Atlantic and Mediterranean hatchery-reared strains that have been used for stocking. Double-digest restriction-site-associated DNA sequencing (dd-RADseq) was performed and the efficiency of de novo and reference-mapping approaches to obtain individual genotypes was compared. Large numbers of single nucleotide polymorphism (SNP) markers with similar genome-wide distributions were discovered using both approaches (196 639 v. 121 016 SNPs, respectively), with c. 80% of the loci detected de novo being also found with reference mapping, using the Atlantic salmon Salmo salar genome as a reference. Lower mapping density but larger nucleotide diversity (π) was generally observed near extremities of linkage groups, consistent with regions of residual tetrasomic inheritance observed in salmonids. Genome-wide diversity estimates revealed reduced polymorphism in hatchery strains (π = 0·0040 and π = 0·0029 in Atlantic and Mediterranean strains, respectively) compared to wild populations (π = 0·0049), a pattern that was congruent with allelic richness estimated from microsatellite markers. Finally, pronounced heterozygote deficiency was found in hatchery strains (Atlantic F IS = 0·18; Mediterranean F IS = 0·42), indicating that stocking practices may affect the genetic diversity in wild populations. These new genomic resources will provide important tools to define better conservation strategies in S. trutta. © 2016 The Fisheries Society of the British Isles.
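The size of the reduction in polymorphism follows directly from the π values quoted in this abstract. The short sketch below works the arithmetic; the helper name is illustrative, and the comparison as a simple ratio to the wild estimate is an assumption about how a reader might summarise the figures, not a statistic reported by the authors.

```python
def relative_reduction(reference, value):
    """Fractional reduction of value relative to reference (0.25 means 25% lower)."""
    return 1.0 - value / reference


# Genome-wide nucleotide diversity (pi) as quoted in the abstract.
pi_wild = 0.0049
pi_atlantic = 0.0040
pi_mediterranean = 0.0029

atlantic_loss = relative_reduction(pi_wild, pi_atlantic)
mediterranean_loss = relative_reduction(pi_wild, pi_mediterranean)

print(f"Atlantic hatchery strain: {atlantic_loss:.1%} lower than wild")
print(f"Mediterranean hatchery strain: {mediterranean_loss:.1%} lower than wild")
```

On these figures the Atlantic strain carries roughly 18% less diversity than the wild populations and the Mediterranean strain roughly 41% less.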
RECOVIR Software for Identifying Viruses

NASA Technical Reports Server (NTRS)

Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui

2013-01-01

Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.
Sequence and phylogenetic analysis of chicken anaemia virus obtained from backyard and commercial chickens in Nigeria.

PubMed

Oluwayelu, D O; Todd, D; Olaleye, O D

2008-12-01

This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequence of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.
Identification and phylogenetic diversity of parvovirus circulating in commercial chicken and turkey flocks in Croatia.

PubMed

Bidin, M; Lojkić, I; Bidin, Z; Tiljar, M; Majnarić, D

2011-12-01

Phylogenetic diversity of parvovirus detected in commercial chicken and turkey flocks is described. Nine chicken and six turkey flocks from Croatian farms were tested for parvovirus presence. Intestinal samples from one turkey and seven chicken flocks were found positive, and were sequenced. Natural parvovirus infection was more frequently detected in chickens than in turkeys examined in this study. Sequence analysis of 400 nucleotide fragments of the nonstructural gene (NS) showed that our sequences had more similarity with chicken parvovirus (ChPV) (92.3%-99.7%) than turkey parvovirus (TuPV) (89.5%-98.9%) strains. Phylogenetic analysis grouped our sequences in two clades. Also, the higher prevalence of ChPV than TuPV in tested flocks was defined. The necropsy findings suggested a malabsorption syndrome followed by a preascitic condition. Further research of parvovirus infection, pathogenesis, and the possibility of its association with poult enteritis and mortality syndrome (PEMS) and runting and stunting syndrome (RSS) is needed to clarify its significance as an agent of enteric disease.
Nucleotide Diversity and Selection Signature in the Domesticated Silkworm, Bombyx mori, and Wild Silkworm, Bombyx mandarina

PubMed Central

Guo, Yi; Shen, Yi-Hong; Sun, Wei; Kishino, Hirohisa; Xiang, Zhong-Huai; Zhang, Ze

2011-01-01

To investigate the patterns of nucleotide diversity in domesticated silkworm, Bombyx mori L. (Lepidoptera: Bombycidae) and its wild relative, Chinese wild silkworm, Bombyx mandarina Moore, we sequenced nine nuclear genes. Neutrality tests and coalescent simulations for these genes were performed to assess bottleneck intensity and selection signature; linkage disequilibrium (LD) within and between loci was used to investigate allele association. As a result, B. mori lost 33–49% of nucleotide diversity relative to wild silkworm, which is similar to the loss levels found in major cultivated crops. Diversity of B. mori is significantly lower than that of B. mandarina measured as πtotal (0.01166 vs. 0.1741) or θW (0.01124 vs. 0.02206). Bottleneck intensity of domesticated silkworm is 1.5 (in terms of k = Nb/d, where Nb is the bottleneck population size and d is the bottleneck duration) with different durations. Gene DefA showed a signature of artificial selection by all analysis methods and may have experienced strong artificial selection in B. mori during domestication. For nine loci, both curves of LD decay rapidly within 200 bp and drop slowly when distance is > 200 bp, although that of B. mori decays more slowly than that of B. mandarina at the loci investigated. However, LD could not be estimated at DefA in B. mori and at ER in both silkworms. Elevated LD observed in B. mori may be an indicator of selection and demographic events. PMID:22239062
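The two diversity measures quoted in this record, π and Watterson's θW, can be illustrated with a minimal Python sketch. It assumes gap-free aligned sequences of equal length and uses a made-up toy alignment; the function names are hypothetical and this is not the authors' pipeline.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """theta_W per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1 / i for i in range(1, n))
    return segregating / (a_n * length)


def relative_loss(domesticated, wild):
    """Fraction of diversity lost in the domesticated sample."""
    return 1 - domesticated / wild


# Toy alignment (illustrative only, not data from the study).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
print(nucleotide_diversity(toy))
print(watterson_theta(toy))
print(relative_loss(0.01124, 0.02206))  # theta_W values quoted in the abstract
```

Applied to the θW values quoted above, `relative_loss` gives roughly 0.49, the upper end of the reported 33–49% range.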

Detection and molecular characterization of tomato yellow leaf curl virus naturally infecting Lycopersicon esculentum in Egypt.

PubMed

Rabie, M; Ratti, C; Abdel Aleem, E; Fattouh, F

Tomato yellow leaf curl virus (TYLCV) infections of tomato crops in Egypt were widespread in 2014. Infected symptomatic tomato plants from different governorates were sampled. TYLCV strains Israel and Mild (TYLCV-IL, TYLCV-Mild) were identified by multiplex and real-time PCR. In addition, nucleotide sequence analysis of the V1 and V2 protein genes revealed ten TYLCV Egyptian isolates (TY1 to TY10). Phylogenetic analysis showed their high degree of relatedness with the TYLCV-IL Jordan isolate (98%). Here we report the complete nucleotide sequence of the TYLCV Egyptian isolate TY10, sampled from El Beheira. A high degree of similarity to other previously reported Egyptian isolates and isolates from Jordan and Japan reflects the importance of phylogenetic analysis in monitoring virus genetic diversity and possibilities for divergence of more virulent strains or genotypes.
Detection of microRNAs in color space.

PubMed

Marco, Antonio; Griffiths-Jones, Sam

2012-02-01

Deep sequencing provides inexpensive opportunities to characterize the transcriptional diversity of known genomes. The AB SOLiD technology generates millions of short sequencing reads in color-space; that is, the raw data is a sequence of colors, where each color represents 2 nt and each nucleotide is represented by two consecutive colors. This strategy is purported to have several advantages, including increased ability to distinguish sequencing errors from polymorphisms. Several programs have been developed to map short reads to genomes in color space. However, a number of previously unexplored technical issues arise when using SOLiD technology to characterize microRNAs. Here we explore these technical difficulties. First, since the sequenced reads are longer than the biological sequences, every read is expected to contain linker fragments. The color-calling error rate increases toward the 3' end of the read such that recognizing the linker sequence for removal becomes problematic. Second, mapping in color space may lead to the loss of the first nucleotide of each read. We propose a sequential trimming and mapping approach to map small RNAs. Using our strategy, we reanalyze three published insect small RNA deep sequencing datasets and characterize 22 new microRNAs. A bash shell script to perform the sequential trimming and mapping procedure, called SeqTrimMap, is available at: http://www.mirbase.org/tools/seqtrimmap/. Contact: antonio.marco@manchester.ac.uk. Supplementary data are available at Bioinformatics online.
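The color-space idea described in this record can be sketched in Python. The sketch assumes the usual two-base encoding in which a color is the XOR of 2-bit codes for adjacent nucleotides (A=0, C=1, G=2, T=3); the function names and the example read are illustrative and are not taken from SeqTrimMap.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Return (first base, list of colors); one color per adjacent base pair."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the nucleotide sequence; the known first base is required."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


first, colors = encode("ATGGCA")
print(first, colors)          # colors for the illustrative read
print(decode(first, colors))  # round trip back to nucleotides

# A single miscalled color changes every base decoded after it.
corrupted = colors.copy()
corrupted[2] = 1
print(decode(first, corrupted))
```

Because decoding starts from a known first base, a read whose leading base is lost cannot be translated unambiguously, which is one way to read the first-nucleotide problem the abstract mentions.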
Tales of diversity: Genomic and morphological characteristics of forty-six Arthrobacter phages

PubMed Central

Adair, Tamarah L.; Afram, Patricia; Allen, Katherine G.; Archambault, Megan L.; Aziz, Rahat M.; Bagnasco, Filippa G.; Ball, Sarah L.; Barrett, Natalie A.; Benjamin, Robert C.; Blasi, Christopher J.; Borst, Katherine; Braun, Mary A.; Broomell, Haley; Brown, Conner B.; Brynell, Zachary S.; Bue, Ashley B.; Burke, Sydney O.; Casazza, William; Cautela, Julia A.; Chen, Kevin; Chimalakonda, Nitish S.; Chudoff, Dylan; Connor, Jade A.; Cross, Trevor S.; Curtis, Kyra N.; Dahlke, Jessica A.; Deaton, Bethany M.; Degroote, Sarah J.; DeNigris, Danielle M.; DeRuff, Katherine C.; Dolan, Milan; Dunbar, David; Egan, Marisa S.; Evans, Daniel R.; Fahnestock, Abby K.; Farooq, Amal; Finn, Garrett; Fratus, Christopher R.; Gaffney, Bobby L.; Garlena, Rebecca A.; Garrigan, Kelly E.; Gibbon, Bryan C.; Goedde, Michael A.; Guerrero Bustamante, Carlos A.; Harrison, Melinda; Hartwell, Megan C.; Heckman, Emily L.; Huang, Jennifer; Hughes, Lee E.; Hyduchak, Kathryn M.; Jacob, Aswathi E.; Kaku, Machika; Karstens, Allen W.; Kenna, Margaret A.; Khetarpal, Susheel; King, Rodney A.; Kobokovich, Amanda L.; Kolev, Hannah; Konde, Sai A.; Kriese, Elizabeth; Lamey, Morgan E.; Lantz, Carter N.; Lapin, Jonathan S.; Lawson, Temiloluwa O.; Lee, In Young; Lee, Scott M.; Lee-Soety, Julia Y.; Lehmann, Emily M.; London, Shawn C.; Lopez, A. Javier; Lynch, Kelly C.; Mageeney, Catherine M.; Martynyuk, Tetyana; Mathew, Kevin J.; Mavrich, Travis N.; McDaniel, Christopher M.; McDonald, Hannah; McManus, C. Joel; Medrano, Jessica E.; Mele, Francis E.; Menninger, Jennifer E.; Miller, Sierra N.; Minick, Josephine E.; Nabua, Courtney T.; Napoli, Caroline K.; Nkangabwa, Martha; Oates, Elizabeth A.; Ott, Cassandra T.; Pellerino, Sarah K.; Pinamont, William J.; Pirnie, Ross T.; Pizzorno, Marie C.; Plautz, Emilee J.; Pope, Welkin H.; Pruett, Katelyn M.; Rickstrew, Gabbi; Rimple, Patrick A.; Rinehart, Claire A.; Robinson, Kayla M.; Rose, Victoria A.; Russell, Daniel A.; Schick, Amelia M.; Schlossman, Julia; Schneider, Victoria M.; Sells, Chloe A.; Sieker, Jeremy W.; Silva, Morgan P.; Silvi, Marissa M.; Simon, Stephanie E.; Staples, Amanda K.; Steed, Isabelle L.; Stowe, Emily L.; Stueven, Noah A.; Swartz, Porter T.; Sweet, Emma A.; Sweetman, Abigail T.; Tender, Corrina; Terry, Katrina; Thomas, Chrystal; Thomas, Daniel S.; Thompson, Allison R.; Vanderveen, Lorianna; Varma, Rohan; Vaught, Hannah L.; Vo, Quynh D.; Vonberg, Zachary T.; Ware, Vassie C.; Warrad, Yasmene M.; Wathen, Kaitlyn E.; Weinstein, Jonathan L.; Wyper, Jacqueline F.; Yankauskas, Jakob R.; Zhang, Christine

2017-01-01

The vast bacteriophage population harbors an immense reservoir of genetic information. Almost 2000 phage genomes have been sequenced from phages infecting hosts in the phylum Actinobacteria, and analysis of these genomes reveals substantial diversity, pervasive mosaicism, and novel mechanisms for phage replication and lysogeny. Here, we describe the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain. These phages include representatives of all three virion morphologies, and Jasmine is the first sequenced podovirus of an actinobacterial host. The phages also span considerable sequence diversity, and can be grouped into 10 clusters according to their nucleotide diversity, and two singletons each with no close relatives. However, the clusters/singletons appear to be genomically well separated from each other, and relatively few genes are shared between clusters. Genome size varies from among the smallest of siphoviral phages (15,319 bp) to over 70 kbp, and G+C contents range from 45–68%, compared to 63.4% for the host genome. Although temperate phages are common among other actinobacterial hosts, these Arthrobacter phages are primarily lytic, and only the singleton Galaxy is likely temperate. PMID:28715480
Molecular characterization of domestic and exotic potato virus S isolates and a global analysis of genomic sequences.

PubMed

Lin, Y-H; Abad, J A; Maroon-Lango, C J; Perry, K L; Pappu, H R

2014-08-01

Five potato virus S (PVS) isolates from the USA and three isolates from Chile were characterized based on biological and molecular properties to delineate these PVS isolates into either ordinary (PVS(O)) or Andean (PVS(A)) strains. Five isolates - 41956, Cosimar, Galaxy, ND2492-2R, and Q1 - were considered ordinary strains, as they induced local lesions on the inoculated leaves of Chenopodium quinoa, whereas the remaining three (FL206-1D, Q3, and Q5) failed to induce symptoms. Considerable variability of symptom expression and severity was observed among these isolates when tested on additional indicator plants and potato cv. Defender. Additionally, all eight isolates were characterized by determining the nucleotide sequences of their coat protein (CP) genes. Based on their biological and genetic properties, the 41956, Cosimar, Galaxy, ND2492-2R, and Q1 isolates were identified as PVS(O). PVS-FL206-1D and the two Chilean isolates (PVS-Q3 and PVS-Q5) could not be identified based on phenotype alone; however, based on sequence comparisons, PVS-FL206-1D was identified as PVS(O), while Q3 and Q5 clustered with known PVS(A) strains. C. quinoa may not be a reliable indicator for distinguishing PVS strains. Sequences of the CP gene should be used as an additional criterion for delineating PVS strains. A global genetic analysis of known PVS sequences from GenBank was carried out to investigate nucleotide substitution, population selection, and genetic recombination and to assess the genetic diversity and evolution of PVS. A higher degree of nucleotide diversity (π value) of the CP gene compared to that of the 11K gene suggested greater variation in the CP gene. When comparing PVS(A) and PVS(O) strains, a higher π value was found for PVS(A). Statistical tests of the neutrality hypothesis indicated a negative selection pressure on both the CP and 11K proteins of PVS(O), whereas a balancing selection pressure was found on PVS(A).
Phylogenetic Distribution of the Capsid Assembly Protein Gene (g20) of Cyanophages in Paddy Floodwaters in Northeast China

PubMed Central

Jing, Ruiyong; Liu, Junjie; Yu, Zhenhua; Liu, Xiaobing; Wang, Guanghua

2014-01-01

Numerous studies have revealed the high diversity of cyanophages in marine and freshwater environments, but little is currently known about the diversity of cyanophages in paddy fields, particularly in Northeast (NE) China. To elucidate the genetic diversity of cyanophages in paddy floodwaters in NE China, viral capsid assembly protein gene (g20) sequences from five floodwater samples were amplified with the primers CPS1 and CPS8. Denaturing gradient gel electrophoresis (DGGE) was applied to distinguish different g20 clones. In total, 54 clones differing in g20 nucleotide sequences were obtained in this study. Phylogenetic analysis showed that the distribution of g20 sequences in this study was different from that in Japanese paddy fields, and all the sequences were grouped into Clusters α, β, γ and ε. Within Clusters α and β, three new small clusters (PFW-VII∼-IX) were identified. UniFrac analysis of g20 clone assemblages demonstrated that the community compositions of cyanophage varied among marine, lake and paddy field environments. In paddy floodwater, community compositions of cyanophage were also different between NE China and Japan. PMID:24533125
Lactobacillus strain diversity based on partial hsp60 gene sequences and design of PCR-restriction fragment length polymorphism assays for species identification and differentiation.

PubMed

Blaiotta, Giuseppe; Fusco, Vincenzina; Ercolini, Danilo; Aponte, Maria; Pepe, Olimpia; Villani, Francesco

2008-01-01

A phylogenetic tree showing diversities among 116 partial (499-bp) Lactobacillus hsp60 (groEL, encoding a 60-kDa heat shock protein) nucleotide sequences was obtained and compared to those previously described for 16S rRNA and tuf gene sequences. The topology of the tree produced in this study showed a Lactobacillus species distribution similar, but not identical, to those previously reported. However, according to the most recent systematic studies, a clear differentiation of 43 single-species clusters was detected/identified among the sequences analyzed. The slightly higher variability of the hsp60 nucleotide sequences than of the 16S rRNA sequences offers better opportunities to design or develop molecular assays allowing identification and differentiation of either distant or very closely related Lactobacillus species. Therefore, our results suggest that hsp60 can be considered an excellent molecular marker for inferring the taxonomy and phylogeny of members of the genus Lactobacillus and that the chosen primers can be used in a simple PCR procedure allowing the direct sequencing of the hsp60 fragments. Moreover, in this study we performed a computer-aided restriction endonuclease analysis of all 499-bp hsp60 partial sequences and we showed that the PCR-restriction fragment length polymorphism (RFLP) patterns obtainable by using both endonucleases AluI and TacI (in separate reactions) can allow identification and differentiation of all 43 Lactobacillus species considered, with the exception of the pair L. plantarum/L. pentosus. However, the latter species can be differentiated by further analysis with Sau3AI or MseI. The hsp60 PCR-RFLP approach was efficiently applied to identify and to differentiate a total of 110 wild Lactobacillus strains (including closely related species, such as L. casei and L. rhamnosus or L. plantarum and L. pentosus) isolated from cheese and dry-fermented sausages.
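The computer-aided restriction analysis mentioned in this record can be illustrated with a short Python sketch of an in-silico digest. The recognition sites used (AluI AG^CT, MseI T^TAA, Sau3AI ^GATC) are the standard ones for those enzymes; TacI is left out because its site is not given in the abstract. The function name and test sequence are hypothetical, and the sketch handles linear fragments with palindromic sites only.

```python
# Enzyme -> (recognition site, cut position within the site on the top strand).
ENZYMES = {
    "AluI": ("AGCT", 2),    # AG^CT
    "MseI": ("TTAA", 1),    # T^TAA
    "Sau3AI": ("GATC", 0),  # ^GATC
}


def digest(seq, enzyme):
    """Return fragment lengths from cutting a linear sequence with one enzyme."""
    site, offset = ENZYMES[enzyme]
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls at a sequence end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Made-up 20-nt sequence containing one site for each enzyme.
amplicon = "AAAGCTTTTTAACCGATCGG"
for name in ENZYMES:
    print(name, digest(amplicon, name))
```

Two sequences whose fragment-length lists differ for some enzyme would give distinguishable RFLP patterns; identical lists for every enzyme tried (as reported for the L. plantarum/L. pentosus pair with AluI and TacI) call for an additional enzyme.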
Lactobacillus Strain Diversity Based on Partial hsp60 Gene Sequences and Design of PCR-Restriction Fragment Length Polymorphism Assays for Species Identification and Differentiation

PubMed Central

Blaiotta, Giuseppe; Fusco, Vincenzina; Ercolini, Danilo; Aponte, Maria; Pepe, Olimpia; Villani, Francesco

2008-01-01

A phylogenetic tree showing diversities among 116 partial (499-bp) Lactobacillus hsp60 (groEL, encoding a 60-kDa heat shock protein) nucleotide sequences was obtained and compared to those previously described for 16S rRNA and tuf gene sequences. The topology of the tree produced in this study showed a Lactobacillus species distribution similar, but not identical, to those previously reported. However, according to the most recent systematic studies, a clear differentiation of 43 single-species clusters was detected/identified among the sequences analyzed. The slightly higher variability of the hsp60 nucleotide sequences than of the 16S rRNA sequences offers better opportunities to design or develop molecular assays allowing identification and differentiation of either distant or very closely related Lactobacillus species. Therefore, our results suggest that hsp60 can be considered an excellent molecular marker for inferring the taxonomy and phylogeny of members of the genus Lactobacillus and that the chosen primers can be used in a simple PCR procedure allowing the direct sequencing of the hsp60 fragments. Moreover, in this study we performed a computer-aided restriction endonuclease analysis of all 499-bp hsp60 partial sequences and we showed that the PCR-restriction fragment length polymorphism (RFLP) patterns obtainable by using both endonucleases AluI and TacI (in separate reactions) can allow identification and differentiation of all 43 Lactobacillus species considered, with the exception of the pair L. plantarum/L. pentosus. However, the latter species can be differentiated by further analysis with Sau3AI or MseI. The hsp60 PCR-RFLP approach was efficiently applied to identify and to differentiate a total of 110 wild Lactobacillus strains (including closely related species, such as L. casei and L. rhamnosus or L. plantarum and L. pentosus) isolated from cheese and dry-fermented sausages. PMID:17993558
Inference of purifying and positive selection in three subspecies of chimpanzees (Pan troglodytes) from exome sequencing.

PubMed

Bataillon, Thomas; Duan, Jinjie; Hvilsom, Christina; Jin, Xin; Li, Yingrui; Skov, Laurits; Glemin, Sylvain; Munch, Kasper; Jiang, Tao; Qian, Yu; Hobolth, Asger; Wang, Jun; Mailund, Thomas; Siegismund, Hans R; Schierup, Mikkel H

2015-03-30

We study genome-wide nucleotide diversity in three subspecies of extant chimpanzees using exome capture. After strict filtering, Single Nucleotide Polymorphisms and indels were called and genotyped for greater than 50% of exons at a mean coverage of 35× per individual. Central chimpanzees (Pan troglodytes troglodytes) are the most polymorphic (nucleotide diversity, θw = 0.0023 per site) followed by Eastern (P. t. schweinfurthii) chimpanzees (θw = 0.0016) and Western (P. t. verus) chimpanzees (θw = 0.0008). A demographic scenario of divergence without gene flow fits the patterns of autosomal synonymous nucleotide diversity well except for a signal of recent gene flow from Western into Eastern chimpanzees. The striking contrast in X-linked versus autosomal polymorphism and divergence previously reported in Central chimpanzees is also found in Eastern and Western chimpanzees. We show that the direction of selection statistic exhibits a strong nonmonotonic relationship with the strength of purifying selection S, making it inappropriate for estimating S. We instead use counts in synonymous versus nonsynonymous frequency classes to infer the distribution of S coefficients acting on nonsynonymous mutations in each subspecies. The strength of purifying selection we infer is congruent with the differences in effective sizes of each subspecies: Central chimpanzees are undergoing the strongest purifying selection followed by Eastern and Western chimpanzees. Coding indels show stronger selection against indels changing the reading frame than observed in human populations. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
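For readers unfamiliar with the direction of selection (DoS) statistic discussed in this record, a minimal Python sketch follows. It assumes the commonly used definition DoS = Dn/(Dn+Ds) − Pn/(Pn+Ps), where D and P are counts of fixed differences and polymorphisms and n/s denote nonsynonymous/synonymous changes; the abstract itself does not spell out the formula, and the counts below are invented for illustration.

```python
def direction_of_selection(dn, ds, pn, ps):
    """DoS from counts of nonsynonymous/synonymous divergence and polymorphism."""
    if dn + ds == 0 or pn + ps == 0:
        raise ValueError("need at least one divergent and one polymorphic site")
    return dn / (dn + ds) - pn / (pn + ps)


# Hypothetical counts: an excess of nonsynonymous polymorphism gives a
# negative value, the pattern expected under purifying selection.
print(direction_of_selection(dn=30, ds=70, pn=40, ps=60))
```

The abstract's point is that this summary does not vary monotonically with the strength of purifying selection S, which is why the authors fit frequency-class counts instead of reading S off DoS.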
Deep sequencing of the Trypanosoma cruzi GP63 surface proteases reveals diversity and diversifying selection among chronic and congenital Chagas disease patients.

PubMed

Llewellyn, Martin S; Messenger, Louisa A; Luquetti, Alejandro O; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B N; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A

2015-04-01

Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. A 450 bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target, ND5, was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases.
Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I gene family suggests a link between genetic diversity within this gene family and survival in the mammalian host.
Abundance and Genetic Diversity of Microbial Polygalacturonase and Pectate Lyase in the Sheep Rumen Ecosystem

PubMed Central

Wang, Yaru; Luo, Huiying; Huang, Huoqing; Shi, Pengjun; Bai, Yingguo; Yang, Peilong; Yao, Bin

2012-01-01

Background Efficient degradation of pectin in the rumen is necessary for plant-based feed utilization. The objective of this study was to characterize the diversity, abundance, and functions of pectinases from microorganisms in the sheep rumen. Methodology/Principal Findings A total of 103 unique fragments of polygalacturonase (PF00295) and pectate lyase (PF00544 and PF09492) genes were retrieved from microbial DNA in the rumen of a Small Tail Han sheep, and 66% of the sequences of these fragments had low identities (<65%) with known sequences. Phylogenetic tree building separated the PF00295, PF00544, and PF09492 sequences into five, three, and three clades, respectively. Cellulolytic and noncellulolytic Butyrivibrio, Prevotella, and Fibrobacter species were the major sources of the pectinases. The two most abundant pectate lyase genes were cloned, and their protein products, expressed in Escherichia coli, were characterized. Both enzymes probably act extracellularly as their nucleotide sequences contained signal sequences, and they had optimal activities at the ruminal physiological temperature and complementary pH-dependent activity profiles. Conclusion/Significance This study reveals the specificity, diversity, and abundance of pectinases in the rumen ecosystem and provides two additional ruminal pectinases for potential industrial use under physiological conditions. PMID:22815874
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-10-29

... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...
A novel rhabdovirus, related to Merida virus, in field-collected mosquitoes from Anatolia and Thrace.

PubMed

Ergünay, Koray; Brinkmann, Annika; Litzba, Nadine; Günay, Filiz; Kar, Sırrı; Öter, Kerem; Örsten, Serra; Sarıkaya, Yasemen; Alten, Bülent; Nitsche, Andreas; Linton, Yvonne-Marie

2017-07-01

Next-generation sequencing technologies have significantly facilitated the discovery of novel viruses, and metagenomic surveillance of arthropods has enabled exploration of the diversity of novel or known viral agents. We have identified a novel rhabdovirus that is genetically related to the recently described Merida virus via next-generation sequencing in a mosquito pool from Thrace. The complete viral genome contains 11,798 nucleotides with 83% genome-wide nucleotide sequence similarity to Merida virus. Five major putative open reading frames that follow the canonical rhabdovirus genome organization were identified. A total of 1380 mosquitoes comprising 13 species, collected from Thrace and the Mediterranean and Aegean regions of Anatolia were screened for the novel virus using primers based on the N and L genes of the prototype genome. Eight positive pools (6.2%) exclusively comprised Culex pipiens sensu lato specimens originating from all study regions. Infections were observed in pools with female as well as male or mixed-sex individuals. The overall and Cx. pipiens-specific minimal infection rates were calculated to be 5.7 and 14.8, respectively. Sequencing of the PCR products revealed marked diversity within a portion of the N gene, with up to 4% divergence and distinct amino acid substitutions that were unrelated to the collection site. Phylogenetic analysis of the complete and partial viral polymerase (L gene) amino acid sequences placed the novel virus and Merida virus in a distinct group, indicating that these strains are closely related. The strain is tentatively named "Merida-like virus Turkey". Studies are underway to isolate and further explore the host range and distribution of this new strain.
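The minimal infection rates quoted in this record can be reproduced approximately with a small Python sketch. It assumes the conventional definition (positive pools divided by mosquitoes tested, expressed per 1,000) and at least one infected individual per positive pool; the function name is hypothetical and the abstract does not state the exact formula used.

```python
def minimum_infection_rate(positive_pools, mosquitoes_tested, per=1000):
    """Positive pools per `per` mosquitoes tested."""
    return positive_pools / mosquitoes_tested * per


# Eight positive pools among 1380 mosquitoes, as stated in the abstract.
overall = minimum_infection_rate(8, 1380)
print(f"{overall:.2f} per 1000")
```

With the abstract's totals this gives about 5.8 per 1,000, close to the reported overall value of 5.7; the Cx. pipiens-specific figure cannot be recomputed because the number of Cx. pipiens specimens is not given.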
Nucleotide substitutions in dengue virus serotypes from Asian and American countries: insights into intracodon recombination and purifying selection

PubMed Central

2013-01-01

Background Dengue virus (DENV) infection represents a significant public health problem in many subtropical and tropical countries. Although genetically closely related, the four serotypes of DENV differ in antigenicity for which cross protection among serotypes is limited. It is also believed that both multi-serotype infection as well as the evolution of viral antigenicity may have confounding effects in increased dengue epidemics. Numerous studies have been performed that investigated genetic diversity of DENV, but the precise mechanism(s) of dengue virus evolution are not well understood. Results We investigated genome-wide genetic diversity and nucleotide substitution patterns in the four serotypes among samples collected from different countries in Asia and Central and South America and sequenced as part of the Genome Sequencing Center for Infectious Diseases at the Broad Institute. We applied bioinformatics, statistical and coalescent simulation methods to investigate diversity of codon sequences of DENV samples representing the four serotypes. We show that fixation of nucleotide substitutions is more prominent among the inter-continental isolates (Asian and American) of serotypes 1, 2 and 3 compared to serotype 4 isolates (South and Central America) and are distributed in a non-random manner among the genes encoded by the virus. Nearly one third of the negatively selected sites are associated with fixed mutation sites within serotypes. Our results further show that of all the sites showing evidence of recombination, the majority (~84%) correspond to sites under purifying selection in the four serotypes. The analysis further shows that genetic recombination occurs within specific codons, albeit with low frequency (< 5% of all recombination sites) throughout the DENV genome of the four serotypes and reveals significant enrichment (p < 0.05) among sites under purifying selection in the virus. 
Conclusion The study provides the first evidence for intracodon recombination in DENV and suggests that within codons, genetic recombination has a significant role in maintaining extensive purifying selection of DENV in natural populations. Our study also suggests that fixation of beneficial mutations may lead to virus evolution via translational selection of specific sites in the DENV genome. PMID:23410119
A gp41-based heteroduplex mobility assay provides rapid and accurate assessment of intrasubtype epidemiological linkage in HIV type 1 heterosexual transmission pairs.

PubMed

Manigart, Olivier; Boeras, Debrah I; Karita, Etienne; Hawkins, Paulina A; Vwalika, Cheswa; Makombe, Nathan; Mulenga, Joseph; Derdeyn, Cynthia A; Allen, Susan; Hunter, Eric

2012-12-01

A critical step in HIV-1 transmission studies is the rapid and accurate identification of epidemiologically linked transmission pairs. To date, this has been accomplished by comparison of polymerase chain reaction (PCR)-amplified nucleotide sequences from potential transmission pairs, which can be cost-prohibitive for use in resource-limited settings. Here we describe a rapid, cost-effective approach to determine transmission linkage based on the heteroduplex mobility assay (HMA), and validate this approach by comparison to nucleotide sequencing. A total of 102 HIV-1-infected Zambian and Rwandan couples, with known linkage, were analyzed by gp41-HMA. A 400-base pair fragment within the envelope gp41 region of the HIV proviral genome was PCR amplified and HMA was applied to both partners' amplicons separately (autologous) and as a mixture (heterologous). If the diversity between gp41 sequences was low (<5%), a homoduplex was observed upon gel electrophoresis and the transmission was characterized as having occurred between partners (linked). If a new heteroduplex formed within the heterologous migration, the transmission was determined to be unlinked. Initial blind validation of gp41-HMA demonstrated 90% concordance between HMA and sequencing, with 100% concordance in the case of linked transmissions. Following validation, 25 newly infected partners in Kigali and 12 in Lusaka were evaluated prospectively using both HMA and nucleotide sequences. Concordant results were obtained in all but one case (97.3%). The gp41-HMA technique is a reliable and feasible tool to detect linked transmissions in the field. All identified unlinked results should be confirmed by sequence analyses.
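The <5% diversity criterion in this record has a simple sequence-based analogue, sketched below in Python. It assumes two pre-aligned gp41 sequences and uses the uncorrected pairwise distance (proportion of differing sites, ignoring gap columns); the function names and sequences are made up, and the sketch is not the HMA gel readout itself.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def classify_pair(a, b, threshold=0.05):
    """Label a couple 'linked' when diversity is below the threshold."""
    return "linked" if p_distance(a, b) < threshold else "unlinked"


donor = "ACGTACGTAC" * 4            # 40-nt toy sequence
close = donor[:-1] + "T"            # 1 difference  -> 2.5%
distant = "TTTT" + donor[4:]        # 3 differences -> 7.5%
print(classify_pair(donor, close), classify_pair(donor, distant))

# Prospective concordance reported in the abstract: all but one of 25 + 12 cases.
print(round(36 / 37 * 100, 1))
```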
In silico identification of conserved microRNAs in large number of diverse plant species

PubMed Central

Sunkar, Ramanjulu; Jagadeeswaran, Guru

2008-01-01

Background MicroRNAs (miRNAs) are recently discovered small non-coding RNAs that play pivotal roles in gene expression, specifically at the post-transcriptional level in plants and animals. Identification of miRNAs in a large number of diverse plant species is important to understand the evolution of miRNAs and miRNA-targeted gene regulations. Nowadays, publicly available databases play a central role in in silico biology. Because at least ~21 miRNA families are conserved in higher plants, a homology-based search using these databases can help identify orthologs or paralogs in plants. Results We searched all publicly available nucleotide databases of genome survey sequences (GSS), high-throughput genomics sequences (HTGS), expressed sequence tags (ESTs) and nonredundant (NR) nucleotides and identified 682 miRNAs in 155 diverse plant species. We found more than 15 conserved miRNA families in 11 plant species, 10 to 14 families in 10 plant species and 5 to 9 families in 29 plant species. Nineteen conserved miRNA families were identified in important model legumes such as Medicago, Lotus and soybean. Five miRNA families – miR319, miR156/157, miR169, miR165/166 and miR394 – were found in 51, 45, 41, 40 and 40 diverse plant species, respectively. miR403 homologs were found in 16 dicots, whereas miR437 and miR444 homologs, as well as the miR396d/e variant of the miR396 family, were found only in monocots, thus providing large-scale authenticity for the dicot- and monocot-specific miRNAs. Furthermore, we provide computational and/or experimental evidence for the conservation of 6 newly found Arabidopsis miRNA homologs (miR158, miR391, miR824, miR825, miR827 and miR840) and 2 small RNAs (small-85 and small-87) in Brassica spp. Conclusion Using all publicly available nucleotide databases, 682 miRNAs were identified in 155 diverse plant species.
By combining the expression analysis with the computational approach, we found that 6 miRNAs and 2 small RNAs that have been identified only in Arabidopsis thus far, are also conserved in Brassica spp. These findings will be useful for tracing the evolution of small RNAs by examining their expression in common ancestors of the Arabidopsis-Brassica lineage. PMID:18416839
A new approach for detecting adventitious viruses shows Sf-rhabdovirus-negative Sf-RVN cells are suitable for safe biologicals production.

PubMed

Geisler, Christoph

2018-02-07

Adventitious viral contamination in cell substrates used for biologicals production is a major safety concern. A powerful new approach that can be used to identify adventitious viruses is a combination of bioinformatics tools with massively parallel sequencing technology. Typically, this involves mapping or BLASTN searching individual reads against viral nucleotide databases. Although extremely sensitive for known viruses, this approach can easily miss viruses that are too dissimilar to viruses in the database. Moreover, it is computationally intensive and requires reference cell genome databases. To avoid these drawbacks, we set out to develop an alternative approach. We reasoned that searching genome and transcriptome assemblies for adventitious viral contaminants using TBLASTN with a compact viral protein database covering extant viral diversity as the query could be fast and sensitive without a requirement for high performance computing hardware. We tested our approach on Spodoptera frugiperda Sf-RVN, a recently isolated insect cell line, to determine if it was contaminated with one or more adventitious viruses. We used Illumina reads to assemble the Sf-RVN genome and transcriptome and searched them for adventitious viral contaminants using TBLASTN with our viral protein database. We found no evidence of viral contamination, which was substantiated by the fact that our searches otherwise identified diverse sequences encoding virus-like proteins. These sequences included Maverick, R1 LINE, and errantivirus transposons, all of which are common in insect genomes. We also identified previously described as well as novel endogenous viral elements similar to ORFs encoded by diverse insect viruses. 
Our results demonstrate TBLASTN searching massively parallel sequencing (MPS) assemblies with a compact, manually curated viral protein database is more sensitive for adventitious virus detection than BLASTN, as we identified various sequences that encoded virus-like proteins, but had no similarity to viral sequences at the nucleotide level. Moreover, searches were fast without requiring high performance computing hardware. Our study also documents the enhanced biosafety profile of Sf-RVN as compared to other Sf cell lines, and supports the notion that Sf-RVN is highly suitable for the production of safe biologicals.
Archaeon and archaeal virus diversity classification via sequence entropy and fractal dimension

NASA Astrophysics Data System (ADS)

Tremberger, George, Jr.; Gallardo, Victor; Espinoza, Carola; Holden, Todd; Gadura, N.; Cheung, E.; Schneider, P.; Lieberman, D.; Cheung, T.

2010-09-01

Archaea are important potential candidates in astrobiology as their metabolism includes solar, inorganic and organic energy sources. Archaeal viruses would also be expected to be present in a sustainable archaeal exobiological community. Genetic sequence Shannon entropy and fractal dimension can be used to establish a two-dimensional measure for classification and phylogenetic study of these organisms. A sequence fractal dimension can be calculated from a numerical series consisting of the atomic numbers of each nucleotide. Archaeal 16S and 23S ribosomal RNA sequences were studied. Outliers in the 16S rRNA fractal dimension and entropy plot were found to be halophilic archaea. Positive correlation (R-square ~ 0.75, N = 18) was observed between fractal dimension and entropy across the studied species. The 16S ribosomal RNA sequence entropy correlates with the 23S ribosomal RNA sequence entropy across species with R-square 0.93, N = 18. Entropy values correspond positively with branch lengths of a published phylogeny. The studied archaeal virus sequences have high fractal dimensions of 2.02 or more. A comparison of selected extremophile sequences with archaeal sequences from the Humboldt Marine Ecosystem database (Wood-Hull Oceanography Institute, MIT) suggests the presence of continuous sequence expression as inferred from distributions of entropy and fractal dimension, consistent with the diversity expected in an exobiological archaeal community.
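The Shannon entropy that serves as one axis of the two-dimensional measure above can be illustrated with a minimal Python sketch. It assumes entropy is taken over single-nucleotide frequencies in bits per symbol; the abstract does not state the word length or logarithm base the authors used, and the function name `shannon_entropy` is hypothetical. The fractal-dimension axis is not reproduced here because the abstract does not specify the estimator.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Return the Shannon entropy (bits per symbol) of a sequence.

    Assumption: entropy is computed from single-symbol frequencies;
    the source abstract does not say which word length was used.
    """
    counts = Counter(seq.upper())
    n = sum(counts.values())
    if n == 0:
        return 0.0
    # H = -sum(p * log2(p)) over the observed symbols
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    # Four equally frequent nucleotides give the maximum of 2 bits.
    print(shannon_entropy("ACGTACGT"))
    # A homopolymer carries no information per symbol.
    print(shannon_entropy("AAAAAAAA"))
```

Under this definition a nucleotide sequence scores between 0 bits (one symbol only) and 2 bits (all four nucleotides equally frequent).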
Mitochondrial Genome Diversity of Native Americans Supports a Single Early Entry of Founder Populations into America

PubMed Central

Silva Jr., Wilson A.; Bonatto, Sandro L.; Holanda, Adriano J.; Ribeiro-dos-Santos, Andrea K.; Paixão, Beatriz M.; Goldman, Gustavo H.; Abe-Sandes, Kiyoko; Rodriguez-Delfin, Luis; Barbosa, Marcela; Paçó-Larson, Maria Luiza; Petzl-Erler, Maria Luiza; Valente, Valeria; Santos, Sidney E. B.; Zago, Marco A.

2002-01-01

There is general agreement that the Native American founder populations migrated from Asia into America through Beringia sometime during the Pleistocene, but the hypotheses concerning the ages and the number of these migrations and the size of the ancestral populations are surrounded by controversy. DNA sequence variations of several regions of the genome of Native Americans, especially in the mitochondrial DNA (mtDNA) control region, have been studied as a tool to help answer these questions. However, the small number of nucleotides studied and the nonclocklike rate of mtDNA control-region evolution impose several limitations to these results. Here we provide the sequence analysis of a continuous region of 8.8 kb of the mtDNA outside the D-loop for 40 individuals, 30 of whom are Native Americans whose mtDNA belongs to the four founder haplogroups. Haplogroups A, B, and C form monophyletic clades, but the five haplogroup D sequences have unstable positions and usually do not group together. The high degree of similarity in the nucleotide diversity and time of differentiation (i.e., ∼21,000 years before present) of these four haplogroups support a common origin for these sequences and suggest that the populations who harbor them may also have a common history. Additional evidence supports the idea that this age of differentiation coincides with the process of colonization of the New World and supports the hypothesis of a single and early entry of the ancestral Asian population into the Americas. PMID:12022039
Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio

DOE PAGES

Ahn, Anne-Catherine; Meier-Kolthoff, Jan P.; Overmars, Lex; ...

2017-03-10

Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitat is soda lakes, which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described so far. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. To these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.

Identification of novel alleles of the rice blast resistance gene Pi54

NASA Astrophysics Data System (ADS)

Vasudevan, Kumar; Gruissem, Wilhelm; Bhullar, Navreet K.

2015-10-01

Rice blast is one of the most devastating rice diseases and continuous resistance breeding is required to control the disease. The rice blast resistance gene Pi54 initially identified in an Indian cultivar confers broad-spectrum resistance in India. We explored the allelic diversity of the Pi54 gene among 885 Indian rice genotypes that were found resistant in our screening against field mixture of naturally existing M. oryzae strains as well as against five unique strains. These genotypes are also annotated as rice blast resistant in the International Rice Genebank database. Sequence-based allele mining was used to amplify and clone the Pi54 allelic variants. Nine new alleles of Pi54 were identified based on the nucleotide sequence comparison to the Pi54 reference sequence as well as to already known Pi54 alleles. DNA sequence analysis of the newly identified Pi54 alleles revealed several single polymorphic sites, three double deletions and an eight base pair deletion. A SNP-rich region was found between a tyrosine kinase phosphorylation site and the nucleotide binding site (NBS) domain. Together, the newly identified Pi54 alleles expand the allelic series and are candidates for rice blast resistance breeding programs.
Methods for decoding Cas9 protospacer adjacent motif (PAM) sequences: A brief overview.

PubMed

Karvelis, Tautvydas; Gasiunas, Giedrius; Siksnys, Virginijus

2017-05-15

Recently the Cas9, an RNA guided DNA endonuclease, emerged as a powerful tool for targeted genome manipulations. Cas9 protein can be reprogrammed to cleave, bind or nick any DNA target by simply changing crRNA sequence, however a short nucleotide sequence, termed PAM, is required to initiate crRNA hybridization to the DNA target. PAM sequence is recognized by Cas9 protein and must be determined experimentally for each Cas9 variant. Exploration of Cas9 orthologs could offer a diversity of PAM sequences and novel biochemical properties that may be beneficial for genome editing applications. Here we briefly review and compare Cas9 PAM identification assays that can be adopted for other PAM-dependent CRISPR-Cas systems. Copyright © 2017 Elsevier Inc. All rights reserved.
Siberian population of the New Stone Age: mtDNA haplotype diversity in the ancient population from the Ust'-Ida I burial ground, dated 4020-3210 BC by 14C.

PubMed

Naumova, O Yu; Rychkov, S Yu

1998-03-01

On the basis of analysis of mtDNA from skeletal remains, dated by 14C to 4020-3210 BC, from the Ust'-Ida I Neolithic burial ground in the Cis-Baikal area of Siberia, we obtained genetic characteristics of the ancient Mongoloid population. Using 7 restriction enzymes to analyze site polymorphism in the 16,106-16,545 region of mtDNA, we studied the structure of the most frequent DNA haplotypes and estimated the intrapopulational nucleotide diversity of the Neolithic population. Comparison of the Neolithic and modern indigenous populations from Siberia, Mongolia and the Urals showed that the ancient Siberian population is one of the ancestors of the modern population of Siberia. From the genetic distance, under the assumption of a constant nucleotide substitution rate, we estimated the divergence time between the Neolithic and the modern Siberian population. This divergence time (5572 years ago) is consistent with the age of the skeletal remains (5542-5652 years). Using the 14C dates of the skeletal remains, the nucleotide substitution rate in mtDNA was estimated as 1% sequence divergence per 8938-9115 years.
Genetic variations in merozoite surface antigen genes of Babesia bovis detected in Vietnamese cattle and water buffaloes.

PubMed

Yokoyama, Naoaki; Sivakumar, Thillaiampalam; Tuvshintulga, Bumduuren; Hayashida, Kyoko; Igarashi, Ikuo; Inoue, Noboru; Long, Phung Thang; Lan, Dinh Thi Bich

2015-03-01

The genes that encode merozoite surface antigens (MSAs) in Babesia bovis are genetically diverse. In this study, we analyzed the genetic diversity of B. bovis MSA-1, MSA-2b, and MSA-2c genes in Vietnamese cattle and water buffaloes. Blood DNA samples from 258 cattle and 49 water buffaloes reared in the Thua Thien Hue province of Vietnam were screened with a B. bovis-specific diagnostic PCR assay. The B. bovis-positive DNA samples (23 cattle and 16 water buffaloes) were then subjected to PCR assays to amplify the MSA-1, MSA-2b, and MSA-2c genes. Sequencing analyses showed that the Vietnamese MSA-1 and MSA-2b sequences are genetically diverse, whereas MSA-2c is relatively conserved. The nucleotide identity values for these MSA gene sequences were similar in the cattle and water buffaloes. Consistent with the sequencing data, the Vietnamese MSA-1 and MSA-2b sequences were dispersed across several clades in the corresponding phylogenetic trees, whereas the MSA-2c sequences occurred in a single clade. Cattle- and water-buffalo-derived sequences also often clustered together on the phylogenetic trees. The Vietnamese MSA-1, MSA-2b, and MSA-2c sequences were then screened for recombination with automated methods. Of the seven recombination events detected, five and two were associated with the MSA-2b and MSA-2c recombinant sequences, respectively, whereas no MSA-1 recombinants were detected among the sequences analyzed. Recombination between the sequences derived from cattle and water buffaloes was very common, and the resultant recombinant sequences were found in both host animals. These data indicate that the genetic diversity of the MSA sequences does not differ between cattle and water buffaloes in Vietnam. They also suggest that recombination between the B. bovis MSA sequences in both cattle and water buffaloes might contribute to the genetic variation in these genes in Vietnam. Copyright © 2015 Elsevier B.V. All rights reserved.
[Phylogenetic and diversity analysis of Acidithiobacillus spp. based on 16S rRNA and RubisCO genes homologues].

PubMed

Liu, Minrui; Lin, Pengwu; Qi, Xing'e; Ni, Yongqing

2016-04-14

The purpose of the study was to reveal geographic region-related Acidithiobacillus spp. distribution and allopatric speciation. Phylogenetic and diversity analysis was done to expand our knowledge of microbial phylogeography, diversity-maintaining mechanisms and molecular biogeography. We amplified the 16S rRNA gene and RubisCO genes to construct corresponding phylogenetic trees based on sequence homology and analyzed the genetic diversity of Acidithiobacillus spp. Thirty-five strains were isolated from three different regions in China (Yunnan, Hubei, Xinjiang). All isolates were classified into five groups. Four strains were identified as A. ferrivorans, six as A. ferridurans, YNTR4-15 as Leptospirillum ferrooxidans and HBDY3-31 as Leptospirillum ferrodiazotrophum. The remaining strains were identified as A. ferrooxidans. Analysis of the cbbL and cbbM gene sequences of 26 representative strains indicated that 19 strains carried two copies of the cbbL gene (cbbL1 and cbbL2) and 7 possessed only cbbL1. The cbbM gene was present as a single copy. In nucleotide-based trees, the cbbL1 gene sequences of the strains were separated into three sequence types, and cbbL2, like cbbL1, fell into three types. Codon bias of RubisCO genes was not obvious in Acidithiobacillus spp. Strains isolated from three different regions in China showed great genetic diversity in Acidithiobacillus spp., and their 16S rRNA and RubisCO gene sequences differed significantly. The phylogenetic trees based on 16S rRNA genes and on RubisCO genes differed in Acidithiobacillus spp.
CircularLogo: A lightweight web application to visualize intra-motif dependencies.

PubMed

Ye, Zhenqing; Ma, Tao; Kalmbach, Michael T; Dasari, Surendra; Kocher, Jean-Pierre A; Wang, Liguo

2017-05-22

The sequence logo has been widely used to represent DNA or RNA motifs for more than three decades. Despite its intelligibility and intuitiveness, the traditional sequence logo is unable to display the intra-motif dependencies and therefore is insufficient to fully characterize nucleotide motifs. Many methods have been developed to quantify the intra-motif dependencies, but fewer tools are available for visualization. We developed CircularLogo, a web-based interactive application, which is able to not only visualize the position-specific nucleotide consensus and diversity but also display the intra-motif dependencies. Applying CircularLogo to HNF6 binding sites and tRNA sequences demonstrated its ability to show intra-motif dependencies and intuitively reveal biomolecular structure. CircularLogo is implemented in JavaScript and Python based on the Django web framework. The program's source code and user's manual are freely available at http://circularlogo.sourceforge.net . CircularLogo web server can be accessed from http://bioinformaticstools.mayo.edu/circularlogo/index.html . CircularLogo is an innovative web application that is specifically designed to visualize and interactively explore intra-motif dependencies.
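To make the notion of an intra-motif dependency concrete, the sketch below computes the mutual information between two columns of a set of aligned binding sites. This is one commonly used dependency measure, chosen here as an assumption for illustration; the abstract does not say which statistic CircularLogo itself uses, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(sites, i, j):
    """Mutual information (bits) between columns i and j of aligned sites.

    `sites` is a list of equal-length strings. A value of 0 means the two
    positions vary independently in this sample; larger values mean that
    knowing one position helps predict the other.
    """
    n = len(sites)
    if n == 0:
        raise ValueError("need at least one aligned site")
    col_i = Counter(s[i] for s in sites)
    col_j = Counter(s[j] for s in sites)
    pairs = Counter((s[i], s[j]) for s in sites)
    mi = 0.0
    for (a, b), c in pairs.items():
        p_ab = c / n
        # Compare the joint frequency with the product of the marginals.
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


if __name__ == "__main__":
    # Positions 0 and 1 always co-vary (A with T, C with G).
    coupled = ["AT", "AT", "CG", "CG"]
    print(mutual_information(coupled, 0, 1))
    # Every combination occurs once, so the positions look independent.
    independent = ["AA", "AC", "CA", "CC"]
    print(mutual_information(independent, 0, 1))
```

A traditional sequence logo would draw the coupled and the independent example identically, since the per-position frequencies are the same in both; only a pairwise measure such as this one separates them.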
A novel flavivirus detected in two Aedes spp. collected near the demilitarized zone of the Republic of Korea.

PubMed

Korkusol, Achareeya; Takhampunya, Ratree; Hang, Jun; Jarman, Richard G; Tippayachai, Bousaraporn; Kim, Heung-Chul; Chong, Sung-Tae; Davidson, Silas A; Klein, Terry A

2017-05-01

Flaviviruses comprise a large and diverse group of positive-stranded RNA viruses, including tick-, mosquito- and unknown-vector-borne flaviviruses. A novel flavivirus was detected in pools of Aedes vexans nipponii (n=1) and Aedes esoensis (n=3) collected in 2012 and 2013 near the demilitarized zone (DMZ), Republic of Korea (ROK). Phylogenetic analyses of the NS5, E gene and complete polyprotein coding sequence (CDS) showed that the novel virus fell within the Aedes-borne flaviviruses (ABFVs), with nucleotide identity ranging from 57.8-75.1 %, 46.1-74.2 % and 51.1-76.2 %, respectively. While the novel ABFV was distant from other flaviviruses within the group, it formed a clade with Ilomantsi virus (ILOV). Sequence alignments of the partial NS5 gene, full-length E gene and polyprotein CDS between the novel virus and ILOV showed approximately 76.2 % nucleotide identity and 90 % amino acid identity, respectively. The ABFV identified in Aedes mosquitoes from the ROK is a novel ABFV based on the sequence analyses and is designated as Panmunjeom flavivirus (PANFV).
The chicken genome: some good news and some bad news.

PubMed

Dodgson, J B

2007-07-01

The sequencing of the chicken genome has generated a wealth of good news for poultry science. It allows the chicken to be a major player in 21st century biology by providing an entrée into an arsenal of new technologies that can be used to explore virtually any chicken phenotype of interest. The initial technological onslaught has been described in this symposium. The wealth of data available now or soon to be available cannot be explained by simplistic models and will force us to treat the inherent complexity of the chicken in ways that are more realistic but at the same time more difficult to comprehend. Initial single nucleotide polymorphism analyses suggest that broilers retain a remarkable amount of the genetic diversity of predomesticated Jungle Fowl, whereas commercial layer genomes display less diversity and broader linkage disequilibrium. Thus, intensive commercial selection has not fixed a genome rich in wide selective sweeps, at least within the broiler population. Rather, a complex assortment of combinations of ancient allelic diversity survives. Low levels of linkage disequilibrium will make association analysis in broilers more difficult. The wider disequilibrium observed in layers should facilitate the mapping of quantitative trait loci, and at the same time make it more difficult to identify the causative nucleotide change(s). In addition, many quantitative traits may be specific to the genetic background in which they arose and not readily transferable to, or detectable in, other line backgrounds. Despite the obstacles it presents, the genetic complexity of the chicken may also be viewed as good news because it insures that long-term genetic progress will continue via breeding using quantitative genetics, and it surely will keep poultry scientists busy for decades to come. 
It is now time to move from an emphasis on obtaining "THE" chicken genome sequence to obtaining multiple sequences, especially of foundation stocks, and a broader understanding of the full genetic and phenotypic diversity of the domesticated chicken.
Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

DOEpatents

McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Motin, Vladimir L [League City, TX

2009-02-24

Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Nucleotide sequences specific to Brucella and methods for the detection of Brucella

DOE Office of Scientific and Technical Information (OSTI.GOV)

McCready, Paula M; Radnedge, Lyndsay; Andersen, Gary L

Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Nucleotide diversity at two phytochrome loci along a latitudinal cline in Pinus sylvestris.

PubMed

García-Gil, M R; Mikkonen, M; Savolainen, O

2003-05-01

Forest tree species provide many examples of well-studied adaptive differentiation, where the search for the underlying genes might be possible. In earlier studies and in our common conditions in a greenhouse, northern populations set bud earlier than southern ones. A difference in latitude of origin of one degree corresponded to a change of 1.4 days in number of days to terminal bud set of seedlings. Earlier physiological and ecological genetics work in conifers and other plants has suggested that such variation could be governed by phytochromes. Nucleotide variation was examined at two phytochrome loci (PHYP and PHYO, homologues of the Arabidopsis thaliana PHYB and PHYA, respectively) in three populations: northern Finland, southern Finland and northern Spain. In our samples of 12-15 sequences (2980 and 1156 base pairs at the two loci) we found very low nonsynonymous variation; pi was 0.0003 and 0.0002 at the PHYP and PHYO loci, respectively. There was no functional differentiation between populations at the photosensory domains of either locus. The overall silent variation was also low, only 0.0024 for the PHYP locus. The low estimates of silent variation are consistent with the estimated low synonymous substitution rates between Pinus sylvestris and Picea abies at the PHYO locus. Despite the low level of nucleotide variation, haplotypic diversity was relatively high (0.42 and 0.41 for fragments of 1156 nucleotides) at the two loci.
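The statistic pi reported above is nucleotide diversity, the average number of differences per site between two sequences drawn from the sample. A minimal Python sketch follows, assuming gap-free aligned sequences of equal length and counting every site alike. The study reports pi separately for nonsynonymous and silent sites, which would require classifying sites first; that step is not shown, and the function name `nucleotide_diversity` is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned sequences.

    Assumptions: all sequences have the same length and contain no gaps
    or ambiguity codes; every site is counted alike.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and non-empty")
    # Sum mismatches over all unordered pairs of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    sample = ["ACGT", "ACGA", "ACGT"]
    # Pairwise differences are 1, 0 and 1 over 3 pairs and 4 sites.
    print(nucleotide_diversity(sample))
```

On this scale, a value such as 0.0003 corresponds to roughly one difference per 3,300 compared sites between two randomly chosen sequences.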
Genetic Diversity of Hepatitis A Virus in China: VP3-VP1-2A Genes and Evidence of Quasispecies Distribution in the Isolates

PubMed Central

Cao, Jingyuan; Zhou, Wenting; Yi, Yao; Jia, Zhiyuan; Bi, Shengli

2013-01-01

Hepatitis A virus (HAV) is the most common cause of infectious hepatitis throughout the world, spread largely by the fecal-oral route. To characterize the genetic diversity of the virus circulating in China, where HAV is endemic, we selected the outbreak cases with identical sequences in the VP1-2A junction region and compiled a panel of 42 isolates. The VP3-VP1-2A regions of the HAV capsid-coding genes were further sequenced and analyzed. The quasispecies distribution was evaluated by cloning the VP3 and VP1-2A genes in three clinical samples. Phylogenetic analysis demonstrated that the same genotyping results could be obtained whether using the complete VP3, VP1, or partial VP1-2A genes for analysis in this study, although some differences did exist. Most isolates clustered in sub-genotype IA, and fewer in sub-genotype IB. No amino acid mutations were found at the published neutralizing epitope sites; however, several unique amino acid substitutions in the VP3 or VP1 region were identified, with two amino acid variants located close to the immunodominant site. Quasispecies analysis showed the mutation frequencies were in the range of 7.22×10^-4 to 2.33×10^-3 substitutions per nucleotide for VP3, VP1, or VP1-2A. When compared with the consensus sequences, mutated nucleotide sites represented the minority of all the analyzed sequence sites. HAV replicated as a complex distribution of closely genetically related variants referred to as quasispecies, and was under negative selection. The results indicate that diverse HAV strains and quasispecies inside the viral populations are present in China, with unique amino acid substitutions detected close to the immunodominant site, and that the possibility of antigenic escape mutants cannot be ruled out and needs to be further analyzed. PMID:24069343
Receptor-like genes in the major resistance locus of lettuce are subject to divergent selection.

PubMed Central

Meyers, B C; Shen, K A; Rohani, P; Gaut, B S; Michelmore, R W

1998-01-01

Disease resistance genes in plants are often found in complex multigene families. The largest known cluster of disease resistance specificities in lettuce contains the RGC2 family of genes. We compared the sequences of nine full-length genomic copies of RGC2 representing the diversity in the cluster to determine the structure of genes within this family and to examine the evolution of its members. The transcribed regions range from at least 7.0 to 13.1 kb, and the cDNAs contain deduced open reading frames of approximately 5.5 kb. The predicted RGC2 proteins contain a nucleotide binding site and irregular leucine-rich repeats (LRRs) that are characteristic of resistance genes cloned from other species. Unique features of the RGC2 gene products include a bipartite LRR region with >40 repeats. At least eight members of this family are transcribed. The level of sequence diversity between family members varied in different regions of the gene. The ratio of nonsynonymous (Ka) to synonymous (Ks) nucleotide substitutions was lowest in the region encoding the nucleotide binding site, which is the presumed effector domain of the protein. The LRR-encoding region showed an alternating pattern of conservation and hypervariability. This alternating pattern of variation was also found in all comparisons within families of resistance genes cloned from other species. The Ka/Ks ratios indicate that diversifying selection has resulted in increased variation at these codons. The patterns of variation support the predicted structure of LRR regions with solvent-exposed hypervariable residues that are potentially involved in binding pathogen-derived ligands. PMID:9811792
Genetic structure and genealogy in the Sphagnum subsecundum complex (Sphagnaceae: Bryophyta).

PubMed

Shaw, A J; Pokorny, L; Shaw, B; Ricca, M; Boles, S; Szövényi, P

2008-10-01

Allopolyploidy is probably the most extensively studied mode of plant speciation and allopolyploid species appear to be common in the mosses (Bryophyta). The Sphagnum subsecundum complex includes species known to be gametophytically haploid or diploid, and it has been proposed that the diploids (i.e., with tetraploid sporophytes) are allopolyploids. Nucleotide sequence and microsatellite variation among haploids and diploids from Newfoundland and Scandinavia indicate that (1) the diploids exhibit fixed or nearly fixed heterozygosity at the majority of loci sampled, and are clearly allopolyploids, (2) diploids originated independently in North America and Europe, (3) the European diploids appear to have the haploid species, S. subsecundum, as the maternal parent based on shared chloroplast DNA haplotypes, (4) the North American diploids do not have the chloroplast DNA of any sampled haploid, (5) both North American and European diploids share nucleotide and microsatellite similarities with S. subsecundum, (6) the diploids harbor more nucleotide and microsatellite diversity than the haploids, and (7) diploids exhibit higher levels of linkage disequilibrium among microsatellite loci. An experiment demonstrates significant artifactual recombination between interspecific DNAs coamplified by PCR, which may be a complicating factor in the interpretation of sequence-based analyses of allopolyploids.
Sequence polymorphism data of the hypervariable regions of mitochondrial DNA in the Yadav population of Haryana.

PubMed

Verma, Kapil; Sharma, Sapna; Sharma, Arun; Dalal, Jyoti; Bhardwaj, Tapeshwar

2018-06-01

Genetic variations among humans occur both within and among populations and range from single nucleotide changes to multiple-nucleotide variants. These multiple-nucleotide variants are useful for studying the relationships among individuals or various population groups. The study of human genetic variations can help scientists understand how different population groups are biologically related to one another. Sequence analysis of hypervariable regions of human mitochondrial DNA (mtDNA) has been successfully used for the genetic characterization of different population groups for forensic purposes. It is well established that different ethnic or population groups differ significantly in their mtDNA distributions. In the last decade, very little research has been conducted on mtDNA variations in the Indian population, although such data would be useful for elucidating the history of human population expansion across the world. Moreover, forensic studies on mtDNA variations in the Indian subcontinent are also scarce, particularly in the northern part of India. In this report, variations in the hypervariable regions of mtDNA were analyzed in the Yadav population of Haryana. Different molecular diversity indices were computed. Further, the obtained haplotypes were classified into different haplogroups and the phylogenetic relationship between different haplogroups was inferred.
Genetic diversity analysis of the oriental river prawn (Macrobrachium nipponense) in Huaihe River.

PubMed

Cui, Feng; Yu, Yanyan; Bao, Fangyin; Wang, Song; Xiao, Ming Song

2018-04-19

The oriental river prawn (Macrobrachium nipponense) is an economically and nutritionally important species of decapod crustaceans in China. Genetic structure and demographic history of Macrobrachium nipponense were examined using sequence data from portions of the mitochondrial DNA cytochrome oxidase subunit I (COI) gene. Samples of 191 individuals were collected from 10 localities in the upper to middle reaches of the Huaihe River. Variability was detected at a total of 42 nucleotide sites along 684 bp length of homologous sequence (6.14%), and base substitutions occurred mostly at the second codon position. Haplotype diversity (h) and nucleotide diversity (π) of all populations were 0.9136 ± 0.0116 and 0.0078 ± 0.0042, respectively. Phylogenetic tree constructed using the maximum-likelihood (ML) method showed that the 44 haplotypes were assigned to two obvious clades associated with geographic regions. Moreover, the median-joining network was similar to the topology of the phylogenetic tree with 44 haplotypes. The pairwise FST values between the populations varied from -0.0298 to 0.2994. Generally, moderate genetic differentiation (FST = 0.1598, p = .0000) among different geographic populations was detected, with the significant differentiation between the Huaibin (HB) and other Macrobrachium nipponense populations. Both mismatch distribution analyses and neutrality tests suggested the early stage of Late Pleistocene population expansion 85,500 years before present for the species, which was consistent with the palaeoclimatic condition of the Huaihe River Basin.
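Two of the summary figures above are simple enough to sketch in Python. The first function reproduces the proportion of variable sites from the reported counts (42 of 684 bp). The second computes haplotype diversity with the commonly used unbiased estimator h = n/(n-1) × (1 - Σp²); that the authors used exactly this estimator is an assumption, since the abstract does not name it, and both function names are hypothetical. The haplotype labels in the usage example are toy data, not values from the study.

```python
from collections import Counter


def variable_site_percent(n_variable: int, n_sites: int) -> float:
    """Percentage of aligned sites that are variable."""
    return 100.0 * n_variable / n_sites


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity h = n/(n-1) * (1 - sum(p_i**2)).

    `haplotypes` holds one label per sampled individual. Assumption: this
    standard estimator is the one behind the reported h; the abstract
    does not state which estimator was used.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


if __name__ == "__main__":
    # Counts taken from the abstract: 42 variable sites in 684 bp.
    print(round(variable_site_percent(42, 684), 2))
    # Toy sample: two haplotypes at equal frequency among four individuals.
    print(haplotype_diversity(["H1", "H1", "H2", "H2"]))
```

With this estimator h approaches 1 when nearly every individual carries a distinct haplotype, which is the pattern a value of 0.9136 with 44 haplotypes among 191 individuals points to.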
High Genetic Diversity Revealed by Variable-Number Tandem Repeat Genotyping and Analysis of hsp65 Gene Polymorphism in a Large Collection of “Mycobacterium canettii” Strains Indicates that the M. tuberculosis Complex Is a Recently Emerged Clone of “M. canettii”

PubMed Central

Fabre, Michel; Koeck, Jean-Louis; Le Flèche, Philippe; Simon, Fabrice; Hervé, Vincent; Vergnaud, Gilles; Pourcel, Christine

2004-01-01

We have analyzed, using complementary molecular methods, the diversity of 43 strains of “Mycobacterium canettii” originating from the Republic of Djibouti, on the Horn of Africa, from 1998 to 2003. Genotyping by multiple-locus variable-number tandem repeat analysis shows that all the strains belong to a single but very distant group when compared to strains of the Mycobacterium tuberculosis complex (MTBC). Thirty-one strains cluster into one large group with little variability and five strains form another group, whereas the other seven are more diverged. In total, 14 genotypes are observed. The DR locus analysis reveals additional variability, some strains being devoid of a direct repeat locus and others having unique spacers. The hsp65 gene polymorphism was investigated by restriction enzyme analysis and sequencing of PCR amplicons. Four new single nucleotide polymorphisms were discovered. One strain was characterized by three nucleotide changes in 441 bp, creating new restriction enzyme polymorphisms. As no sequence variability was found for hsp65 in the whole MTBC, and as a single point mutation separates M. tuberculosis from the closest “M. canettii” strains, this diversity within “M. canettii” subspecies strongly suggests that it is the most probable source species of the MTBC rather than just another branch of the MTBC. PMID:15243089
Genotypic diversity of stress response in Lactobacillus plantarum, Lactobacillus paraplantarum and Lactobacillus pentosus.

PubMed

Ricciardi, Annamaria; Parente, Eugenio; Guidone, Angela; Ianniello, Rocco Gerardo; Zotta, Teresa; Abu Sayem, S M; Varcamonti, Mario

2012-07-02

Lactobacillus plantarum, Lactobacillus pentosus and Lactobacillus paraplantarum are three closely related species which are widespread in food and non-food environments, and are important as starter bacteria or probiotics. In order to evaluate the phenotypic diversity of stress tolerance in the L. plantarum group and the ability to mount an adaptive heat shock response, the survival of exponential and stationary phase and of heat adapted exponential phase cells of six L. plantarum subsp. plantarum, one L. plantarum subsp. argentoratensis, one L. pentosus and two L. paraplantarum strains selected in a previous work upon exposure to oxidative, heat, detergent, starvation and acid stresses was compared to that of the L. plantarum WCFS1 strain. Furthermore, to evaluate the genotypic diversity in stress response genes, ten genes (encoding chaperones DnaK, GroES and GroEL, regulators CtsR, HrcA and CcpA, ATPases/proteases ClpL, ClpP, ClpX and protease FtsH) were amplified using primers derived from the WCFS1 genome sequence and submitted to restriction with one or two endonucleases. The results were compared by univariate and multivariate statistical methods. In addition, the amplicons for hrcA and ctsR were sequenced and compared by multiple sequence alignment and polymorphism analysis. Although there was evidence of a generalized stress response in the stationary phase, with increase of oxidative, heat, and, to a lesser extent, starvation stress tolerance, and for adaptive heat stress response, with increased tolerance to heat, acid and detergent, different growth phases and adaptation patterns were found. Principal component analysis showed that while heat, acid and detergent stresses respond similarly to growth phase and adaptation, tolerance to oxidative and starvation stresses implies completely unrelated mechanisms. A dendrogram obtained using the data from multilocus restriction typing (MLRT) of stress response genes clearly separated two groups of L. plantarum strains from the other species but there was no correlation between genotypic grouping and grouping obtained on the basis of the stress response pattern, nor with the phylograms obtained from hrcA and ctsR sequences. Differences in sequence in L. plantarum strains were mostly due to single nucleotide polymorphisms with a high frequency of synonymous nucleotide changes and, while hrcA was characterized by an excess of low frequency polymorphism, very low diversity was found in ctsR sequences. Sequence alignment of hrcA allowed a correct discrimination of the strains at the species level, thus confirming the relevance of stress response genes for taxonomy. Copyright © 2012 Elsevier B.V. All rights reserved.
Enabling multiplexed testing of pooled donor cells through whole-genome sequencing.

PubMed

Chan, Yingleong; Chan, Ying Kai; Goodman, Daniel B; Guo, Xiaoge; Chavez, Alejandro; Lim, Elaine T; Church, George M

2018-04-19

We describe a method that enables the multiplex screening of a pool of many different donor cell lines. Our method accurately predicts each donor proportion from the pool without requiring the use of unique DNA barcodes as markers of donor identity. Instead, we take advantage of common single nucleotide polymorphisms, whole-genome sequencing, and an algorithm to calculate the proportions from the sequencing data. By testing using simulated and real data, we showed that our method robustly predicts the individual proportions from a mixed-pool of numerous donors, thus enabling the multiplexed testing of diverse donor cells en masse. More information is available at https://pgpresearch.med.harvard.edu/poolseq/.
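The abstract does not say which algorithm the authors use, so the following Python sketch should be read only as one generic way such proportions could be estimated, not as their method. It assumes that each donor's genotype (alternate-allele dosage 0, 1 or 2) is known at a set of SNPs and that pooled reads are sampled in proportion to donor abundance, and it fits the mixing proportions with a textbook expectation-maximization (EM) update. All names, the probability floor, and the toy counts are hypothetical.

```python
def estimate_proportions(genotypes, alt_counts, ref_counts, n_iter=500):
    """EM estimate of donor mixing proportions from pooled allele counts.

    genotypes[k][j]  : alt-allele dosage (0, 1 or 2) of donor k at SNP j
    alt_counts[j]    : pooled reads carrying the alt allele at SNP j
    ref_counts[j]    : pooled reads carrying the ref allele at SNP j
    """
    n_donors = len(genotypes)
    floor = 1e-6  # hypothetical floor: keeps every allele probability strictly in (0, 1)
    # q[k][j]: probability that a read drawn from donor k shows the alt allele
    q = [[min(max(g / 2, floor), 1 - floor) for g in row] for row in genotypes]
    w = [1.0 / n_donors] * n_donors  # start from equal proportions
    total_reads = sum(alt_counts) + sum(ref_counts)
    for _ in range(n_iter):
        new_w = [0.0] * n_donors
        for j, (alt, ref) in enumerate(zip(alt_counts, ref_counts)):
            p_alt = sum(w[k] * q[k][j] for k in range(n_donors))
            for k in range(n_donors):
                # expected number of alt and ref reads attributed to donor k
                new_w[k] += alt * w[k] * q[k][j] / p_alt
                new_w[k] += ref * w[k] * (1 - q[k][j]) / (1 - p_alt)
        w = [x / total_reads for x in new_w]
    return w


# Hypothetical toy data: 3 donors, 6 SNPs, 1000 reads per SNP.
# Counts were constructed from true proportions 0.5, 0.3 and 0.2.
genotypes = [
    [0, 2, 1, 0, 2, 1],
    [2, 0, 1, 1, 0, 2],
    [1, 1, 0, 2, 2, 0],
]
alt_counts = [400, 600, 400, 350, 700, 550]
ref_counts = [600, 400, 600, 650, 300, 450]
proportions = estimate_proportions(genotypes, alt_counts, ref_counts)
```

Because the toy counts are noise-free, the estimates should land close to the proportions used to build them; with real sequencing data, read errors and uneven coverage would have to be modelled as well.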
Genetic diversity of ORF3 and spike genes of porcine epidemic diarrhea virus in Thailand.

PubMed

Temeeyasen, Gun; Srijangwad, Anchalee; Tripipat, Thitima; Tipsombatboon, Pavita; Piriyapongsa, Jittima; Phoolcharoen, Waranyoo; Chuanasa, Taksina; Tantituvanont, Angkana; Nilubol, Dachrit

2014-01-01

Porcine epidemic diarrhea virus (PEDV) has become endemic in the Thai swine industry, causing economic losses and repeated outbreaks since its first emergence in 2007. In the present study, 69 Thai PEDV isolates were obtained from 50 swine herds across Thailand during the period 2008-2012. Both partial and complete nucleotide sequences of the spike (S) glycoprotein and the nucleotide sequences of ORF3 genes were determined to investigate the genetic diversity and molecular epidemiology of Thai PEDV. Based on the analysis of the partial S glycoprotein genes, the Thai PEDV isolates were clustered into 2 groups related to Korean and Chinese field isolates. The results for the complete spike genes, however, demonstrated that both groups were grouped in the same cluster. Interestingly, both groups of Thai PEDV isolates had a 4-aa (GENQ) insertion between positions 55 and 56, a 1-aa insertion between positions 135 and 136, and a 2-aa deletion between positions 155 and 156, making them identical to the Korean KNU series and isolates responsible for outbreaks in China in recent years. In addition to the complete S sequences, the ORF3 gene analyses suggested that the isolates responsible for outbreaks in Thailand are not vaccine related. The results of this study suggest that the PEDV isolates responsible for outbreaks in Thailand since its emergence represent a variant of PEDV that was previously reported in China and Korea. Copyright © 2013 Elsevier B.V. All rights reserved.

Geographic distribution of hepatitis C virus genotype 6 subtypes in Thailand.

PubMed

Akkarathamrongsin, Srunthron; Praianantathavorn, Kesmanee; Hacharoen, Nisachol; Theamboonlers, Apiradee; Tangkijvanich, Pisit; Tanaka, Yasuhito; Mizokami, Masashi; Poovorawan, Yong

2010-02-01

The nucleotide sequence of hepatitis C virus (HCV) genotype 6, found mostly in south China and south-east Asia, displays profound genetic diversity. The aim of this study was to determine the genetic variability of HCV genotype 6 (HCV-6) in Thailand and locate the subtype distribution of genotype 6 in various geographic areas. Four hundred nineteen anti-HCV positive serum samples were collected from patients residing in the central part of the country. HCV RNA positive samples based on reverse transcriptase-polymerase chain reaction (RT-PCR) of the 5'UTR were amplified with primers specific for the core and NS5B regions. Nucleotide sequences of both regions were analyzed for the genotype by phylogenetic analysis. To determine geographic distribution of HCV-6 subtypes, a search of the international database on subtype distribution in the respective countries was conducted. Among 375 HCV RNA positive samples, 71 had HCV-6 based on phylogenetic analysis of partial core and NS5B regions. The subtype distribution in order of predominance was 6f (56%), 6n (22%), 6i (11%), 6j (10%), and 6e (1%). Among the 13 countries with different subtypes of HCV-6, most sequences have been reported from Vietnam. Subtype 6f was found exclusively in Thailand where five distinct HCV-6 subtypes are circulating. HCV-6, which is endemic in south China and south-east Asia, displays profound genetic diversity and may have evolved over a considerable period of time. (c) 2009 Wiley-Liss, Inc.
Characterization of a mini core collection of Japanese wheat varieties using single-nucleotide polymorphisms generated by genotyping-by-sequencing.

PubMed

Kobayashi, Fuminori; Tanaka, Tsuyoshi; Kanamori, Hiroyuki; Wu, Jianzhong; Katayose, Yuichi; Handa, Hirokazu

2016-03-01

A core collection of Japanese wheat varieties (JWC) consisting of 96 accessions was established based on their passport data and breeding pedigrees. To clarify the molecular basis of the JWC collection, genome-wide single-nucleotide polymorphism (SNP) genotyping was performed using the genotyping-by-sequencing (GBS) approach. Phylogenetic tree and population structure analyses using these SNP data revealed the genetic diversity and relationships among the JWC accessions, classifying them into four groups; "varieties in the Hokkaido area", "modern varieties in the northeast part of Japan", "modern varieties in the southwest part of Japan" and "classical varieties including landraces". This clustering closely reflected the history of wheat breeding in Japan. Furthermore, to demonstrate the utility of the JWC collection, we performed a genome-wide association study (GWAS) for three traits, namely, "days to heading in autumn sowing", "days to heading in spring sowing" and "culm length". We found significantly associated SNP markers with each trait, and some of these were closely linked to known major genes for heading date or culm length on the genetic map. Our study indicates that this JWC collection is a useful set of germplasm for basic and applied research aimed at understanding and utilizing the genetic diversity among Japanese wheat varieties.
Molecular evolution of type 2 porcine reproductive and respiratory syndrome viruses circulating in Vietnam from 2007 to 2015.

PubMed

Do, Hai Quynh; Trinh, Dinh Thau; Nguyen, Thi Lan; Vu, Thi Thu Hang; Than, Duc Duong; Van Lo, Thi; Yeom, Minjoo; Song, Daesub; Choe, SeEun; An, Dong-Jun; Le, Van Phan

2016-11-17

Porcine reproductive and respiratory syndrome (PRRS) virus is one of the most economically significant pathogens in the Vietnamese swine industry. ORF5, which participates in many functional processes, including virion assembly, entry of the virus into the host cell, and viral adaptation to the host immune response, has been widely used in molecular evolution and phylogeny studies. Knowledge of the molecular evolution of PRRSV field strains might contribute to PRRS control in Vietnam. Phylogenetic analysis indicated that all strains belonged to sub-lineages 8.7 and 5.1. The nucleotide and amino acid identities between strains were 84.5-100% and 82-100%, respectively. Furthermore, the results revealed differences in nucleotide and amino acid identities between the 2 sub-lineage groups. N-glycosylation prediction identified 7 potential N-glycosylation sites and 11 glycotypes. Analyses of the GP5 sequences revealed 7 sites under positive selective pressure and 25 under negative selective pressure. Phylogenetic analysis based on ORF5 sequences indicated the diversity of PRRSV in Vietnam. Furthermore, variation in N-glycosylation sites and in positions under selective pressure was demonstrated. This study expands existing knowledge on the genetic diversity and evolution of PRRSV in Vietnam and supports effective strategies for PRRS vaccine development in Vietnam.
Development and Characterization of 1,906 EST-SSR Markers from Unigenes in Jute (Corchorus spp.)

PubMed Central

Zhang, Liwu; Li, Yanru; Tao, Aifen; Fang, Pingping; Qi, Jianmin

2015-01-01

Jute, comprising white and dark jute, is the second most important natural fiber crop after cotton worldwide. However, the lack of expressed sequence tag-derived simple sequence repeat (EST-SSR) markers has resulted in a large gap in the improvement of jute. Previously, 48,914 unigenes from white jute were assembled de novo. In this study, 1,906 EST-SSRs were identified from these assembled unigenes. Among these markers, di-, tri- and tetra-nucleotide repeat types were the most abundant (12.0%, 56.9% and 21.6%, respectively). AG-rich or GA-rich nucleotide repeats were predominant. Subsequently, a sample of 116 SSRs, located in genes encoding transcription factors and cellulose synthases, were selected to survey polymorphisms among 12 diverse jute accessions. Of these, 83.6% successfully amplified at least one fragment and detected polymorphism among the 12 diverse genotypes, indicating that the newly developed SSRs are of good quality. Furthermore, the genetic similarity coefficients of all the 12 accessions were evaluated using 97 polymorphic SSRs. The cluster analysis divided the jute accessions into two main groups with a genetic similarity coefficient of 0.61. These EST-SSR markers not only enrich the molecular markers available for the jute genome, but also facilitate genetic and genomic research in jute. PMID:26512891
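To make the idea of identifying di-, tri- and tetra-nucleotide SSRs in assembled sequences concrete, here is a minimal Python sketch based on regular expressions. It is not the pipeline used in the study: the function name, the minimum repeat thresholds, and the example sequence are all hypothetical choices for illustration.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for simple sequence repeats in seq.

    min_repeats maps motif length to the minimum number of tandem copies;
    the defaults below are hypothetical thresholds, not those of the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # group 2 is the motif; \2{n,} requires at least n further tandem copies
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # skip motifs that are themselves repeats of a shorter unit (AAA, AGAG)
            if any(
                motif == motif[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return hits


# Hypothetical sequence containing an (AG)7 and a (GAT)5 repeat.
example = "TTGC" + "AG" * 7 + "CCTA" + "GAT" * 5 + "CG"
ssrs = find_ssrs(example)
```

Filtering out motifs that reduce to a shorter unit keeps a run such as AGAGAGAG from being reported both as a dinucleotide and as a tetranucleotide repeat.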
A-to-I RNA Editing Contributes to Proteomic Diversity in Cancer.

PubMed

Peng, Xinxin; Xu, Xiaoyan; Wang, Yumeng; Hawke, David H; Yu, Shuangxing; Han, Leng; Zhou, Zhicheng; Mojumdar, Kamalika; Jeong, Kang Jin; Labrie, Marilyne; Tsang, Yiu Huen; Zhang, Minying; Lu, Yiling; Hwu, Patrick; Scott, Kenneth L; Liang, Han; Mills, Gordon B

2018-05-14

Adenosine (A) to inosine (I) RNA editing introduces many nucleotide changes in cancer transcriptomes. However, due to the complexity of post-transcriptional regulation, the contribution of RNA editing to proteomic diversity in human cancers remains unclear. Here, we performed an integrated analysis of TCGA genomic data and CPTAC proteomic data. Despite limited site diversity, we demonstrate that A-to-I RNA editing contributes to proteomic diversity in breast cancer through changes in amino acid sequences. We validate the presence of editing events at both RNA and protein levels. The edited COPA protein increases proliferation, migration, and invasion of cancer cells in vitro. Our study suggests an important contribution of A-to-I RNA editing to protein diversity in cancer and highlights its translational potential. Copyright © 2018 Elsevier Inc. All rights reserved.
A genetic variation map for chicken with 2.8 million single nucleotide polymorphisms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wong, G K; Hillier, L; Brandstrom, M

2005-02-20

We describe a genetic variation map for the chicken genome containing 2.8 million single nucleotide polymorphisms (SNPs), based on a comparison of the sequences of 3 domestic chickens (broiler, layer, Silkie) to their wild ancestor Red Jungle Fowl (RJF). Subsequent experiments indicate that at least 90% are true SNPs, and at least 70% are common SNPs that segregate in many domestic breeds. Mean nucleotide diversity is about 5 SNP/kb for almost every possible comparison between RJF and domestic lines, between two different domestic lines, and within domestic lines, contrary to the idea that domestic animals are highly inbred relative to their wild ancestors. In fact, most of the SNPs originated prior to domestication, and there is little to no evidence of selective sweeps for adaptive alleles on length scales of greater than 100 kb.
Complete nucleotide sequence of a monopartite Begomovirus and associated satellites infecting Carica papaya in Nepal.

PubMed

Shahid, M S; Yoshida, S; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

2013-06-01

Carica papaya (papaya) is a fruit crop that is cultivated mostly in kitchen gardens throughout Nepal. Leaf samples of C. papaya plants with leaf curling, vein darkening, vein thickening, and a reduction in leaf size were collected from a garden in Darai village, Rampur, Nepal in 2010. Full-length clones of a monopartite Begomovirus, a betasatellite and an alphasatellite were isolated. The complete nucleotide sequence of the Begomovirus showed the arrangement of genes typical of Old World begomoviruses with the highest nucleotide sequence identity (>99 %) to an isolate of Ageratum yellow vein virus (AYVV), confirming it as an isolate of AYVV. The complete nucleotide sequence of the betasatellite showed greater than 89 % nucleotide sequence identity to an isolate of Tomato leaf curl Java betasatellite originating from Indonesia. The sequence of the alphasatellite displayed 92 % nucleotide sequence identity to Sida yellow vein China alphasatellite. This is the first identification of these components in Nepal and the first time they have been identified in papaya.
Target Site Recognition by a Diversity-Generating Retroelement

PubMed Central

Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.

2011-01-01

Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
AgdbNet – antigen sequence database software for bacterial typing

PubMed Central

Jolley, Keith A; Maiden, Martin CJ

2006-01-01

Background: Bacterial typing schemes based on the sequences of genes encoding surface antigens require databases that provide a uniform, curated, and widely accepted nomenclature of the variants identified. Due to the differences in typing schemes, imposed by the diversity of genes targeted, creating these databases has typically required the writing of one-off code to link the database to a web interface. Here we describe agdbNet, widely applicable web database software that facilitates simultaneous BLAST querying of multiple loci using either nucleotide or peptide sequences. Results: Databases are described by XML files that are parsed by a Perl CGI script. Each database can have any number of loci, which may be defined by nucleotide and/or peptide sequences. The software is currently in use on at least five public databases for the typing of Neisseria meningitidis, Campylobacter jejuni and Streptococcus equi and can be set up to query internal isolate tables or suitably-configured external isolate databases, such as those used for multilocus sequence typing. The style of the resulting website can be fully configured by modifying stylesheets and through the use of customised header and footer files that surround the output of the script. Conclusion: The software provides a rapid means of setting up customised Internet antigen sequence databases. The flexible configuration options enable typing schemes with differing requirements to be accommodated. PMID:16790057
mtDNA variation of the critically endangered hawksbill turtle (Eretmochelys imbricata) nesting on Iranian islands of the Persian Gulf.

PubMed

Tabib, M; Zolgharnein, H; Mohammadi, M; Salari-Aliabadi, M A; Qasemi, A; Roshani, S; Rajabi-Maham, H; Frootan, F

2011-01-01

Genetic diversity of sea turtles (hawksbill turtle) was studied using sequencing of mitochondrial DNA (mtDNA, D-loop region). Thirty dead embryos were collected from the Kish and Qeshm Islands in the Persian Gulf. Analysis of sequence variation over 890 bp of the mtDNA control region revealed five haplotypes among 30 individuals. This is the first time that Iranian haplotypes have been recorded. Haplotype and nucleotide diversity were 0.77 and 0.001 for Qeshm Island and 0.64 and 0.002 for Kish Island, respectively. Total haplotype diversity was calculated as 0.69, which demonstrates low genetic diversity in this area. The data also indicated very high rates of migration between the populations of these two islands. A comparison of our data with data from previous studies downloaded from a gene bank showed that turtles of the Persian Gulf migrated from the Pacific and the Sea of Oman into this area. On the other hand, evidence of migration from populations to the West was not found.
Molecular diversity of Rice grassy stunt virus in Vietnam.

PubMed

Ta, Hoang-Anh; Nguyen, Doan-Phuong; Causse, Sandrine; Nguyen, Thanh-Duc; Ngo, Vinh-Vien; Hébrard, Eugénie

2013-04-01

Rice grassy stunt virus (RGSV, Tenuivirus) recently emerged on rice in Vietnam, causing high yield losses during 2006-2009. The genetic diversity of RGSV is poorly documented. In this study, the two genes encoded by each ambisense segment RNA3 and RNA5 of RGSV isolates from six provinces of South Vietnam were sequenced. P3 and Pc3 (RNA3) have unknown function, P5 (RNA5) encodes the putative silencing suppressor, and Pc5 (RNA5) encodes the nucleocapsid protein (N). The sequences of 17 Vietnamese isolates were compared with reference isolates from North and South Philippines. The average nucleotide diversity among the isolates was low. We confirmed a higher variability of RNA3 than RNA5 and of Pc3 than P3. No relationships between the genetic diversity and the geographic distribution of RGSV isolates could be ascertained, likely because of the long-distance migration of the insect vector. These data will contribute to a better understanding of RGSV epidemiology in South Vietnam, a prerequisite for further management of the disease and rice breeding for resistance.
Nucleotide variation in genes involved in wood formation in two pine species

Treesearch

David Pot; Lisa McMillan; Craig Echt; Gregoire Le Provost; Pauline Garnier-Gere; Sheree Cato; Christophe Plomion

2005-01-01

Nucleotide diversity in eight genes related to wood formation was investigated in two pine species, Pinus pinaster and P. radiata. The nucleotide diversity patterns observed and their properties were compared between the two species according to the specific characteristics of the samples analysed. A lower diversity was observed in P. radiata...
Nucleotide sequences encoding a thermostable alkaline protease

DOEpatents

Wilson, David B.; Lao, Guifang

1998-01-01

Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.
Diversity of thermophilic fungi in Tengchong Rehai National Park revealed by ITS nucleotide sequence analyses.

PubMed

Pan, Wen-Zheng; Huang, Xiao-Wei; Wei, Kang-Bi; Zhang, Chun-Mei; Yang, Dong-Mei; Ding, Jun-Mei; Zhang, Ke-Qin

2010-04-01

The geothermal sites near neutral and alkalescent thermal springs in Tengchong Rehai National Park were examined through a cultivation-dependent approach to determine the diversity of thermophilic fungi in these environments. Here, we collected soil samples in this area, plated them on agar media conducive to fungal growth, obtained pure cultures, and then employed internal transcribed spacer (ITS) sequencing combined with morphological analysis to identify thermophilic fungi to the species level. In total, 102 strains were isolated and identified as Rhizomucor miehei, Chaetomium sp., Talaromyces thermophilus, Talaromyces byssochlamydoides, Thermoascus aurantiacus Miehe var. levisporus, Thermomyces lanuginosus, Scytalidium thermophilum, Malbranchea flava, Myceliophthora sp. 1, Myceliophthora sp. 2, Myceliophthora sp. 3, and Coprinopsis sp. Two species, T. lanuginosus and S. thermophilum, were dominant, representing 34.78% and 28.26% of the sample, respectively. Our results indicated a greater diversity of thermophilic fungi in neutral and alkaline geothermal sites than in acidic sites around hot springs reported in previous studies. Most of our strains thrived under alkaline growth conditions.
Diversity of Pneumolysin and Pneumococcal Histidine Triad Protein D of Streptococcus pneumoniae Isolated from Invasive Diseases in Korean Children.

PubMed

Yun, Ki Wook; Lee, Hyunju; Choi, Eun Hwa; Lee, Hoan Jong

2015-01-01

Pneumolysin (Ply) and pneumococcal histidine triad protein D (PhtD) are candidate proteins for a next-generation pneumococcal vaccine. We aimed to analyze the genetic diversity and antigenic heterogeneity of Ply and PhtD for 173 pneumococci isolated from invasive diseases in Korean children. Alleles were designated based on variation in the amino acid sequence. Antigenicity was predicted by the amino acid hydrophobicity of the region. There were seven and 39 allele types for the ply and phtD genes, respectively. The nucleotide sequence identity was 97.2%-99.9% for the ply gene and 91.4%-98.0% for the phtD gene. Only minor variations in hydrophobicity were noted among the antigenicity plots of Ply and PhtD. Overall, the allele types of the ply and phtD genes were remarkably homogeneous, and the antigenic diversity of the corresponding proteins was very limited. Ply and PhtD could be useful antigens for universal pneumococcal vaccines.
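The identity ranges quoted in this record are the kind of figure obtained by comparing every pair of aligned allele sequences. A minimal Python sketch of that calculation follows; it assumes the sequences are already aligned to equal length and skips any column with a gap ("-") in either sequence, which is one common convention but not necessarily the one the authors used. The function names and toy alleles are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of identical positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def identity_range(alleles):
    """Lowest and highest pairwise percent identity among a set of alleles."""
    values = [percent_identity(a, b) for a, b in combinations(alleles, 2)]
    return min(values), max(values)


# Hypothetical toy alleles (10 aligned sites each), for illustration only.
toy_alleles = ["ATGGCAAATA", "ATGGCAAACA", "ATGACAAACA"]
lowest, highest = identity_range(toy_alleles)
```

Reporting the minimum and maximum over all pairs gives a range in the same form as the percentages in the abstract.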
Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

DOEpatents

McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Vitalis, Elizabeth A [Livermore, CA

2007-02-06

Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serves as a marker or signature for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

DOEpatents

McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Vitalis, Elizabeth A [Livermore, CA

2009-02-24

Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serves as a marker or signature for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Genotyping of ancient Mycobacterium tuberculosis strains reveals historic genetic diversity.

PubMed

Müller, Romy; Roberts, Charlotte A; Brown, Terence A

2014-04-22

The evolutionary history of the Mycobacterium tuberculosis complex (MTBC) has previously been studied by analysis of sequence diversity in extant strains, but not addressed by direct examination of strain genotypes in archaeological remains. Here, we use ancient DNA sequencing to type 11 single nucleotide polymorphisms and two large sequence polymorphisms in the MTBC strains present in 10 archaeological samples from skeletons from Britain and Europe dating to the second-nineteenth centuries AD. The results enable us to assign the strains to groupings and lineages recognized in the extant MTBC. We show that at least during the eighteenth-nineteenth centuries AD, strains of M. tuberculosis belonging to different genetic groups were present in Britain at the same time, possibly even at a single location, and we present evidence for a mixed infection in at least one individual. Our study shows that ancient DNA typing applied to multiple samples can provide sufficiently detailed information to contribute to both archaeological and evolutionary knowledge of the history of tuberculosis.
High-resolution mapping, characterization, and optimization of autonomously replicating sequences in yeast

PubMed Central

Liachko, Ivan; Youngblood, Rachel A.; Keich, Uri; Dunham, Maitreya J.

2013-01-01

DNA replication origins are necessary for the duplication of genomes. In addition, plasmid-based expression systems require DNA replication origins to maintain plasmids efficiently. The yeast autonomously replicating sequence (ARS) assay has been a valuable tool in dissecting replication origin structure and function. However, the dearth of information on origins in diverse yeasts limits the availability of efficient replication origin modules to only a handful of species and restricts our understanding of origin function and evolution. To enable rapid study of origins, we have developed a sequencing-based suite of methods for comprehensively mapping and characterizing ARSs within a yeast genome. Our approach finely maps genomic inserts capable of supporting plasmid replication and uses massively parallel deep mutational scanning to define molecular determinants of ARS function with single-nucleotide resolution. In addition to providing unprecedented detail into origin structure, our data have allowed us to design short, synthetic DNA sequences that retain maximal ARS function. These methods can be readily applied to understand and modulate ARS function in diverse systems. PMID:23241746
Mitochondrial DNA variation of indigenous goats in Narok and Isiolo counties of Kenya.

PubMed

Kibegwa, F M; Githui, K E; Jung'a, J O; Badamana, M S; Nyamu, M N

2016-06-01

Phylogenetic relationships among and genetic variability within 60 goats from two different indigenous breeds in Narok and Isiolo counties in Kenya and 22 published goat samples were analysed using mitochondrial control region sequences. The results showed that there were 54 polymorphic sites in a 481-bp sequence and 29 haplotypes were determined. The mean haplotype diversity and nucleotide diversity were 0.981 ± 0.006 and 0.019 ± 0.001, respectively. The phylogenetic analysis in combination with goat haplogroup reference sequences from GenBank showed that all goat sequences were clustered into two haplogroups (A and G), of which haplogroup A was the commonest in the two populations. A very high percentage (99.90%) of the genetic variation was distributed within the regions, and a smaller percentage (0.10%) was distributed among regions, as revealed by the analysis of molecular variance (AMOVA). These AMOVA results showed that the divergence between regions was not statistically significant. We concluded that the high levels of intrapopulation diversity in Isiolo and Narok goats and the weak phylogeographic structuring suggested that there existed strong gene flow among goat populations, probably caused by extensive transportation of goats in history. © 2015 Blackwell Verlag GmbH.
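
As an illustration of the two summary statistics reported in this record, the following is a minimal sketch that assumes Nei's standard estimators: haplotype diversity as n/(n-1) × (1 - Σp²) over haplotype frequencies p, and nucleotide diversity as the mean proportion of differing sites over all sequence pairs. The abstract does not state which estimators or software were used, so the function names and the toy alignment are hypothetical and are not the study's data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)  # identical aligned sequences count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / (len(pairs) * length)


# Hypothetical toy alignment (4 sequences, 8 sites), not data from the study
toy = ["ACGTACGT", "ACGTACGA", "ACGTACGA", "TCGTACGT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"Hd = {hd:.3f}, pi = {pi:.3f}")
```

The sketch ignores gaps and ambiguous bases, which dedicated population-genetics software handles explicitly.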

Genetic variations in two seahorse species (Hippocampus mohnikei and Hippocampus trimaculatus): evidence for middle Pleistocene population expansion.

PubMed

Zhang, Yanhong; Pham, Nancy Kim; Zhang, Huixian; Lin, Junda; Lin, Qiang

2014-01-01

The population genetics of seahorses is influenced by their species-specific ecological requirements and life-history traits. In the present study, partial sequences of mitochondrial cytochrome b (cytb) and control region (CR) were obtained from 50 Hippocampus mohnikei and 92 H. trimaculatus from four zoogeographical zones. A total of 780 base pairs of the cytb gene were sequenced to characterize mitochondrial DNA (mtDNA) diversity. The mtDNA marker revealed high haplotype diversity, low nucleotide diversity, and a lack of population structure across both populations of H. mohnikei and H. trimaculatus. A neighbour-joining (NJ) tree of cytb gene sequences showed that H. mohnikei haplotypes formed one cluster. A maximum likelihood (ML) tree of cytb gene sequences showed that H. trimaculatus belonged to one lineage. The star-like pattern of the median-joining network of cytb and CR markers indicated a previous demographic expansion of H. mohnikei and H. trimaculatus. The cytb and CR data sets exhibited a unimodal mismatch distribution, which may have resulted from population expansion. Mismatch analysis suggested that the expansion was initiated about 276,000 years ago for H. mohnikei and about 230,000 years ago for H. trimaculatus during the middle Pleistocene period. This study indicates a possible signature of genetic variation and population expansion in two seahorses under complex marine environments.
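
The mismatch distribution referred to in this record is the distribution of the number of site differences across all pairs of sequences. The sketch below shows how such a distribution can be tabulated. The function name and the star-like toy data set are hypothetical, and the sketch does not reproduce the study's expansion-time estimates.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Return {k: fraction of sequence pairs differing at exactly k sites}."""
    diffs = [
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    ]
    counts = Counter(diffs)
    return {k: counts[k] / len(diffs) for k in sorted(counts)}


# Hypothetical star-like sample: one central haplotype plus four
# haplotypes that each differ from it at a single, distinct site
toy = ["AAAAAA", "AAAAAT", "AAAATA", "AAATAA", "AATAAA"]
dist = mismatch_distribution(toy)
print(dist)
```

In this toy case all pairs differ at one or two sites, giving a single peak, the kind of shape the abstract associates with population expansion.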
Genetic Diversity of Crimean Congo Hemorrhagic Fever Virus Strains from Iran

PubMed Central

Chinikar, Sadegh; Bouzari, Saeid; Shokrgozar, Mohammad Ali; Mostafavi, Ehsan; Jalali, Tahmineh; Khakifirouz, Sahar; Nowotny, Norbert; Fooks, Anthony R.; Shah-Hosseini, Nariman

2016-01-01

Background: Crimean Congo hemorrhagic fever virus (CCHFV) is a member of the Bunyaviridae family and Nairovirus genus. It has a negative-sense, single stranded RNA genome approximately 19.2 kb, containing the Small, Medium, and Large segments. CCHFVs are relatively divergent in their genome sequence and grouped in seven distinct clades based on S-segment sequence analysis and six clades based on M-segment sequences. Our aim was to obtain new insights into the molecular epidemiology of CCHFV in Iran. Methods: We analyzed partial and complete nucleotide sequences of the S and M segments derived from 50 Iranian patients. The extracted RNA was amplified using one-step RT-PCR and then sequenced. The sequences were analyzed using Mega5 software. Results: Phylogenetic analysis of partial S segment sequences demonstrated that clade IV-(Asia 1), clade IV-(Asia 2) and clade V-(Europe) accounted for 80 %, 4 % and 14 % of the circulating genomic variants of CCHFV in Iran respectively. However, one of the Iranian strains (Iran-Kerman/22) was associated with none of other sequences and formed a new clade (VII). The phylogenetic analysis of complete S-segment nucleotide sequences from selected Iranian CCHFV strains complemented with representative strains from GenBank revealed similar topology as partial sequences with eight major clusters. A partial M segment phylogeny positioned the Iranian strains in either association with clade III (Asia-Africa) or clade V (Europe). Conclusion: The phylogenetic analysis revealed subtle links between distant geographic locations, which we propose might originate either from international livestock trade or from long-distance carriage of CCHFV by infected ticks via bird migration. PMID:27308271
Complete genome analysis of dengue virus type 3 isolated from the 2013 dengue outbreak in Yunnan, China.

PubMed

Wang, Xiaodan; Ma, Dehong; Huang, Xinwei; Li, Lihua; Li, Duo; Zhao, Yujiao; Qiu, Lijuan; Pan, Yue; Chen, Junying; Xi, Juemin; Shan, Xiyun; Sun, Qiangming

2017-06-15

In the past few decades, dengue has spread rapidly and is an emerging disease in China. An unexpected dengue outbreak occurred in Xishuangbanna, Yunnan, China, resulting in 1331 patients in 2013. To obtain complete genome information and perform mutation and evolutionary analysis of the causative agent of this largest outbreak of dengue fever, the viruses were isolated by cell culture and evaluated by genome sequence analysis. Phylogenetic trees were then constructed by Neighbor-Joining methods (MEGA6.0), followed by analysis of nucleotide mutation and amino acid substitution. Analysis of the diversity of secondary structure for the E and NS1 proteins was also performed. Selection pressures acting on the coding sequences were then estimated with PAML software. The complete genome sequences of two isolated strains (YNSW1, YNSW2) were 10,710 and 10,702 nucleotides in length, respectively. Phylogenetic analysis revealed that both strains were classified as genotype II of DENV-3. The results indicated that both the strains isolated in Xishuangbanna in 2013 and the Laos 2013 strains (KF816161.1, KF816158.1, LC147061.1, LC147059.1, KF816162.1) were most similar to the Bangladesh strain (AY496873.2) from 2002. Compared with DENV-3SS (H87), 62 amino acid substitutions were identified in translated regions, and 38 amino acid substitutions were identified in translated regions compared with the DENV-3 genotype II strain Bangladesh (AY496873.2). 27 (YNSW1) or 28 (YNSW2) single nucleotide changes were observed in structural protein sequences, with 7 (YNSW1) or 8 (YNSW2) non-synonymous mutations compared with AY496873.2. Of them, 4 non-synonymous mutations were identified in E protein sequences (2 in the β-sheet, 2 in the coil). Meanwhile, 117 (YNSW1) or 115 (YNSW2) single nucleotide changes were observed in non-structural protein sequences, with 31 (YNSW1) or 30 (YNSW2) non-synonymous mutations. In particular, 14 single nucleotide changes were observed in NS1 sequences, of which 4 were non-synonymous substitutions (4 in the coil). Selection pressure analysis revealed no positive selection in the amino acid sites of the genes encoding structural and non-structural proteins. This study may help understand the intrinsic geographical relatedness of dengue virus 3 and contributes further to research on its infectivity, pathogenicity and vaccine development. Copyright © 2017 Elsevier B.V. All rights reserved.
Nucleotide sequences encoding a thermostable alkaline protease

DOEpatents

Wilson, D.B.; Lao, G.

1998-01-06

Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.
Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

PubMed

Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

2010-07-16

Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

PubMed

Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

2013-01-01

Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes; enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules in plant tissue from crops such as lettuce, tomato, corn, soy and rice with complementarity to human genes that have a long history of safe consumption supports a conclusion that long dsRNAs do not present a significant dietary risk.
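
The complementarity criterion in this record (a perfect match of at least 21 nucleotides) can be illustrated with a simple k-mer lookup. This is a minimal sketch, not the study's pipeline: the function names and the toy sequences are hypothetical, and real analyses would use indexed alignment tools over whole transcriptomes.

```python
def reverse_complement(seq):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))


def complementary_kmers(query, target, k=21):
    """Start positions in target of k-mers perfectly complementary to some k-mer of query."""
    query_rc = reverse_complement(query)
    rc_kmers = {query_rc[i:i + k] for i in range(len(query_rc) - k + 1)}
    return [i for i in range(len(target) - k + 1) if target[i:i + k] in rc_kmers]


# Hypothetical toy data: a 21-nt site embedded in a target transcript, and a
# query strand carrying its reverse complement between unrelated flanks
site = "AUGGCUAGCUAGGAUCCGAUA"
target = "GGG" + site + "CCC"
query = "AA" + reverse_complement(site) + "UU"
hits = complementary_kmers(query, target)
print(hits)
```

Raising `k` above the length of the shared site returns no hits for this toy pair, which shows that the threshold alone decides what counts as a match.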
Phylogenetic analysis of Tibetan mastiffs based on mitochondrial hypervariable region I.

PubMed

Ren, Zhanjun; Chen, Huiling; Yang, Xuejiao; Zhang, Chengdong

2017-03-01

Recently, the number of Tibetan mastiffs, a precious germplasm resource and cultural heritage, has been decreasing sharply. Therefore, the genetic diversity of Tibetan mastiffs needs to be studied to clarify their phylogenetic relationships and lay the foundation for resource protection, rational development and utilization of Tibetan mastiffs. We sequenced hypervariable region I of mitochondrial DNA (mtDNA) of 110 individuals from the Tibet region and Gansu province. A total of 12 polymorphic sites were identified, which defined eight haplotypes, of which H4 and H8 were unique to the Tibetan population, with H8 being identified first. The haplotype diversity (Hd: 0.808), nucleotide diversity (Pi: 0.603%), and the average number of nucleotide differences (K: 3.917) of Tibetan mastiffs from Gansu were higher than those from the Tibet region (Hd: 0.794; Pi: 0.589%; K: 3.831), which revealed higher genetic diversity in Gansu. In terms of the total population, the genetic variation was low. The median-joining network and phylogenetic tree based on the mtDNA hypervariable region I showed that Tibetan mastiffs, like other domestic dogs, originated from grey wolves and had a different history of maternal origin. The mismatch distribution analysis and neutrality tests indicated that Tibetan mastiffs were in genetic equilibrium or in a population decline.
Considerable MHC Diversity Suggests That the Functional Extinction of Baiji Is Not Related to Population Genetic Collapse

PubMed Central

Xu, Shixia; Ju, Jianfeng; Zhou, Xuming; Wang, Lian; Zhou, Kaiya; Yang, Guang

2012-01-01

To further extend our understanding of the mechanism causing the current nearly extinct status of the baiji (Lipotes vexillifer), one of the most critically endangered species in the world, genetic diversity at the major histocompatibility complex (MHC) class II DRB locus was investigated in the baiji. Nine highly divergent DRB alleles were identified in 17 samples, with an average of 28.4 (13.2%) nucleotide differences and 16.7 (23.5%) amino acid differences between alleles. The unexpectedly high levels of DRB allelic diversity in the baiji may partly be attributable to its evolutionary adaptations to the freshwater environment, which is regarded as having a higher parasite diversity than the marine environment. In addition, balancing selection was found to be the main mechanism generating sequence diversity at the baiji DRB gene. Considerable sequence variation at the adaptive MHC genes despite significant loss of neutral genetic variation in the baiji genome might suggest that intense selection has overpowered random genetic drift as the main evolutionary force, which further suggests that the critically endangered or nearly extinct status of the baiji is not an outcome of genetic collapse. PMID:22272349
Lack of genetic structure in the jellyfish Pelagia noctiluca (Cnidaria: Scyphozoa: Semaeostomeae) across European seas.

PubMed

Stopar, Katja; Ramsak, Andreja; Trontelj, Peter; Malej, Alenka

2010-10-01

The genetic structure of the holopelagic scyphozoan Pelagia noctiluca was inferred based on the study of 144 adult medusae. The areas of study were five geographic regions in two European seas (Eastern Atlantic and Mediterranean Sea). A 655-bp sequence of mitochondrial cytochrome c oxidase subunit I (COI), and a 645-bp sequence of two nuclear internal transcribed spacers (ITS1 and ITS2) were analyzed. The protein coding COI gene showed a higher level of divergence than the combined nuclear ITS fragment (haplotype diversity 0.962 vs. 0.723, nucleotide diversity 1.16% vs. 0.31%). Phylogeographic analysis on COI gene revealed two clades, the larger consisting of specimens from all sampling sites, and the smaller mostly formed of specimens from the Mediterranean Sea. Haplotype diversity was very high throughout the sampled area, and within sample diversity was higher than diversity among geographical regions. No strongly supported genetically or geographically distinct groups of P. noctiluca were found. The results - long distance dispersal, insignificant F(ST) values, lack of isolation by distance - pointed toward an admixture among Mediterranean and East Atlantic populations. Copyright 2010 Elsevier Inc. All rights reserved.
Identification of Novel Sequence Types among Staphylococcus haemolyticus Isolated from Variety of Infections in India.

PubMed

Panda, Sasmita; Jena, Smrutiti; Sharma, Savitri; Dhawan, Benu; Nath, Gopal; Singh, Durg Vijai

2016-01-01

The aim of this study was to determine sequence types of 34 S. haemolyticus strains isolated from a variety of infections between 2013 and 2016 in India by MLST. The MEGA5.2 software was used to align and compare the nucleotide sequences. Advanced cluster analysis was performed to define the clonal complexes. MLST analysis showed 24 new sequence types (ST) among S. haemolyticus isolates, irrespective of source and place of isolation. The findings of this study allowed an MLST database to be set up on the PubMLST.org website using BIGSdb software and made available at http://pubmlst.org/shaemolyticus/. The data of this study thus suggest that MLST can be used to study population structure and diversity among S. haemolyticus isolates.
Identification of Novel Sequence Types among Staphylococcus haemolyticus Isolated from Variety of Infections in India

PubMed Central

Panda, Sasmita; Jena, Smrutiti; Sharma, Savitri; Dhawan, Benu; Nath, Gopal

2016-01-01

The aim of this study was to determine sequence types of 34 S. haemolyticus strains isolated from a variety of infections between 2013 and 2016 in India by MLST. The MEGA5.2 software was used to align and compare the nucleotide sequences. Advanced cluster analysis was performed to define the clonal complexes. MLST analysis showed 24 new sequence types (ST) among S. haemolyticus isolates, irrespective of source and place of isolation. The findings of this study allowed an MLST database to be set up on the PubMLST.org website using BIGSdb software and made available at http://pubmlst.org/shaemolyticus/. The data of this study thus suggest that MLST can be used to study population structure and diversity among S. haemolyticus isolates. PMID:27824930
Molecular phylogeny, population genetics, and evolution of heterocystous cyanobacteria using nifH gene sequences.

PubMed

Singh, Prashant; Singh, Satya Shila; Elster, Josef; Mishra, Arun Kumar

2013-06-01

In order to assess phylogeny, population genetics, and approximation of the future course of cyanobacterial evolution based on nifH gene sequences, 41 heterocystous cyanobacterial strains collected from all over India have been used in the present study. NifH gene sequence analysis data confirm that the heterocystous cyanobacteria are monophyletic while the Stigonematales show polyphyletic origin with grave intermixing. Further, analysis of nifH gene sequence data using intricate mathematical extrapolations revealed that the nucleotide diversity and recombination frequency are much greater in the Nostocales than in the Stigonematales. Similarly, DNA divergence studies showed significant values of divergence with greater gene conversion tracts in the unbranched (Nostocales) than the branched (Stigonematales) strains. Our data strongly support the origin of true branching cyanobacterial strains from the unbranched strains.
Genetic Diversity and Population Structure of F3:6 Nebraska Winter Wheat Genotypes Using Genotyping-By-Sequencing.

PubMed

Eltaher, Shamseldeen; Sallam, Ahmed; Belamkar, Vikas; Emara, Hamdy A; Nower, Ahmed A; Salem, Khaled F M; Poland, Jesse; Baenziger, Peter S

2018-01-01

The availability of information on the genetic diversity and population structure in wheat (Triticum aestivum L.) breeding lines will help wheat breeders to better use their genetic resources and manage genetic variation in their breeding program. The recent advances in sequencing technology provide the opportunity to identify tens or hundreds of thousands of single nucleotide polymorphism (SNPs) in large genome species (e.g., wheat). These SNPs can be utilized for understanding genetic diversity and performing genome wide association studies (GWAS) for complex traits. In this study, the genetic diversity and population structure were investigated in a set of 230 genotypes (F3:6) derived from various crosses as a prerequisite for GWAS and genomic selection. Genotyping-by-sequencing provided 25,566 high-quality SNPs. The polymorphism information content (PIC) across chromosomes ranged from 0.09 to 0.37 with an average of 0.23. The distribution of SNPs markers on the 21 chromosomes ranged from 319 on chromosome 3D to 2,370 on chromosome 3B. The analysis of population structure revealed three subpopulations (G1, G2, and G3). Analysis of molecular variance identified 8% variance among and 92% within subpopulations. Of the three subpopulations, G2 had the highest level of genetic diversity based on three genetic diversity indices: Shannon's information index (I) = 0.494, diversity index (h) = 0.328 and unbiased diversity index (uh) = 0.331, while G3 had lowest level of genetic diversity (I = 0.348, h = 0.226 and uh = 0.236). This high genetic diversity identified among the subpopulations can be used to develop new wheat cultivars.
Genetic Diversity and Population Structure of F3:6 Nebraska Winter Wheat Genotypes Using Genotyping-By-Sequencing

PubMed Central

Eltaher, Shamseldeen; Sallam, Ahmed; Belamkar, Vikas; Emara, Hamdy A.; Nower, Ahmed A.; Salem, Khaled F. M.; Poland, Jesse; Baenziger, Peter S.

2018-01-01

The availability of information on the genetic diversity and population structure in wheat (Triticum aestivum L.) breeding lines will help wheat breeders to better use their genetic resources and manage genetic variation in their breeding program. The recent advances in sequencing technology provide the opportunity to identify tens or hundreds of thousands of single nucleotide polymorphism (SNPs) in large genome species (e.g., wheat). These SNPs can be utilized for understanding genetic diversity and performing genome wide association studies (GWAS) for complex traits. In this study, the genetic diversity and population structure were investigated in a set of 230 genotypes (F3:6) derived from various crosses as a prerequisite for GWAS and genomic selection. Genotyping-by-sequencing provided 25,566 high-quality SNPs. The polymorphism information content (PIC) across chromosomes ranged from 0.09 to 0.37 with an average of 0.23. The distribution of SNPs markers on the 21 chromosomes ranged from 319 on chromosome 3D to 2,370 on chromosome 3B. The analysis of population structure revealed three subpopulations (G1, G2, and G3). Analysis of molecular variance identified 8% variance among and 92% within subpopulations. Of the three subpopulations, G2 had the highest level of genetic diversity based on three genetic diversity indices: Shannon’s information index (I) = 0.494, diversity index (h) = 0.328 and unbiased diversity index (uh) = 0.331, while G3 had lowest level of genetic diversity (I = 0.348, h = 0.226 and uh = 0.236). This high genetic diversity identified among the subpopulations can be used to develop new wheat cultivars. PMID:29593779
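
The per-locus indices named in this record can be sketched as follows. The formulas are assumptions, since the abstract does not define them: PIC = 1 - Σp² - Σ 2pᵢ²pⱼ² (summed over allele pairs i < j), Shannon's I = -Σ p ln p, h = 1 - Σp², and uh = n/(n-1) × h. The function name and the example frequencies are hypothetical.

```python
import math


def diversity_indices(freqs, n):
    """Per-locus indices from allele frequencies (summing to 1) and sample size n."""
    h = 1 - sum(p * p for p in freqs)
    # Correction term of PIC: sum over allele pairs i < j of 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "PIC": h - cross,
        "I": -sum(p * math.log(p) for p in freqs if p > 0),
        "h": h,
        "uh": n / (n - 1) * h,
    }


# Hypothetical biallelic SNP with equal allele frequencies in 230 genotypes
snp = diversity_indices([0.5, 0.5], n=230)
print({name: round(value, 3) for name, value in snp.items()})
```

Under these formulas a biallelic marker cannot exceed PIC = 0.375, which is in line with the upper end of the PIC range reported in the abstract.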
Patterns of diversity and recombination along chromosome 1 of maize (Zea mays ssp. mays L.).

PubMed Central

Tenaillon, Maud I; Sawkins, Mark C; Anderson, Lorinda K; Stack, Stephen M; Doebley, John; Gaut, Brandon S

2002-01-01

We investigate the interplay between genetic diversity and recombination in maize (Zea mays ssp. mays). Genetic diversity was measured in three types of markers: single-nucleotide polymorphisms, indels, and microsatellites. All three were examined in a sample of previously published DNA sequences from 21 loci on maize chromosome 1. Small indels (1-5 bp) were numerous and far more common than large indels. Furthermore, large indels (>100 bp) were infrequent in the population sample, suggesting they are slightly deleterious. The 21 loci also contained 47 microsatellites, of which 33 were polymorphic. Diversity in SNPs, indels, and microsatellites was compared to two measures of recombination: C (=4Nc) estimated from DNA sequence data and R based on a quantitative recombination nodule map of maize synaptonemal complex 1. SNP diversity was correlated with C (r = 0.65; P = 0.007) but not with R (r = -0.10; P = 0.69). Given the lack of correlation between R and SNP diversity, the correlation between SNP diversity and C may be driven by demography. In contrast to SNP diversity, microsatellite diversity was correlated with R (r = 0.45; P = 0.004) but not C (r = -0.025; P = 0.55). The correlation could arise if recombination is mutagenic for microsatellites, or it may be consistent with background selection that is apparent only in this class of rapidly evolving markers. PMID:12454083
Genetic diversity of three surface protein genes in Plasmodium malariae from three Asian countries.

PubMed

Srisutham, Suttipat; Saralamba, Naowarat; Sriprawat, Kanlaya; Mayxay, Mayfong; Smithuis, Frank; Nosten, Francois; Pukrittayakamee, Sasithon; Day, Nicholas P J; Dondorp, Arjen M; Imwong, Mallika

2018-01-11

Genetic diversity of the three important antigenic proteins, namely thrombospondin-related anonymous protein (TRAP), apical membrane antigen 1 (AMA1), and 6-cysteine protein (P48/45), all of which are found in various developmental stages of Plasmodium parasites, is crucial for targeted vaccine development. While studies related to the genetic diversity of these proteins are available for Plasmodium falciparum and Plasmodium vivax, little information exists regarding Plasmodium malariae. The present study aims to demonstrate the genetic variations existing among these three genes in P. malariae by analysing their diversity at nucleotide and protein levels. Three surface protein genes were isolated from 45 samples collected in Thailand (N = 33), Myanmar (N = 8), and Lao PDR (N = 4), using conventional polymerase chain reaction (PCR) assay. Then, the PCR products were sequenced and analysed using BioEdit, MEGA6, and DnaSP programs. The average pairwise nucleotide diversities (π) of P. malariae trap, ama1, and p48/45 were 0.00169, 0.00413, and 0.00029, respectively. The haplotype diversities (Hd) of P. malariae trap, ama1, and p48/45 were 0.919, 0.946, and 0.130, respectively. Most of the nucleotide substitutions were non-synonymous, which indicated that the genetic variations of these genes were maintained by positive diversifying selection, thus suggesting their role as a potential target of protective immune response. Amino acid substitutions of P. malariae TRAP, AMA1, and P48/45 could be categorized into 17, 20, and 2 unique amino-acid variants, respectively. For further vaccine development, the carboxyl terminus of P48/45 would be a good candidate owing to its conserved amino acids and low genetic diversity (π = 0.2-0.3). High mutational diversity was observed in P. malariae trap and ama1 as compared to p48/45 in P. malariae samples isolated from Thailand, Myanmar, and Lao PDR.
Taken together, these results suggest that P48/45 might be a good vaccine candidate against P. malariae infection because of its sufficiently low genetic diversity and highly conserved amino acids especially on the carboxyl end.
Structure of yeast Argonaute with guide RNA

PubMed Central

Nakanishi, Kotaro; Weinberg, David E.; Bartel, David P.; Patel, Dinshaw J.

2012-01-01

The RNA-induced silencing complex, comprising Argonaute and guide RNA, mediates RNA interference. Here we report the 3.2 Å crystal structure of Kluyveromyces polysporus Argonaute (KpAGO) fortuitously complexed with guide RNA originating from small-RNA duplexes autonomously loaded and processed by recombinant KpAGO. Despite their diverse sequences, guide-RNA nucleotides 1–8 are positioned similarly, with sequence-independent contacts to bases, phosphates and 2′-hydroxyl groups pre-organizing the backbone of nucleotides 2–8 in a near–A-form conformation. Compared with prokaryotic Argonautes, KpAGO has numerous surface-exposed insertion segments, with a cluster of conserved insertions repositioning the N domain to enable full propagation of guide–target pairing. Compared with Argonautes in inactive conformations, KpAGO has a hydrogen-bond network that stabilizes an expanded and repositioned loop, which inserts an invariant glutamate into the catalytic pocket. Mutation analyses and analogies to Ribonuclease H indicate that insertion of this glutamate finger completes a universally conserved catalytic tetrad, thereby activating Argonaute for RNA cleavage. PMID:22722195
Differentiation of closely related but biologically distinct cherry isolates of Prunus necrotic ringspot virus by polymerase chain reaction.

PubMed

Hammond, R W; Crosslin, J M; Pasini, R; Howell, W E; Mink, G I

1999-07-01

Prunus necrotic ringspot ilarvirus (PNRSV) exists as a number of biologically distinct variants which differ in host specificity, serology, and pathology. Previous nucleotide sequence alignment and phylogenetic analysis of cloned reverse transcription-polymerase chain reaction (RT-PCR) products of several biologically distinct sweet cherry isolates revealed correlations between symptom type and the nucleotide and amino acid sequences of the 3a (putative movement protein) and 3b (coat protein) open reading frames. Based upon this analysis, RT-PCR assays have been developed that can identify isolates displaying different symptoms and serotypes. The incorporation of primers in a multiplex PCR protocol permits rapid detection and discrimination among the strains. The results of PCR amplification using type-specific primers that amplify a portion of the coat protein gene demonstrate that the primer-selection procedure developed for PNRSV constitutes a reliable method of viral strain discrimination in cherry for disease control and will also be useful for examining biological diversity within the PNRSV virus group.
Frequency and genetic characterization of V(DD)J recombinants in the human peripheral blood antibody repertoire.

PubMed

Briney, Bryan S; Willis, Jordan R; Hicar, Mark D; Thomas, James W; Crowe, James E

2012-09-01

Antibody heavy-chain recombination that results in the incorporation of multiple diversity (D) genes, although uncommon, contributes substantially to the diversity of the human antibody repertoire. Such recombination allows the generation of heavy chain complementarity determining region 3 (HCDR3) regions of extreme length and enables junctional regions that, because of the nucleotide bias of N-addition regions, are difficult to produce through normal V(D)J recombination. Although this non-classical recombination process has been observed infrequently, comprehensive analysis of the frequency and genetic characteristics of such events in the human peripheral blood antibody repertoire has not been possible because of the rarity of such recombinants and the limitations of traditional sequencing technologies. Here, through the use of high-throughput sequencing of the normal human peripheral blood antibody repertoire, we analysed the frequency and genetic characteristics of V(DD)J recombinants. We found that these recombinations were present in approximately 1 in 800 circulating B cells, and that the frequency was severely reduced in memory cell subsets. We also found that V(DD)J recombination can occur across the spectrum of diversity genes, indicating that virtually all recombination signal sequences that flank diversity genes are amenable to V(DD)J recombination. Finally, we observed a repertoire bias in the diversity gene repertoire at the upstream (5') position, and discovered that this bias was primarily attributable to the order of diversity genes in the genomic locus. © 2012 The Authors. Immunology © 2012 Blackwell Publishing Ltd.
High Diversity of CTX-M Extended-Spectrum β-Lactamases in Municipal Wastewater and Urban Wetlands

PubMed Central

Borgogna, Timothy R.; Borgogna, Joanna-Lynn; Mielke, Jenna A.; Brown, Celeste J.; Top, Eva M.; Botts, Ryan T.

2016-01-01

The CTX-M-type extended-spectrum β-lactamases (ESBLs) present a serious public health threat as they have become nearly ubiquitous among clinical gram-negative pathogens, particularly the enterobacteria. To aid in the understanding and eventual control of the spread of such resistance genes, we sought to determine the diversity of CTX-M ESBLs not among clinical isolates, but in the environment, where weaker and more diverse selective pressures may allow greater enzyme diversification. This was done by examining the CTX-M diversity in municipal wastewater and urban coastal wetlands in southern California, United States, by Sanger sequencing of polymerase chain reaction amplicons. Of the five known CTX-M phylogroups (1, 2, 8, 9, and 25), only genes from groups 1 and 2 were detected in both wastewater treatment plants (WWTPs), and group 1 genes were also detected in one of the two wetlands after a winter rain. The highest relative abundance of blaCTX-M group 1 genes was in the sludge of one WWTP (2.1 × 10⁻⁴ blaCTX-M copies/16S rRNA gene copy). Gene libraries revealed surprisingly high nucleotide sequence diversity, with 157 new variants not found in GenBank, representing 99 novel amino acid sequences. Our results indicate that the resistomes of WWTPs and urban wetlands contain diverse blaCTX-M ESBLs, which may constitute a mobile reservoir of clinically relevant resistance genes. PMID:26670020

Long-read sequencing of the coffee bean transcriptome reveals the diversity of full-length transcripts

PubMed Central

Cheng, Bing; Furtado, Agnelo

2017-01-01

Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam left 1213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially 5′ UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kb) were poorly annotated. LRS technology exposes the limitations of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. PMID:29048540
Phylogenetic stratigraphy in the Guerrero Negro hypersaline microbial mat.

PubMed

Harris, J Kirk; Caporaso, J Gregory; Walker, Jeffrey J; Spear, John R; Gold, Nicholas J; Robertson, Charles E; Hugenholtz, Philip; Goodrich, Julia; McDonald, Daniel; Knights, Dan; Marshall, Paul; Tufo, Henry; Knight, Rob; Pace, Norman R

2013-01-01

The microbial mats of Guerrero Negro (GN), Baja California Sur, Mexico historically were considered a simple environment, dominated by cyanobacteria and sulfate-reducing bacteria. Culture-independent rRNA community profiling instead revealed these microbial mats as among the most phylogenetically diverse environments known. A preliminary molecular survey of the GN mat based on only ∼1500 small subunit rRNA gene sequences discovered several new phylum-level groups in the bacterial phylogenetic domain and many previously undetected lower-level taxa. We determined an additional ∼119,000 nearly full-length sequences and 28,000 >200 nucleotide 454 reads from a 10-layer depth profile of the GN mat. With this unprecedented coverage of long sequences from one environment, we confirm the mat is phylogenetically stratified, presumably corresponding to light and geochemical gradients throughout the depth of the mat. Previous shotgun metagenomic data from the same depth profile show the same stratified pattern and suggest that metagenome properties may be predictable from rRNA gene sequences. We verify previously identified novel lineages and identify new phylogenetic diversity at lower taxonomic levels, for example, thousands of operational taxonomic units at the family-genus levels differ considerably from known sequences. The new sequences populate parts of the bacterial phylogenetic tree that previously were poorly described, but indicate that any comprehensive survey of GN diversity has only begun. Finally, we show that taxonomic conclusions are generally congruent between Sanger and 454 sequencing technologies, with the taxonomic resolution achieved dependent on the abundance of reference sequences in the relevant region of the rRNA tree of life.
Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

PubMed

Panwar, Bharat; Raghava, Gajendra P S

2015-04-01

RNA-protein interactions play diverse roles in cells, so identifying the RNA-protein interface is essential for understanding their function. In the past, several methods have been developed for predicting RNA-interacting residues in proteins, but limited efforts have been made to identify protein-interacting nucleotides in RNAs. To discriminate protein-interacting from non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest and SVM(light)) for prediction model development using various features, and achieved a maximum of 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and a Matthews correlation coefficient of 0.62 with SVM(light)-based models. We observed that certain tri-nucleotides such as ACA, ACC, AGA, CAC, CCA, GAG, UGA, and UUU are preferred in protein interaction. All the models were developed using a non-redundant dataset and evaluated using five-fold cross-validation. A web server called RNApin has been developed for the scientific community (http://crdd.osdd.net/raghava/rnapin/). Copyright © 2015 Elsevier Inc. All rights reserved.
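The tri-nucleotide composition feature mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the profile was computed, so this example assumes overlapping 3-mers counted over a single RNA string and normalized to relative frequencies. The function name `trinucleotide_profile` and the toy sequence are hypothetical and are not taken from RNApin.

```python
from collections import Counter
from itertools import product


def trinucleotide_profile(rna: str) -> dict[str, float]:
    """Relative frequency of each of the 64 overlapping tri-nucleotides.

    Assumption: windows containing characters other than A, C, G, U are
    ignored, and DNA-style T is treated as U.
    """
    rna = rna.upper().replace("T", "U")
    kmers = ["".join(p) for p in product("ACGU", repeat=3)]
    # Slide a window of width 3 along the sequence, one base at a time.
    counts = Counter(rna[i:i + 3] for i in range(len(rna) - 2))
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


# Toy sequence (hypothetical) with 10 overlapping windows, all distinct.
profile = trinucleotide_profile("ACAUUUGAGACC")
print(profile["ACA"], profile["UUU"], profile["GGG"])
```

A 64-element vector of this kind could serve as input to any of the classifiers listed in the abstract. The actual feature encoding used by the authors may differ.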
Variability and transmission by Aphis glycines of North American and Asian Soybean mosaic virus isolates.

PubMed

Domier, L L; Latorre, I J; Steinlage, T A; McCoppin, N; Hartman, G L

2003-10-01

The variability of North American and Asian strains and isolates of Soybean mosaic virus was investigated. First, polymerase chain reaction (PCR) products representing the coat protein (CP)-coding regions of 38 SMVs were analyzed for restriction fragment length polymorphisms (RFLP). Second, the nucleotide and predicted amino acid sequence variability of the P1-coding region of 18 SMVs and the helper component/protease (HC/Pro) and CP-coding regions of 25 SMVs were assessed. The CP nucleotide and predicted amino acid sequences were the most similar and predicted phylogenetic relationships similar to those obtained from RFLP analysis. Neither RFLP nor sequence analyses of the CP-coding regions grouped the SMVs by geographical origin. The P1 and HC/Pro sequences were more variable and separated the North American and Asian SMV isolates into two groups similar to previously reported differences in pathogenic diversity of the two sets of SMV isolates. The P1 region was the most informative of the three regions analyzed. To assess the biological relevance of the sequence differences in the HC/Pro and CP coding regions, the transmissibility of 14 SMV isolates by Aphis glycines was tested. All field isolates of SMV were transmitted efficiently by A. glycines, but the laboratory isolates analyzed were transmitted poorly. The amino acid sequences from most, but not all, of the poorly transmitted isolates contained mutations in the aphid transmission-associated DAG and/or KLSC amino acid sequence motifs of CP and HC/Pro, respectively.
Natural variations in OsγTMT contribute to diversity of the α-tocopherol content in rice.

PubMed

Wang, Xiao-Qiang; Yoon, Min-Young; He, Qiang; Kim, Tae-Sung; Tong, Wei; Choi, Bu-Woong; Lee, Young-Sang; Park, Yong-Jin

2015-12-01

Tocopherols and tocotrienols, collectively known as tocochromanols, are lipid-soluble molecules that belong to the group of vitamin E compounds. Among them, α-tocopherol (αΤ) is one of the antioxidants with diverse functions and benefits for humans and animals. Thus, understanding the genetic basis of these traits would be valuable to improve nutritional quality by breeding in rice. Genome-wide association study (GWAS) has emerged as a powerful strategy for identifying genes or quantitative trait loci (QTL) underlying complex traits in plants. To discover the genes or QTLs underlying the naturally occurring variations of αΤ content in rice, we performed GWAS using 1.44 million high-quality single-nucleotide polymorphisms acquired from re-sequencing of 137 accessions from a diverse rice core collection. Thirteen candidate genes were found across 2-year phenotypic data, among which gamma-tocopherol methyltransferase (OsγTMT) was identified as the major factor responsible for the αΤ content among rice accessions. Nucleotide variations in the coding region of OsγTMT were significantly associated with the αΤ content variations, while nucleotide polymorphisms in the promoter region of OsγTMT were also partly correlated with αΤ content variations, according to our RNA expression analyses. This study provides useful information on the genetic factors underlying αΤ content variations in rice, which will contribute significantly to research on αΤ biosynthesis mechanisms and the improvement of αΤ content in rice.
Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

PubMed Central

Huang, Yongjie; Mrázek, Jan

2014-01-01

Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
Significant genetic differentiation between native and introduced silver carp (Hypophthalmichthys molitrix) inferred from mtDNA analysis

USGS Publications Warehouse

Li, S.-F.; Xu, J.-W.; Yang, Q.-L.; Wang, C.H.; Chapman, D.C.; Lu, G.

2011-01-01

Silver carp Hypophthalmichthys molitrix (Cyprinidae) is native to China and has been introduced to over 80 countries. The extent of genetic diversity in introduced silver carp and the genetic divergence between introduced and native populations remain largely unknown. In this study, 241 silver carp sampled from three major native rivers and two non-native rivers (Mississippi River and Danube River) were analyzed using nucleotide sequences of mitochondrial COI gene and D-loop region. A total of 73 haplotypes were observed, with no haplotype found common to all the five populations and eight haplotypes shared by two to four populations. As compared with introduced populations, all native populations possess both higher haplotype diversity and higher nucleotide diversity, presumably a result of the founder effect. Significant genetic differentiation was revealed between native and introduced populations as well as among five sampled populations, suggesting strong selection pressures might have occurred in introduced populations. Collectively, this study not only provides baseline information for sustainable use of silver carp in their native country (i.e., China), but also offers first-hand genetic data for the control of silver carp in countries (e.g., the United States) where they are considered invasive.
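The two summary statistics compared in this abstract, haplotype diversity and nucleotide diversity, can be illustrated with a short Python sketch. It assumes the commonly used Nei-style estimators: haplotype diversity h = n/(n-1) × (1 - Σ pᵢ²), and nucleotide diversity π as the mean number of pairwise differences per site. The abstract does not state which estimators or software the authors used, and the toy alignment below is invented for illustration.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over equal-length aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical aligned sequences: three distinct haplotypes among four samples.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(round(haplotype_diversity(toy), 4))   # 0.8333
print(round(nucleotide_diversity(toy), 4))  # 0.1458
```

A founder effect of the kind the abstract proposes would be expected to lower both values in an introduced population relative to its source.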
Deep Sequencing of the Trypanosoma cruzi GP63 Surface Proteases Reveals Diversity and Diversifying Selection among Chronic and Congenital Chagas Disease Patients

PubMed Central

Llewellyn, Martin S.; Messenger, Louisa A.; Luquetti, Alejandro O.; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B. N.; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A.

2015-01-01

Background: Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. Methodology/Principal Findings: A 450 bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target (ND5) was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Conclusions/Significance: Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I gene family suggests a link between genetic diversity within this gene family and survival in the mammalian host. PMID:25849488
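The distinction between synonymous and non-synonymous changes that underlies a dn:ds ratio can be shown with a deliberately simplified Python sketch. It only counts codons that differ between two aligned, in-frame sequences and labels each difference by whether the encoded amino acid changes. It does not normalize by the number of synonymous and non-synonymous sites or correct for multiple substitutions, both of which a real dn:ds estimate requires. The abstract does not state which method the authors used. The standard genetic code is assumed, and the function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard nuclear code applies to the gene being analysed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Raw counts only; this is not a dN/dS estimate.
    """
    syn = nonsyn = 0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        ca, cb = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        # Skip identical codons and codons with gaps or ambiguous bases.
        if ca == cb or ca not in CODON_TABLE or cb not in CODON_TABLE:
            continue
        if CODON_TABLE[ca] == CODON_TABLE[cb]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# GCT -> GCC keeps alanine (synonymous); AAA -> AGA changes Lys to Arg.
print(count_codon_differences("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```

An excess of non-synonymous over synonymous change, after proper normalization, is the kind of signal the abstract interprets as diversifying selection.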
NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

PubMed Central

Pruitt, Kim D.; Tatusova, Tatiana; Maglott, Donna R.

2005-01-01

The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff. PMID:15608248
Analysis of the beak and feather disease viral genome indicates the existence of several genotypes which have a complex psittacine host specificity.

PubMed

de Kloet, E; de Kloet, S R

2004-12-01

A study was made of the phylogenetic relationships between fifteen complete nucleotide sequences as well as 43 nucleotide sequences of the putative coat protein gene of different strains belonging to the virus species Beak and feather disease virus obtained from 39 individuals of 16 psittacine species. The species included, among others, cockatoos (Cacatuini), African grey parrots (Psittacus erithacus) and peach-faced lovebirds (Agapornis roseicollis), which were infected at different geographical locations, within and outside Australia, the native origin of the virus. The derived amino acid sequences of the putative coat protein were highly diverse, with differences between some strains amounting to 50 of the 250 amino acids. Phylogenetic analysis demonstrated that the putative coat gene sequences form six clusters which show a varying degree of psittacine species specificity. Most, but not all, strains infecting African grey parrots formed a single cluster, as did the strains infecting the cockatoos. Strains infecting the lovebirds clustered with those infecting such Australasian species as Eclectus roratus, Psittacula kramerii and Psephotus haematogaster. Although individual birds included in this study were, where studied, often infected by closely related strains, infection by highly diverged strains was also detected. The possible relationship between BFD viral strains and clinical disease signs is discussed.
IG and TR single chain fragment variable (scFv) sequence analysis: a new advanced functionality of IMGT/V-QUEST and IMGT/HighV-QUEST.

PubMed

Giudicelli, Véronique; Duroux, Patrice; Kossida, Sofia; Lefranc, Marie-Paule

2017-06-26

IMGT®, the international ImMunoGeneTics information system® (http://www.imgt.org), was created in 1989 in Montpellier, France (CNRS and Montpellier University) to manage the huge and complex diversity of the antigen receptors, and is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. Immunoglobulins (IG) or antibodies and T cell receptors (TR) are managed and described in the IMGT® databases and tools at the level of receptor, chain and domain. The analysis of the IG and TR variable (V) domain rearranged nucleotide sequences is performed by IMGT/V-QUEST (online since 1997, 50 sequences per batch) and, for next generation sequencing (NGS), by IMGT/HighV-QUEST, the high throughput version of IMGT/V-QUEST (portal begun in 2010, 500,000 sequences per batch). In vitro combinatorial libraries of engineered antibody single chain Fragment variable (scFv) which mimic the in vivo natural diversity of the immune adaptive responses are extensively screened for the discovery of novel antigen binding specificities. However, the analysis of NGS full-length scFv (~850 bp) represents a challenge as they contain two V domains connected by a linker and there is no tool for the analysis of two V domains in a single chain. The functionality "Analysis of single chain Fragment variable (scFv)" has been implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST for the analysis of the two V domains of IG and TR scFv. It proceeds in five steps: search for a first closest V-REGION, full characterization of the first V-(D)-J-REGION, then search for a second V-REGION and full characterization of the second V-(D)-J-REGION, and finally linker delimitation. For each sequence or NGS read, positions of the 5'V-DOMAIN, linker and 3'V-DOMAIN in the scFv are provided in the 'V-orientated' sense. Each V-DOMAIN is fully characterized (gene identification, sequence description, junction analysis, characterization of mutations and amino acid changes). The functionality is generic and can analyse any IG or TR single chain nucleotide sequence containing two V domains, provided that the corresponding species IMGT reference directory is available. The "Analysis of single chain Fragment variable (scFv)" implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST provides the identification and full characterization of the two V domains of full-length scFv (~850 bp) nucleotide sequences from combinatorial libraries. The analysis can also be performed on concatenated paired chains of expressed antigen receptor IG or TR repertoires.
Genomic diversity is similar between Atlantic Forest restorations and natural remnants for the native tree Casearia sylvestris Sw.

PubMed

Gomes Viana, João Paulo; Bohrer Monteiro Siqueira, Marcos Vinícius; Araujo, Fabiano Lucas; Grando, Carolina; Sanae Sujii, Patricia; Silvestre, Ellida de Aguiar; Novello, Mariana; Pinheiro, José Baldin; Cavallari, Marcelo Mattos; Brancalion, Pedro H S; Rodrigues, Ricardo Ribeiro; Pereira de Souza, Anete; Catchen, Julian; Zucchi, Maria I

2018-01-01

The primary focus of tropical forest restoration has been the recovery of forest structure and tree taxonomic diversity, with limited attention given to genetic conservation. Populations reintroduced through restoration plantings may have low genetic diversity and be genetically structured due to founder effects and genetic drift, which limit the potential of restoration to recover ecologically resilient plant communities. Here, we studied the genetic diversity, genetic structure and differentiation using single nucleotide polymorphism (SNP) markers between restored and natural populations of the native tree Casearia sylvestris in the Atlantic Forest of Brazil. We sampled leaves from approximately 24 adult individuals in each of the study sites: two restoration plantations (27 and 62 years old) and two forest remnants. We prepared and sequenced a genotyping-by-sequencing library, identified SNP markers de novo using the Stacks pipeline, and then estimated genetic parameters and population structure. The sequencing step was successful for 80 sampled individuals. Neutral genetic diversity was similar among restored and natural populations (AR = 1.72 ± 0.005; HO = 0.135 ± 0.005; HE = 0.167 ± 0.005; FIS = 0.16 ± 0.022), which were not genetically structured by population subdivision. Despite this absence of structure among populations, we found genetic structure within populations, although no spatial genetic structure was detected in any population studied. Less than 1% of the neutral alleles were exclusive to a population. In general, contrary to our expectations, restoration plantations were effective for conserving tree genetic diversity in human-modified tropical landscapes. Furthermore, we demonstrate that genotyping-by-sequencing can be a useful tool in restoration genetics.
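The diversity parameters reported in this abstract (HO, HE and FIS) can be illustrated for a single biallelic SNP with a minimal Python sketch. It assumes the textbook definitions: observed heterozygosity as the fraction of heterozygous individuals, expected heterozygosity as 1 - Σ pᵢ² from allele frequencies, and FIS = 1 - HO/HE. The authors' estimates came from the Stacks pipeline and may use different estimators, such as sample-size corrections or averaging across loci. The genotype data here are invented.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Ho, He, Fis) for one locus from diploid genotypes.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under Hardy-Weinberg from pooled allele frequencies.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    he = 1 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    # Inbreeding coefficient; undefined when the locus is monomorphic.
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Hypothetical genotypes for four individuals at one SNP.
ho, he, fis = locus_stats([("A", "A"), ("A", "G"), ("G", "G"), ("A", "A")])
print(round(ho, 3), round(he, 3), round(fis, 3))  # 0.25 0.469 0.467
```

A positive FIS, as in the abstract, indicates fewer heterozygotes than expected under random mating.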
C/N Ratio Drives Soil Actinobacterial Cellobiohydrolase Gene Diversity

PubMed Central

Prendergast-Miller, Miranda T.; Poonpatana, Pabhon; Farrell, Mark; Bissett, Andrew; Macdonald, Lynne M.; Toscas, Peter; Richardson, Alan E.; Thrall, Peter H.

2015-01-01

Cellulose accounts for approximately half of photosynthesis-fixed carbon; however, the ecology of its degradation in soil is still relatively poorly understood. The role of actinobacteria in cellulose degradation has not been extensively investigated despite their abundance in soil and known cellulose degradation capability. Here, the diversity and abundance of the actinobacterial glycoside hydrolase family 48 (cellobiohydrolase) gene in soils from three paired pasture-woodland sites were determined by using terminal restriction fragment length polymorphism (T-RFLP) analysis and clone libraries with gene-specific primers. For comparison, the diversity and abundance of general bacteria and fungi were also assessed. Phylogenetic analysis of the nucleotide sequences of 80 clones revealed significant new diversity of actinobacterial GH48 genes, and analysis of translated protein sequences showed that these enzymes are likely to represent functional cellobiohydrolases. The soil C/N ratio was the primary environmental driver of GH48 community compositions across sites and land uses, demonstrating the importance of substrate quality in their ecology. Furthermore, mid-infrared (MIR) spectrometry-predicted humic organic carbon was distinctly more important to GH48 diversity than to total bacterial and fungal diversity. This suggests a link between the actinobacterial GH48 community and soil organic carbon dynamics and highlights the potential importance of actinobacteria in the terrestrial carbon cycle. PMID:25710367
Application of RAD Sequencing for Evaluating the Genetic Diversity of Domesticated Panax notoginseng (Araliaceae)

PubMed Central

Pan, Yuezhi; Wang, Xueqin; Sun, Guiling; Li, Fusheng; Gong, Xun

2016-01-01

Panax notoginseng, a traditional Chinese medicinal plant, has been cultivated and domesticated for approximately 400 years, mainly in Yunnan and Guangxi, two provinces in southwest China. This species was named according to cultivated rather than wild individuals, and no wild populations had been found until now. The genetic resources available on farms are important for both breeding practices and resource conservation. In the present study, the recently developed technology RADseq, which is based on next-generation sequencing, was used to analyze the genetic variation and differentiation of P. notoginseng. The nucleotide diversity and heterozygosity results indicated that P. notoginseng had low genetic diversity at both the species and population levels. Almost no genetic differentiation has been detected, and all populations were genetically similar due to strong gene flow and insufficient splitting time. Although the genetic diversity of P. notoginseng was low at both species and population levels, several traditional plantations had relatively high genetic diversity, as revealed by the He and π values and by the private allele numbers. These valuable genetic resources should be protected as soon as possible to facilitate future breeding projects. The possible geographical origin of Sanqi domestication was discussed based on the results of the genetic diversity analysis. PMID:27846268
Comparative genomic analysis of the Lipase3 gene family in five plant species reveals distinct evolutionary origins.

PubMed

Wang, Dan; Zhang, Lin; Hu, JunFeng; Gao, Dianshuai; Liu, Xin; Sha, Yan

2018-04-01

Lipases are physiologically important and ubiquitous enzymes that share a conserved domain and are classified into eight different families based on their amino acid sequences and fundamental biological properties. The Lipase3 family of lipases was reported to possess a canonical fold typical of α/β hydrolases and a typical catalytic triad, suggesting a distinct evolutionary origin for this family. Genes in the Lipase3 family do not have the same functions, but maintain the conserved Lipase3 domain. There have been extensive studies of Lipase3 structures and functions, but little is known about their evolutionary histories. In this study, all lipases within five plant species were identified, and their phylogenetic relationships and genetic properties were analyzed and used to group them into distinct evolutionary families. Each identified lipase family contained at least one dicot and one monocot Lipase3 protein, indicating that the gene family was established before the split of dicots and monocots. Similar intron/exon numbers and predicted protein sequence lengths were found within individual groups. Twenty-four tandem Lipase3 gene duplications were identified, implying that the distinctive function of Lipase3 genes appears to be a consequence of translocation and neofunctionalization after gene duplication. The functional genes EDS1, PAD4, and SAG101 that are reportedly involved in pathogen response were all located in the same group. The nucleotide diversity (Dxy) and the ratio of nonsynonymous to synonymous nucleotide substitution rates (Ka/Ks) of the three genes were significantly greater than the average across the genomes. We further observed evidence for selection maintaining diversity on three genes in the Toll-Interleukin-1 receptor type of nucleotide binding/leucine-rich repeat immune receptor (TIR-NBS LRR) immunity-response signaling pathway, indicating that they could be vulnerable to pathogen effectors.
Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms.

PubMed

Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro

2010-04-27

To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7× the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions of decreased genetic diversity, which might be caused by recent human selection in rice breeding. The definition of pedigree haplotypes by means of genome-wide SNPs will facilitate next-generation breeding of rice and other crops.
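The coverage and SNP figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "15.7×" means total sequence divided by genome size and that the 67,051 SNPs are spread evenly over the 80.1% of the genome that was covered. Neither assumption is stated in the abstract.

```python
# Figures quoted in the abstract above.
TOTAL_SEQUENCE_BP = 5.89e9   # total Koshihikari sequence (5.89 Gb)
FOLD_COVERAGE = 15.7         # stated depth relative to the rice genome
COVERED_FRACTION = 0.801     # share of the Nipponbare sequence covered
SNP_COUNT = 67_051           # SNPs identified between the two cultivars


def implied_genome_size_bp(total_bp: float, fold: float) -> float:
    """Genome size implied by total sequence / fold coverage (assumed definition)."""
    return total_bp / fold


def mean_snp_spacing_bp(genome_bp: float, covered_fraction: float, snps: int) -> float:
    """Average distance between SNPs, assuming an even spread over covered sequence."""
    return genome_bp * covered_fraction / snps


genome_bp = implied_genome_size_bp(TOTAL_SEQUENCE_BP, FOLD_COVERAGE)
spacing_bp = mean_snp_spacing_bp(genome_bp, COVERED_FRACTION, SNP_COUNT)

print(f"implied genome size: {genome_bp / 1e6:.1f} Mb")
print(f"mean SNP spacing:    {spacing_bp / 1e3:.2f} kb")
```

Under these assumptions the implied genome size is roughly 375 Mb and the two cultivars differ at about one SNP every 4 to 5 kb, which is consistent with the abstract's point that closely related cultivars need large SNP sets to be told apart.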
Extensive Within-Host Diversity in Fecally Carried Extended-Spectrum-Beta-Lactamase-Producing Escherichia coli Isolates: Implications for Transmission Analyses.

PubMed

Stoesser, N; Sheppard, A E; Moore, C E; Golubchik, T; Parry, C M; Nget, P; Saroeun, M; Day, N P J; Giess, A; Johnson, J R; Peto, T E A; Crook, D W; Walker, A S

2015-07-01

Studies of the transmission epidemiology of antimicrobial-resistant Escherichia coli, such as strains harboring extended-spectrum beta-lactamase (ESBL) genes, frequently use selective culture of rectal surveillance swabs to identify isolates for molecular epidemiological investigation. Typically, only single colonies are evaluated, which risks underestimating species diversity and transmission events. We sequenced the genomes of 16 E. coli colonies from each of eight fecal samples (n = 127 genomes; one failure), taken from different individuals in Cambodia, a region of high ESBL-producing E. coli prevalence. Sequence data were used to characterize both the core chromosomal diversity of E. coli isolates and their resistance/virulence gene content as a proxy measure of accessory genome diversity. The 127 E. coli genomes represented 31 distinct sequence types (STs). Seven (88%) of eight subjects carried ESBL-positive isolates, all containing blaCTX-M variants. Diversity was substantial, with a median of four STs/individual (range, 1 to 10) and wide genetic divergence at the nucleotide level within some STs. In 2/8 (25%) individuals, the same blaCTX-M variant occurred in different clones, and/or different blaCTX-M variants occurred in the same clone. Patterns of other resistance genes and common virulence factors, representing differences in the accessory genome, were also diverse within and between clones. The substantial diversity among intestinally carried ESBL-positive E. coli bacteria suggests that fecal surveillance, particularly if based on single-colony subcultures, will likely underestimate transmission events, especially in high-prevalence settings. Copyright © 2015, Stoesser et al.
Characterization of apple stem grooving virus and apple chlorotic leaf spot virus identified in a crab apple tree.

PubMed

Li, Yongqiang; Deng, Congliang; Bian, Yong; Zhao, Xiaoli; Zhou, Qi

2017-04-01

Apple stem grooving virus (ASGV), apple chlorotic leaf spot virus (ACLSV), and prunus necrotic ringspot virus (PNRSV) were identified in a crab apple tree by small RNA deep sequencing. The complete genome sequence of ACLSV isolate BJ (ACLSV-BJ) was 7554 nucleotides and shared 67.0%-83.0% nucleotide sequence identity with other ACLSV isolates. A phylogenetic tree based on the complete genome sequence of all available ACLSV isolates showed that ACLSV-BJ clustered with the isolates SY01 from hawthorn, MO5 from apple, and JB, KMS and YH from pear. The complete nucleotide sequence of ASGV-BJ was 6509 nucleotides (nt) long and shared 78.2%-80.7% nucleotide sequence identity with other isolates. ASGV-BJ and the isolate ASGV_kfp clustered together in the phylogenetic tree as an independent clade. Recombination analysis showed that isolate ASGV-BJ was a naturally occurring recombinant.
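The percent nucleotide identity values reported above are pairwise comparisons of aligned sequences. As a minimal sketch, assuming the two sequences are already aligned to equal length and that gap columns are excluded from the comparison (conventions differ between tools, and the abstract does not say which was used), the calculation looks like this. The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gap columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (non-gap) positions")
    return 100.0 * matches / compared


# Toy aligned sequences, not taken from any isolate in the study.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```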
Trading genes along the silk road: mtDNA sequences and the origin of central Asian populations.

PubMed Central

Comas, D; Calafell, F; Mateu, E; Pérez-Lezaun, A; Bosch, E; Martínez-Arias, R; Clarimon, J; Facchini, F; Fiori, G; Luiselli, D; Pettener, D; Bertranpetit, J

1998-01-01

Central Asia is a vast region at the crossroads of different habitats, cultures, and trade routes. Little is known about the genetics and the history of the population of this region. We present the analysis of mtDNA control-region sequences in samples of the Kazakh, the Uighurs, the lowland Kirghiz, and the highland Kirghiz, which we have used to address both the population history of the region and the possible selective pressures that high altitude has on mtDNA genes. Central Asian mtDNA sequences present features intermediate between European and eastern Asian sequences in several parameters, such as the frequencies of certain nucleotides, the levels of nucleotide diversity, mean pairwise differences, and genetic distances. Several hypotheses could explain the intermediate position of central Asia between Europe and eastern Asia, but the most plausible would involve extensive levels of admixture between Europeans and eastern Asians in central Asia, possibly enhanced during the Silk Road trade and clearly after the eastern and western Eurasian human groups had diverged. Lowland and highland Kirghiz mtDNA sequences are very similar, and the analysis of molecular variance has revealed that the fraction of mitochondrial genetic variance due to altitude is not significantly different from zero. Thus, it seems unlikely that altitude has exerted a major selective pressure on mitochondrial genes in central Asian populations. PMID:9837835
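Two of the parameters named in this abstract, mean pairwise differences and nucleotide diversity, have simple textbook definitions: the average number of differing sites over all pairs of sequences, and that average divided by the sequence length. The sketch below assumes gap-free aligned sequences of equal length and gives every sequence equal weight. It is not the authors' software, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def pairwise_differences(seq_a: str, seq_b: str) -> int:
    """Number of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def mean_pairwise_differences(seqs: list[str]) -> float:
    """Average number of differences over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise differences per site (often written as pi)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Toy alignment of three 8-bp sequences, not real mtDNA data.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(mean_pairwise_differences(toy))
print(nucleotide_diversity(toy))
```

For the toy alignment the three pairs differ at 1, 1, and 2 sites, so the mean pairwise difference is 4/3 and the per-site diversity is 1/6.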
Isolation and Genomic Characterization of a Duck-Origin GPV-Related Parvovirus from Cherry Valley Ducklings in China.

PubMed

Chen, Hao; Dou, Yanguo; Tang, Yi; Zhang, Zhenjie; Zheng, Xiaoqiang; Niu, Xiaoyu; Yang, Jing; Yu, Xianglong; Diao, Youxiang

2015-01-01

A newly emerged duck parvovirus, which causes beak atrophy and dwarfism syndrome (BADS) in Cherry Valley ducks, has appeared in Northern China since March 2015. To explore the genetic diversity among waterfowl parvovirus isolates, the complete genome of an identified isolate designated SDLC01 was sequenced and analyzed in the present study. Genomic sequence analysis showed that SDLC01 shared 90.8%-94.6% of nucleotide identity with goose parvovirus (GPV) isolates and 78.6%-81.6% of nucleotide identity with classical Muscovy duck parvovirus (MDPV) isolates. Phylogenetic analysis of 443 nucleotides (nt) of the fragment A showed that SDLC01 was highly similar to a mule duck isolate (strain D146/02) and close to European GPV isolates but separate from Asian GPV isolates. Analysis of the left inverted terminal repeat regions revealed that SDLC01 had two major segments deleted between positions 160-176 and 306-322 nt compared with field GPV and MDPV isolates. Phylogenetic analysis of Rep and VP1 encoded by two major open reading frames of parvoviruses revealed that SDLC01 was distinct from all GPV and MDPV isolates. The viral pathogenicity and genome characterization of SDLC01 suggest that the novel GPV (N-GPV) is the causative agent of BADS and belongs to a distinct GPV-related subgroup. Furthermore, N-GPV sequences were detected in diseased ducks by polymerase chain reaction and viral proliferation was demonstrated in duck embryos and duck embryo fibroblast cells.

Genetic characterization of infectious hematopoietic necrosis virus of coastal salmonid stocks in Washington State

USGS Publications Warehouse

Emmenegger, E.J.; Kurath, G.

2002-01-01

Infectious hematopoietic necrosis virus (IHNV) is a pathogen that infects many Pacific salmonid stocks from the watersheds of North America. Previous studies have thoroughly characterized the genetic diversity of IHNV isolates from Alaska and the Hagerman Valley in Idaho. To enhance understanding of the evolution and viral transmission patterns of IHNV within the Pacific Northwest geographic range, we analyzed the G gene of IHNV isolates from the coastal watersheds of Washington State by ribonuclease protection assay (RPA) and nucleotide sequencing. The RPA analysis of 23 isolates indicated that the Skagit basin IHNV isolates were relatively homogeneous as a result of the dominance of one G gene haplotype (S). Sequence analysis of 303 bases in the middle of the G gene (midG region) of 61 isolates confirmed the high frequency of a Skagit River basin sequence and identified another sequence commonly found in isolates from the Lake Washington basin. Overall, both the RPA and sequence analysis showed that the Washington coastal IHNV isolates are genetically homogeneous and have little genetic diversity. This is similar to the genetic diversity pattern of IHNV from Alaska and contrasts sharply with the high genetic diversity demonstrated for IHNV isolates from fish farms along the Snake River in Idaho. The high degree of sequence and haplotype similarity between the Washington coastal IHNV isolates and those from Alaska and British Columbia suggests that they have a common viral ancestor. Phylogenetic analyses of the isolates we studied and those from different regions throughout the virus's geographic range confirm a conserved pattern of evolution of the virus in salmonid stocks north of the Columbia River, which forms Washington's southern border.
Composition for nucleic acid sequencing

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2008-08-26

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-06-06

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-05-30

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Determination of the genetic diversity of vegetable soybean [Glycine max (L.) Merr.] using EST-SSR markers

PubMed Central

Zhang, Gu-wen; Xu, Sheng-chun; Mao, Wei-hua; Hu, Qi-zan; Gong, Ya-ming

2013-01-01

The development of expressed sequence tag-derived simple sequence repeats (EST-SSRs) provided a useful tool for investigating plant genetic diversity. In the present study, 22 polymorphic EST-SSRs from grain soybean were identified and used to assess the genetic diversity in 48 vegetable soybean accessions. Among the 22 EST-SSR loci, tri-nucleotides were the most abundant repeats, accounting for 50.00% of the total motifs. GAA was the most common motif among tri-nucleotide repeats, with a frequency of 18.18%. Polymorphic analysis identified a total of 71 alleles, with an average of 3.23 per locus. The polymorphism information content (PIC) values ranged from 0.144 to 0.630, with a mean of 0.386. Observed heterozygosity (Ho) values varied from 0.0196 to 1.0000, with an average of 0.6092, while the expected heterozygosity (He) values ranged from 0.1502 to 0.6840, with a mean value of 0.4616. Principal coordinate analysis and phylogenetic tree analysis indicated that the accessions could be assigned to different groups based to a large extent on their geographic distribution, and most accessions from China were clustered into the same groups. These results suggest that Chinese vegetable soybean accessions have a narrow genetic base. The results of this study indicate that EST-SSRs from grain soybean have high transferability to vegetable soybean, and that these new markers would be helpful in taxonomy, molecular breeding, and comparative mapping studies of vegetable soybean in the future. PMID:23549845
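Expected heterozygosity and PIC are both functions of the allele frequencies at a locus. The sketch below uses the standard textbook forms, He = 1 - Σp² and the Botstein-style PIC = 1 - Σp² - Σ 2p²q² over allele pairs. The abstract does not say which estimator or software the authors used (for example, whether a small-sample correction was applied), so this is an illustration under those assumptions, with hypothetical function names.

```python
from itertools import combinations


def _check_frequencies(freqs: list[float]) -> None:
    """Allele frequencies at one locus should be non-negative and sum to 1."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must be non-negative and sum to 1")


def expected_heterozygosity(freqs: list[float]) -> float:
    """He = 1 - sum(p_i^2), with no small-sample correction (assumed form)."""
    _check_frequencies(freqs)
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs: list[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum over allele pairs i<j of 2 * p_i^2 * p_j^2."""
    _check_frequencies(freqs)
    homozygosity = sum(p * p for p in freqs)
    pair_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


# A locus with two equally frequent alleles (toy values, not study data).
two_alleles = [0.5, 0.5]
print(expected_heterozygosity(two_alleles))
print(polymorphism_information_content(two_alleles))
```

For two equally frequent alleles this gives He = 0.5 and PIC = 0.375. PIC is never larger than He for the same locus, which matches the ordering of the mean values reported in the abstract.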
Sequence diversity and molecular evolutionary rates between buffalo and cattle.

PubMed

Moaeen-ud-Din, M; Bilal, G

2015-02-01

Identification of genes important for production traits in buffalo is impaired by a paucity of genomic resources. One way to fill this gap is to exploit data available for cattle. Cross-species application of comparative genomics tools is a potential means of investigating the buffalo genome; however, it depends on nucleotide sequence similarity. In this study, gene diversity between buffalo and cattle was determined using 86 gene orthologues. There was approximately 3% difference in all genes in terms of nucleotide diversity and 0.267 ± 0.134 in amino acids, indicating the possibility of successfully using cross-species strategies for genomic studies. There were significantly higher non-synonymous substitutions in both cattle and buffalo; however, the difference in terms of dN-dS was similar (4.414 versus 4.745) in buffalo and cattle, respectively. A higher rate of non-synonymous substitutions at a similar level in buffalo and cattle indicated a similar positive selection pressure. Results of the relative rate test were assessed with the chi-squared test. There was no significant difference in unique mutations between cattle and buffalo lineages at synonymous sites. However, there was a significant difference in unique mutations at non-synonymous sites, indicating an ongoing mutagenic process that generates substitutional mutations at approximately the same rate at silent sites. Moreover, despite common ancestry, our results indicate different divergence times among genes of cattle and buffalo. This is the first demonstration that variable rates of molecular evolution may be present within the family Bovidae. © 2014 Blackwell Verlag GmbH.
Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation.

PubMed

Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D

2012-10-05

Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Population Genetic Inference from Personal Genome Data: Impact of Ancestry and Admixture on Human Genomic Variation

PubMed Central

Kidd, Jeffrey M.; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D.; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F.; Peckham, Heather E.; Omberg, Larsson; Bormann Chung, Christina A.; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G.; Russell, Archie; Reynolds, Andy; Clark, Andrew G.; Reese, Martin G.; Lincoln, Stephen E.; Butte, Atul J.; De La Vega, Francisco M.; Bustamante, Carlos D.

2012-01-01

Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas—70% of the European ancestry in today’s African Americans dates back to European gene flow happening only 7–8 generations ago. PMID:23040495
Genome-Wide Association Studies of 11 Agronomic Traits in Cassava (Manihot esculenta Crantz)

PubMed Central

Zhang, Shengkui; Chen, Xin; Lu, Cheng; Ye, Jianqiu; Zou, Meiling; Lu, Kundian; Feng, Subin; Pei, Jinli; Liu, Chen; Zhou, Xincheng; Ma, Ping’an; Li, Zhaogui; Liu, Cuijuan; Liao, Qi; Xia, Zhiqiang; Wang, Wenquan

2018-01-01

Cassava (Manihot esculenta Crantz) is a major tuberous crop produced worldwide. In this study, we sequenced 158 diverse cassava varieties and identified 349,827 single-nucleotide polymorphisms (SNPs) and indels. In each chromosome, the number of SNPs and the physical length of the respective chromosome were in agreement. Population structure analysis indicated that this panel can be divided into three subgroups. Genetic diversity analysis indicated that the average nucleotide diversity of the panel was 1.21 × 10⁻⁴ for all sampled landraces. This average nucleotide diversity was 1.97 × 10⁻⁴, 1.01 × 10⁻⁴, and 1.89 × 10⁻⁴ for subgroups 1, 2, and 3, respectively. Genome-wide linkage disequilibrium (LD) analysis demonstrated that the average LD was about 8 kb. We evaluated 158 cassava varieties under 11 different environments. Finally, we identified 36 loci that were related to 11 agronomic traits by genome-wide association analyses. Four loci were associated with two traits, and 62 candidate genes were identified in the peak SNP sites. We found that 40 of these genes showed different expression profiles in different tissues. Of the candidate genes related to storage roots, Manes.13G023300, Manes.16G000800, Manes.02G154700, Manes.02G192500, and Manes.09G099100 had higher expression levels in storage roots than in leaf and stem; on the other hand, of the candidate genes related to leaves, Manes.05G164500, Manes.05G164600, Manes.04G057300, Manes.01G202000, and Manes.03G186500 had higher expression levels in leaves than in storage roots and stem. This study provides a basis for research on genetics and the genetic improvement of cassava. PMID:29725343
Genome-Wide Association Studies of 11 Agronomic Traits in Cassava (Manihot esculenta Crantz).

PubMed

Zhang, Shengkui; Chen, Xin; Lu, Cheng; Ye, Jianqiu; Zou, Meiling; Lu, Kundian; Feng, Subin; Pei, Jinli; Liu, Chen; Zhou, Xincheng; Ma, Ping'an; Li, Zhaogui; Liu, Cuijuan; Liao, Qi; Xia, Zhiqiang; Wang, Wenquan

2018-01-01

Cassava (Manihot esculenta Crantz) is a major tuberous crop produced worldwide. In this study, we sequenced 158 diverse cassava varieties and identified 349,827 single-nucleotide polymorphisms (SNPs) and indels. In each chromosome, the number of SNPs and the physical length of the respective chromosome were in agreement. Population structure analysis indicated that this panel can be divided into three subgroups. Genetic diversity analysis indicated that the average nucleotide diversity of the panel was 1.21 × 10⁻⁴ for all sampled landraces. This average nucleotide diversity was 1.97 × 10⁻⁴, 1.01 × 10⁻⁴, and 1.89 × 10⁻⁴ for subgroups 1, 2, and 3, respectively. Genome-wide linkage disequilibrium (LD) analysis demonstrated that the average LD was about 8 kb. We evaluated 158 cassava varieties under 11 different environments. Finally, we identified 36 loci that were related to 11 agronomic traits by genome-wide association analyses. Four loci were associated with two traits, and 62 candidate genes were identified in the peak SNP sites. We found that 40 of these genes showed different expression profiles in different tissues. Of the candidate genes related to storage roots, Manes.13G023300, Manes.16G000800, Manes.02G154700, Manes.02G192500, and Manes.09G099100 had higher expression levels in storage roots than in leaf and stem; on the other hand, of the candidate genes related to leaves, Manes.05G164500, Manes.05G164600, Manes.04G057300, Manes.01G202000, and Manes.03G186500 had higher expression levels in leaves than in storage roots and stem. This study provides a basis for research on genetics and the genetic improvement of cassava.
Genetic diversity in the 3'-terminal region of papaya ringspot virus (PRSV-W) isolates from watermelon in Oklahoma.

PubMed

Abdalla, Osama A; Ali, Akhtar

2012-03-01

The 3'-terminal region (1191 nt) containing part of the NIb gene, complete coat protein (CP) and poly-A tail of 64 papaya ringspot virus (PRSV-W) isolates collected during 2008-2009 from watermelon in commercial fields of four different counties of Oklahoma were cloned and sequenced. Nucleotide and amino acid sequence identities ranged from 95.2-100% and 97.1-100%, respectively, among the Oklahoman PRSV-W isolates. Phylogenetic analysis showed that PRSV-W isolates clustered according to the locations where they were collected within Oklahoma, and each cluster contained two subgroups. All subgroups of Oklahoman PRSV-W isolates were on separate branches when compared to 35 known isolates originating from other parts of the world, including the one reported previously from the USA. This study helps in our understanding about the genetic diversity of PRSV-W isolates infecting cucurbits in Oklahoma.
Higher-level phylogeny of paraneopteran insects inferred from mitochondrial genome sequences

PubMed Central

Li, Hu; Shao, Renfu; Song, Nan; Song, Fan; Jiang, Pei; Li, Zhihong; Cai, Wanzhi

2015-01-01

Mitochondrial (mt) genome data have been proven to be informative for animal phylogenetic studies but may also suffer from systematic errors, due to the effects of accelerated substitution rate and compositional heterogeneity. We analyzed the mt genomes of 25 insect species from the four paraneopteran orders, aiming to better understand how accelerated substitution rate and compositional heterogeneity affect the inferences of the higher-level phylogeny of this diverse group of hemimetabolous insects. We found substantial heterogeneity in base composition and contrasting rates in nucleotide substitution among these paraneopteran insects, which complicate the inference of higher-level phylogeny. The phylogenies inferred with concatenated sequences of mt genes using maximum likelihood and Bayesian methods and homogeneous models failed to recover Psocodea and Hemiptera as monophyletic groups but grouped, instead, the taxa that had accelerated substitution rates together, including Sternorrhyncha (a suborder of Hemiptera), Thysanoptera, Phthiraptera and Liposcelididae (a family of Psocoptera). Bayesian inference with nucleotide sequences and heterogeneous models (CAT and CAT + GTR), however, recovered Psocodea, Thysanoptera and Hemiptera each as a monophyletic group. Within Psocodea, Liposcelididae is more closely related to Phthiraptera than to other species of Psocoptera. Furthermore, Thysanoptera was recovered as the sister group to Hemiptera. PMID:25704094
Common and diverse features of cocirculating type 2 and 3 recombinant vaccine-derived polioviruses isolated from patients with poliomyelitis and healthy children.

PubMed

Joffret, Marie-Line; Jégouic, Sophie; Bessaud, Maël; Balanant, Jean; Tran, Coralie; Caro, Valerie; Holmblat, Barbara; Razafindratsimandresy, Richter; Reynes, Jean-Marc; Rakoto-Andrianarivelo, Mala; Delpeyroux, Francis

2012-05-01

Five cases of poliomyelitis due to type 2 or 3 recombinant vaccine-derived polioviruses (VDPVs) were reported in the Toliara province of Madagascar in 2005. We sequenced the genome of the VDPVs isolated from the patients and from 12 healthy children and characterized phenotypic aspects, including pathogenicity, in mice transgenic for the poliovirus receptor. We identified 6 highly complex mosaic recombinant lineages composed of sequences derived from different vaccine polioviruses and other species C human enteroviruses (HEV-Cs). Most had some recombinant genome features in common and contained nucleotide sequences closely related to certain cocirculating coxsackie A virus isolates. However, they differed in terms of their recombinant characteristics or nucleotide substitutions and phenotypic features. All VDPVs were neurovirulent in mice. This study confirms the genetic relationship between type 2 and 3 VDPVs, indicating that both types can be involved in a single outbreak of disease. Our results highlight the various ways in which a vaccine-derived poliovirus may become pathogenic in complex viral ecosystems, through frequent recombination events and mutations. Intertypic recombination between cocirculating HEV-Cs (including polioviruses) appears to be a common mechanism of genetic plasticity underlying transverse genetic variability.
Labeled nucleotide phosphate (NP) probes

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2009-02-03

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase.

PubMed Central

Hitzeman, R A; Hagie, F E; Hayflick, J S; Chen, C Y; Seeburg, P H; Derynck, R

1982-01-01

The DNA sequence of the gene for the yeast glycolytic enzyme, 3-phosphoglycerate kinase (PGK), has been obtained by sequencing part of a 3.1 kbp HindIII fragment obtained from the yeast genome. The structural gene sequence corresponds to a reading frame of 1251 bp coding for 416 amino acids with no intervening DNA sequences. The amino acid sequence is approximately 65 percent homologous with human and horse PGK protein sequences and is in general agreement with the published protein sequence for yeast PGK. As for other highly expressed structural genes in yeast, the coding sequence is highly codon biased with 95 percent of the amino acids coded for by a select 25 codons (out of 61 possible). Besides structural DNA sequence, 291 bp of 5'-flanking sequence and 286 bp of 3'-flanking sequence were determined. Transcription starts 36 nucleotides upstream from the translational start and stops 86-93 nucleotides downstream from the translational stop. These results suggest a non-polyadenylated mRNA length of 1373 to 1380 nucleotides, which is consistent with the observed length of 1500 nucleotides for polyadenylated PGK mRNA. A sequence TATATATAAA is found at 145 nucleotides upstream from the translational start. This sequence resembles the TATAAA box that is possibly associated with RNA polymerase II binding. PMID:6296791
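The length arithmetic in this abstract can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the 1251 bp reading frame includes a single stop codon, which is what makes it agree with the stated 416 amino acids.

```python
# Figures taken from the abstract; function names are illustrative only.
CODING_BP = 1251            # reading frame length (assumed to include the stop codon)
UPSTREAM_START = 36         # transcription start, nt upstream of the translational start
DOWNSTREAM_STOP = (86, 93)  # transcription stop, nt downstream of the translational stop


def mrna_length_range(coding_bp, upstream, downstream_range):
    """Return (min, max) length of the non-polyadenylated transcript."""
    low, high = downstream_range
    return (upstream + coding_bp + low, upstream + coding_bp + high)


def encoded_amino_acids(coding_bp):
    """Number of residues, assuming the frame ends in one stop codon."""
    return coding_bp // 3 - 1


print(mrna_length_range(CODING_BP, UPSTREAM_START, DOWNSTREAM_STOP))  # (1373, 1380)
print(encoded_amino_acids(CODING_BP))                                 # 416
```

Both results match the values reported in the abstract.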
E6 and E7 Gene Polymorphisms in Human Papillomavirus Types-58 and 33 Identified in Southwest China

PubMed Central

Wen, Qiang; Wang, Tao; Mu, Xuemei; Chenzhang, Yuwei; Cao, Man

2017-01-01

Cancer of the cervix is associated with infection by certain types of human papillomavirus (HPV). The gene variants differ in immune responses and oncogenic potential. The E6 and E7 proteins encoded by high-risk HPV play a key role in cellular transformation. HPV-33 and HPV-58 types are highly prevalent among Chinese women. To study the intratypic variation, polymorphisms and positive selection of HPV-33 and HPV-58 E6/E7 in southwest China, HPV-33 (E6, E7: n = 216) and HPV-58 (E6, E7: n = 405) E6 and E7 genes were sequenced and compared to others submitted to GenBank. Phylogenetic trees were constructed by the maximum-likelihood and Kimura 2-parameter methods in MEGA 6 (Molecular Evolutionary Genetics Analysis version 6.0). The diversity of secondary structure was analyzed by PSIPred software. The selection pressures acting on the E6/E7 genes were estimated by PAML 4.8 (Phylogenetic Analyses by Maximum Likelihood version 4.8) software. The positive sites of HPV-33 and HPV-58 E6/E7 were contrasted by ClustalX 2.1. Among 216 HPV-33 E6 sequences, 8 single nucleotide mutations were observed with 6/8 non-synonymous and 2/8 synonymous mutations. The 216 HPV-33 E7 sequences showed 3 single nucleotide mutations that were non-synonymous. The 405 HPV-58 E6 sequences revealed 8 single nucleotide mutations with 4/8 non-synonymous and 4/8 synonymous mutations. Among 405 HPV-58 E7 sequences, 13 single nucleotide mutations were observed with 10/13 non-synonymous mutations and 3/13 synonymous mutations. The selective pressure analysis showed that all HPV-33 and 4/6 HPV-58 E6/E7 major non-synonymous mutations were sites of positive selection. All variations were observed in sites belonging to major histocompatibility complex and/or B-cell predicted epitopes. K93N and R145 (I/N) were observed in both HPV-33 and HPV-58 E6. PMID:28141822
37 CFR 5.31-5.33 - [Reserved]

Code of Federal Regulations, 2011 CFR

2011-07-01

... from abandonment 1.135 Amino Acid Sequences. (See Nucleotide and/or Amino Acid Sequences) Appeal to... Appeals and Interference 41.47 Of rejection of an application 1.104(a) Nucleotide and/or Amino Acid...) Symbols for nucleotide and/or amino acid sequence data 1.822 T Tables in patent applications 1.58 Terminal...
Genetic structure of Plasmodium vivax using the merozoite surface protein 1 icb5-6 fragment reveals new hybrid haplotypes in southern Mexico

PubMed Central

2014-01-01

Background Plasmodium vivax is a protozoan parasite with an extensive worldwide distribution, being highly prevalent in Asia as well as in Mesoamerica and South America. In southern Mexico, P. vivax transmission has been endemic and recent studies suggest that these parasites have unique biological and genetic features. The msp1 gene has shown a high rate of nucleotide substitutions, deletions and insertions, and its mosaic structure reveals frequent recombination events, possibly between highly divergent parasite isolates. Methods The nucleotide sequence variation in the polymorphic icb5-6 fragment of the msp1 gene of Mexican and worldwide isolates was analysed. To understand how genotype diversity arises, disperses and persists in Mexico, the genetic structure and genealogical relationships of local isolates were examined. To identify new sequence hybrids and their evolutionary relationships with other P. vivax isolates circulating worldwide, two haplotype networks were constructed to test whether two portions of icb5-6 have different evolutionary histories. Results Twelve new msp1 icb5-6 haplotypes of P. vivax from Mexico were identified. These nucleotide sequences show a mosaic structure comprising three partially conserved and two variable subfragments, resulting in five different sequence types. The variable subfragment sV1 has undergone recombination events that resulted in hybrid sequences, and the haplotype network allocated the Mexican haplotypes to three lineages, corresponding to the Sal I and Belem types and another, more divergent group. In contrast, the network from the icb5-6 fragment without sV1 revealed that the Mexican haplotypes belong to two separate lineages, none of which are closely related to Sal I or Belem sequences. Conclusions These results suggest that the new hybrid haplotypes from southern Mexico were the result of at least three different recombination events. These rearrangements likely resulted from the recombination between haplotypes of highly divergent lineages that are frequently distributed in South America and Asia and diversified rapidly. PMID:24472213
Sequence of a second gene encoding bovine submaxillary mucin: implication for mucin heterogeneity and cloning.

PubMed

Jiang, W; Woitach, J T; Gupta, D; Bhavanandan, V P

1998-10-20

Secreted epithelial mucins are extremely large and heterogeneous glycoproteins. We report the 5 kilobase DNA sequence of a second gene, BSM2, which encodes bovine submaxillary mucin. The determined nucleotide and deduced amino acid sequences of BSM2 are 95.2% and 92.2% identical, respectively, to those of the previously described BSM1 gene isolated from the same cow. Further, the five predicted protein domains of the two genes are 100%, 94%, 93%, 77%, and 88% identical. Based on the above results, we propose that expression of multiple homologous core proteins from a single animal is a factor in generating diversity of saccharides in mucins and in providing resistance of the molecules to proteolysis. In addition, this work raises several important issues in mucin cloning such as assembling sequences from seemingly overlapping clones and deducing consensus sequences for nearly identical tandem repeats. Copyright 1998 Academic Press.
WEB-server for search of a periodicity in amino acid and nucleotide sequences

NASA Astrophysics Data System (ADS)

Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.

2017-12-01

A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server is based on a new mathematical method of searching for multiple alignments, which relies on optimization of position weight matrices and on two-dimensional dynamic programming. This approach allows multiple alignments to be constructed for weakly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without pairwise comparison of the sequences. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as the information that can be obtained using the web server.

Nucleotide sequence and genetic organization of barley stripe mosaic virus RNA gamma.

PubMed

Gustafson, G; Hunter, B; Hanau, R; Armour, S L; Jackson, A O

1987-06-01

The complete nucleotide sequences of RNA gamma from the Type and ND18 strains of barley stripe mosaic virus (BSMV) have been determined. The sequences are 3164 (Type) and 2791 (ND18) nucleotides in length. Both sequences contain a 5'-noncoding region (87 or 88 nucleotides) which is followed by a long open reading frame (ORF1). A 42-nucleotide intercistronic region separates ORF1 from a second, shorter open reading frame (ORF2) located near the 3'-end of the RNA. There is a high degree of homology between the Type and ND18 strains in the nucleotide sequence of ORF1. However, the Type strain contains a 366 nucleotide direct tandem repeat within ORF1 which is absent in the ND18 strain. Consequently, the predicted translation product of Type RNA gamma ORF1 (mol wt 87,312) is significantly larger than that of ND18 RNA gamma ORF1 (mol wt 74,011). The amino acid sequence of the ORF1 polypeptide contains homologies with putative RNA polymerases from other RNA viruses, suggesting that this protein may function in replication of the BSMV genome. The nucleotide sequence of RNA gamma ORF2 is nearly identical in the Type and ND18 strains. ORF2 codes for a polypeptide with a predicted molecular weight of 17,209 (Type) or 17,074 (ND18) which is known to be translated from a subgenomic (sg) RNA. The initiation point of this sgRNA has been mapped to a location 27 nucleotides upstream of the ORF2 initiation codon in the intercistronic region between ORF1 and ORF2. The sgRNA is not coterminal with the 3'-end of the genomic RNA, but instead contains heterogeneous poly(A) termini up to 150 nucleotides long (J. Stanley, R. Hanau, and A. O. Jackson, 1984, Virology 139, 375-383). In the genomic RNA gamma, ORF2 is followed by a short poly(A) tract and a 238-nucleotide tRNA-like structure.
A report on identification of sequence polymorphism in barcode region of six commercially important Cymbopogon species.

PubMed

Bishoyi, Ashok Kumar; Kavane, Aarti; Sharma, Anjali; Geetha, K A

2017-02-01

Cymbopogon is an important member of the grass family Poaceae, cultivated for essential oils which have great medicinal and industrial value. Taxonomic identification of Cymbopogon species is determined mainly by morphological markers, odour of essential oils and concentration of bioactive compounds present in the oil matrices, which are highly influenced by environment. Authenticated molecular marker based taxonomical identification is also lacking in the genus; hence an effort was made to evaluate potential DNA barcode loci in six commercially important Cymbopogon species for their individual discrimination and authentication at the species level. Four widely used DNA barcoding regions, viz. ITS 1 & ITS 2 spacers, matK, psbA-trnH and rbcL, were taken for the study. Gene sequences of the same or related genera of the concerned loci were mined from NCBI domain and primers were designed and validated for barcode loci amplification. Out of the four loci studied, sequences from matK and ITS spacer loci revealed 0.46% and 5.64% nucleotide sequence diversity, respectively, whereas the other two loci, i.e. psbA-trnH and rbcL, showed 100% sequence homology. The newly developed primers can be used for barcode loci amplification in the genus Cymbopogon. The identified single nucleotide polymorphisms from the studied sequences may be used as barcodes for the six Cymbopogon species. The information generated can also be utilized for barcode development of the genus by including more Cymbopogon species in future.
Implication of an Aldehyde Dehydrogenase Gene and a Phosphinothricin N-Acetyltransferase Gene in the Diversity of Pseudomonas cichorii Virulence

PubMed Central

Tanaka, Masayuki; Wali, Ullah Md; Nakayashiki, Hitoshi; Fukuda, Tatsuya; Mizumoto, Hiroyuki; Ohnishi, Kouhei; Kiba, Akinori; Hikichi, Yasufumi

2011-01-01

Pseudomonas cichorii harbors the hrp genes. hrp-mutants lose their virulence on eggplant but not on lettuce. A phosphinothricin N-acetyltransferase gene (pat) is located between hrpL and an aldehyde dehydrogenase gene (aldH) in the genome of P. cichorii. Comparison of nucleotide sequences and composition of the genes among pseudomonads suggests a common ancestor of hrp and pat between P. cichorii strains and P. viridiflava strains harboring the single hrp pathogenicity island. In contrast, phylogenetic diversification of aldH corresponded to species diversification amongst pseudomonads. In this study, the involvement of aldH and pat in P. cichorii virulence was analyzed. An aldH-deleted mutant (ΔaldH) and a pat-deleted mutant (Δpat) lost their virulence on eggplant but not on lettuce. P. cichorii expressed both genes in eggplant leaves, independent of HrpL, the transcriptional activator for the hrp. Inoculation into Asteraceae species susceptible to P. cichorii showed that the involvement of hrp, pat and aldH in P. cichorii virulence is independent of each other and has no relationship with the phylogeny of Asteraceae species based on the nucleotide sequences of ndhF and rbcL. It is thus thought that not only the hrp genes but also pat and aldH are implicated in the diversity of P. cichorii virulence on susceptible host plant species. PMID:24704843
Primer ID Validates Template Sampling Depth and Greatly Reduces the Error Rate of Next-Generation Sequencing of HIV-1 Genomic RNA Populations

PubMed Central

Zhou, Shuntai; Jones, Corbin; Mieczkowski, Piotr

2015-01-01

ABSTRACT Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated due to PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We have directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define the cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS. IMPORTANCE Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population, such as HIV-1 due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. 
Knowing the sampling depth allows the construction of a model of how to maximize the recovery of sequences from input templates and to reduce resampling of the Primer ID so that appropriate multiplexing can be included in the experimental design. With the defined sampling depth and measured error rate, we are able to assign cutoffs for the accurate detection of minority variants in viral populations. This approach allows the power of NGS to be realized without having to guess about sampling depth or to ignore the problem of PCR resampling, while also being able to correct most of the errors in the data set. PMID:26041299
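The idea of using the residual error rate and the Poisson distribution to set a detection cutoff can be sketched as follows. This is a minimal illustration, not the authors' published procedure: the function names, the significance level `alpha`, and the choice to treat errors at one position as Poisson with mean `n_templates * error_rate` are assumptions made here, and no multiple-testing correction across positions is applied.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the lower tail."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


def min_variant_count(n_templates, error_rate=1e-4, alpha=1e-3):
    """Smallest number of template consensus sequences carrying a variant
    that is unlikely (probability < alpha) to arise from residual error alone.

    error_rate defaults to the ~1 in 10,000 figure quoted in the abstract;
    alpha is a hypothetical threshold chosen for illustration.
    """
    lam = n_templates * error_rate  # expected number of erroneous calls
    k = 1
    while poisson_sf(k, lam) >= alpha:
        k += 1
    return k


# With 10,000 template consensus sequences the expected error count is 1.
print(min_variant_count(10_000))  # -> 6
```

Under these assumptions, a variant seen in fewer than the returned number of template consensus sequences could not be distinguished from residual error; a stricter `alpha` raises the cutoff.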
Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing.

PubMed

Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I

2018-01-01

Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.
Development of an oligonucleotide probe for Aureobasidium pullulans based on the small-subunit rRNA gene.

PubMed Central

Li, S; Cullen, D; Hjort, M; Spear, R; Andrews, J H

1996-01-01

Aureobasidium pullulans, a cosmopolitan yeast-like fungus, colonizes leaf surfaces and has potential as a biocontrol agent of pathogens. To assess the feasibility of rRNA as a target for A. pullulans-specific oligonucleotide probes, we compared the nucleotide sequences of the small-subunit rRNA (18S) genes of 12 geographically diverse A. pullulans strains. Extreme sequence conservation was observed. The consensus A. pullulans sequence was compared with other fungal sequences to identify potential probes. A 21-mer probe which hybridized to the 12 A. pullulans strains but not to 98 other fungi, including 82 isolates from the phylloplane, was identified. A 17-mer highly specific for Cladosporium herbarum was also identified. These probes have potential in monitoring and quantifying fungi in leaf surface and other microbial communities. PMID:8633850
Diversity of interferon inducible Mx gene in horses and association of variations with susceptibility vis-à-vis resistance against equine influenza infection.

PubMed

Manuja, Balvinder K; Manuja, Anju; Dahiya, Rajni; Singh, Sandeep; Sharma, R C; Gahlot, S K

2014-10-01

Equine influenza (EI) is primarily an infection of the upper respiratory tract and is one of the major infectious respiratory diseases of economic importance in equines. Re-emergence of the disease, species jumping by H3N8 virus into canines and the possible threat of a human pandemic due to the unpredictable nature of the virus have necessitated research on devising strategies for preventing the disease. The myxovirus resistance protein (Mx) has been reported to confer resistance to orthomyxovirus infection by modifying cellular functions needed along the viral replication pathway. Polymorphisms and differential antiviral activities of the Mx gene have been reported in pigs and chickens. Here we report the diversity of the Mx gene, its expression in response to stimulation with interferon (IFN) α/β and their association with EI resistance and susceptibility in Marwari horses. Blood samples were collected from horses declared positive for equine influenza and from in-contact animals with a history of no clinical signs. The Mx gene was amplified by reverse transcription from total RNA isolated from peripheral blood mononuclear cells (PBMCs) stimulated with IFN α/β using gene-specific primers. The amplified gene products from representative samples were cloned and sequenced. Nucleotide sequences and deduced amino acid sequences were analyzed. Out of a total of 24 amino acid substitutions, sorting intolerant from tolerant (SIFT) analysis predicted 13 substitutions with functional consequences. Five substitutions (V67A, W123L, E346Y, N347Y, S689N) were observed only in resistant animals. Evolutionary distances based on nucleotide sequences ranged between 0.3-2.0% within equines and 20-24% with other species. On phylogenetic analysis all equine sequences clustered together while other species formed separate clades. Copyright © 2014 Elsevier B.V. All rights reserved.
Genetic diversity and distribution of a distinct strain of Chili leaf curl virus and associated betasatellite infecting tomato and pepper in Oman.

PubMed

Khan, Akhtar J; Akhtar, Sohail; Al-Zaidi, Amal M; Singh, Achuit K; Briddon, Rob W

2013-10-01

Tomato and pepper are widely grown in Oman for local consumption. A countrywide survey was conducted during 2010-2011 to collect samples and assess the diversity of begomoviruses associated with leaf curl disease of tomato and pepper. A virus previously only identified on the Indian subcontinent, chili leaf curl virus (ChLCV), was found associated with tomato and pepper diseases in all vegetable-growing areas of Oman. Some of the infected plant samples were also found to contain a betasatellite. A total of 19 potentially full-length begomovirus and eight betasatellite clones were sequenced. The begomovirus clones showed >96% nucleotide sequence identity, showing them to represent a single species. Comparisons to sequences available in the databases showed the highest levels of nucleotide sequence identity (88.0-91.1%) to isolates of the "Pakistan" strain of ChLCV (ChLCV-PK), indicating the virus from Oman to be a distinct strain, for which the name Oman strain (ChLCV-OM) is proposed. An analysis for recombination showed ChLCV-OM likely to have originated by recombination between ChLCV-PK (the major parent), pepper leaf curl Lahore virus and a third strain of ChLCV. The betasatellite sequences obtained were shown to have high levels of identity to isolates of tomato leaf curl betasatellite (ToLCB) previously shown to be present in Oman. For the disease in tomato, Koch's postulates were satisfied by Agrobacterium-mediated inoculation of virus and betasatellite clones. This showed the symptoms induced by the virus in the presence of the betasatellite to be enhanced, although viral DNA levels were not affected. ChLCV-OM is the fourth begomovirus identified in tomato in Oman and the first in Capsicum. The significance of these findings is discussed. Copyright © 2013 Elsevier B.V. All rights reserved.
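The percent nucleotide identity figures quoted in this abstract come from pairwise comparison of aligned sequences. A minimal sketch of such a calculation is shown here, assuming two sequences that are already aligned to equal length. The function name and the gap convention (columns where both sequences have a gap are skipped, a gap against a base counts as a mismatch) are assumptions for illustration; published tools differ in how they treat gaps, and the abstract does not state which tool was used.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example: 9 of 10 aligned positions agree.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 90.0
```

Values computed this way for every pair of clones would then be compared against whatever species and strain demarcation thresholds apply to the genus.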
Nucleotide cleaving agents and method

DOEpatents

Que, Jr., Lawrence; Hanson, Richard S.; Schnaith, Leah M. T.

2000-01-01

The present invention provides a unique series of nucleotide cleaving agents and a method for cleaving a nucleotide sequence, whether single-stranded or double-stranded DNA or RNA, using a cationic metal complex having at least one polydentate ligand to cleave the nucleotide sequence phosphate backbone to yield a hydroxyl end and a phosphate end.
Genetic diversity and recombination analysis of sweepoviruses from Brazil

PubMed Central

2012-01-01

Background Monopartite begomoviruses (genus Begomovirus, family Geminiviridae) that infect sweet potato (Ipomoea batatas) around the world are known as sweepoviruses. Because sweet potato plants are vegetatively propagated, the accumulation of viruses can become a major constraint for root production. Mixed infections of sweepovirus species and strains can lead to recombination, which may contribute to the generation of new recombinant sweepoviruses. Results This study reports the full genome sequence of 34 sweepoviruses sampled from a sweet potato germplasm bank and commercial fields in Brazil. These sequences were compared with others from public nucleotide sequence databases to provide a comprehensive overview of the genetic diversity and patterns of genetic exchange in sweepoviruses isolated from Brazil, as well as to review the classification and nomenclature of sweepoviruses in accordance with the current guidelines proposed by the Geminiviridae Study Group of the International Committee on Taxonomy of Viruses (ICTV). Co-infections and extensive recombination events were identified in Brazilian sweepoviruses. Analysis of the recombination breakpoints detected within the sweepovirus dataset revealed that most recombination events occurred in the intergenic region (IR) and in the middle of the C1 open reading frame (ORF). Conclusions The genetic diversity of sweepoviruses was considerably greater than previously described in Brazil. Moreover, recombination analysis revealed that a genomic exchange is responsible for the emergence of sweepovirus species and strains and provided valuable new information for understanding the diversity and evolution of sweepoviruses. PMID:23082767
Diversity and adaptive evolution of Saccharomyces wine yeast: a review

PubMed Central

Marsit, Souhir; Dequin, Sylvie

2015-01-01

Saccharomyces cerevisiae and related species, the main workhorses of wine fermentation, have been exposed to stressful conditions for millennia, potentially resulting in adaptive differentiation. As a result, wine yeasts have recently attracted considerable interest for studying the evolutionary effects of domestication. The widespread use of whole-genome sequencing during the last decade has provided new insights into the biodiversity, population structure, phylogeography and evolutionary history of wine yeasts. Comparisons between S. cerevisiae isolates from various origins have indicated that a variety of mechanisms, including heterozygosity, nucleotide and structural variations, introgressions, horizontal gene transfer and hybridization, contribute to the genetic and phenotypic diversity of S. cerevisiae. This review will summarize the current knowledge on the diversity and evolutionary history of wine yeasts, focusing on the domestication fingerprints identified in these strains. PMID:26205244
Nucleotide sequence analysis of the recA gene and discrimination of the three isolates of urease-positive thermophilic Campylobacter (UPTC) isolated from seagulls (Larus spp.) in Northern Ireland.

PubMed

Matsuda, M; Tai, K; Moore, J E; Millar, B C; Murayama, O

2004-01-01

Nucleotide sequencing after TA cloning of the amplicon of the almost-full length recA gene from three strains of UPTC (A1, A2, and A3) isolated from seagulls in Northern Ireland, the phenotypical and genotypical characteristics of which have been demonstrated to be indistinguishable, clarified nucleotide differences at three nucleotide positions among the three strains. In conclusion, the nucleotide sequences of the recA gene were found to discriminate among the three strains of UPTC, A1, A2, and A3, which are indistinguishable phenotypically and genotypically. Thus, the present study strongly suggests that nucleotide sequence data of the amplicon of a suitable gene or region could aid in discriminating among isolates of the UPTC group, which are indistinguishable phenotypically and genotypically. Copyright 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Genetic Diversity and Molecular Evolution of Chinese Waxy Maize Germplasm

PubMed Central

Zheng, Hongjian; Wang, Hui; Yang, Hua; Wu, Jinhong; Shi, Biao; Cai, Run; Xu, Yunbi; Wu, Aizhong; Luo, Lijun

2013-01-01

Waxy maize (Zea mays L. var. ceratina Kulesh), with many excellent characters in terms of starch composition and economic value, has been grown in China for a long time and its production has increased dramatically in recent decades. However, the evolution and origin of waxy maize remain unclear. We studied the genetic diversity of Chinese waxy maize including typical landraces and inbred lines by SSR analysis and the results showed a wide genetic diversity in the Chinese waxy maize germplasm. We analyzed the origin and evolution of waxy maize by sequencing 108 samples, and downloading 52 sequences from GenBank for the waxy locus in a number of accessions from genus Zea. A sharp reduction of nucleotide diversity and significant neutrality tests (Tajima’s D and Fu and Li’s F*) were observed at the waxy locus in Chinese waxy maize but not in nonglutinous maize. Phylogenetic analysis indicated that Chinese waxy maize originated from the cultivated flint maize and most of the modern waxy maize inbred lines showed a distinct independent origin and evolution process compared with the germplasm from Southwest China. The results indicated that an agronomic trait can be quickly improved to meet production demand by selection. PMID:23818949
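Nucleotide diversity and Tajima's D, the two statistics this abstract relies on, can be illustrated with a short sketch. It uses the standard textbook definitions (mean pairwise differences per site, and Tajima's 1989 normalisation) and assumes an ungapped alignment of equal-length sequences. The function names and the toy alignment are illustrative, not the authors' data or software.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def nucleotide_diversity(seqs):
    """Pi: mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D for an ungapped alignment (needs n >= 4 and S >= 1)."""
    n, s = len(seqs), segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


toy = ["AAAA", "AAAT", "AATT", "ATTT"]  # hypothetical 4-sequence alignment
print(segregating_sites(toy), nucleotide_diversity(toy), tajimas_d(toy))
```

A strongly negative D together with reduced diversity at a locus, relative to unselected loci, is the kind of pattern usually read as evidence of selection or population expansion.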
Mitochondrial DNA markers reveal high genetic diversity but low genetic differentiation in the black fly Simulium tani Takaoka & Davies along an elevational gradient in Malaysia.

PubMed

Low, Van Lun; Adler, Peter H; Takaoka, Hiroyuki; Ya'cob, Zubaidah; Lim, Phaik Eem; Tan, Tiong Kai; Lim, Yvonne A L; Chen, Chee Dhang; Norma-Rashid, Yusoff; Sofian-Azirun, Mohd

2014-01-01

The population genetic structure of Simulium tani was inferred from mitochondria-encoded sequences of cytochrome c oxidase subunits I (COI) and II (COII) along an elevational gradient in Cameron Highlands, Malaysia. A statistical parsimony network of 71 individuals revealed 71 haplotypes in the COI gene and 43 haplotypes in the COII gene; the concatenated sequences of the COI and COII genes revealed 71 haplotypes. High levels of genetic diversity but low levels of genetic differentiation were observed among populations of S. tani at five elevations. The degree of genetic diversity, however, was not in accordance with an altitudinal gradient, and a Mantel test indicated that elevation did not have a limiting effect on gene flow. No ancestral haplotype of S. tani was found among the populations. Pupae with unique structural characters at the highest elevation showed a tendency to form their own haplotype cluster, as revealed by the COII gene. Tajima's D, Fu's Fs, and mismatch distribution tests revealed population expansion of S. tani in Cameron Highlands. A strong correlation was found between nucleotide diversity and the levels of dissolved oxygen in the streams where S. tani was collected.
Elevated Genetic Diversity in the Emerging Blueberry Pathogen Exobasidium maculosum.

PubMed

Stewart, Jane E; Brooks, Kyle; Brannen, Phillip M; Cline, William O; Brewer, Marin T

2015-01-01

Emerging diseases caused by fungi are increasing at an alarming rate. Exobasidium leaf and fruit spot of blueberry, caused by the fungus Exobasidium maculosum, is an emerging disease that has rapidly increased in prevalence throughout the southeastern USA, severely reducing fruit quality in some plantings. The objectives of this study were to determine the genetic diversity of E. maculosum in the southeastern USA to elucidate the basis of disease emergence and to investigate if populations of E. maculosum are structured by geography, host species, or tissue type. We sequenced three conserved loci from 82 isolates collected from leaves and fruit of rabbiteye blueberry (Vaccinium virgatum), highbush blueberry (V. corymbosum), and southern highbush blueberry (V. corymbosum hybrids) from commercial fields in Georgia and North Carolina, USA, and 6 isolates from lowbush blueberry (V. angustifolium) from Maine, USA, and Nova Scotia, Canada. Populations of E. maculosum from the southeastern USA and from lowbush blueberry in Maine and Nova Scotia are distinct, but do not represent unique species. No difference in genetic structure was detected between different host tissues or among different host species within the southeastern USA; however, differentiation was detected between populations in Georgia and North Carolina. Overall, E. maculosum showed extreme genetic diversity within the conserved loci with 286 segregating sites among the 1,775 sequenced nucleotides and each isolate representing a unique multilocus haplotype. However, 94% of the nucleotide substitutions were silent, so despite the high number of mutations, selective constraints have limited changes to the amino acid sequences of the housekeeping genes. Overall, these results suggest that the emergence of Exobasidium leaf and fruit spot is not due to a recent introduction or host shift, or the recent evolution of aggressive genotypes of E. 
maculosum, but is more likely the result of an increasing host population or an environmental change.
Elevated Genetic Diversity in the Emerging Blueberry Pathogen Exobasidium maculosum

PubMed Central

Stewart, Jane E.; Brooks, Kyle; Brannen, Phillip M.; Cline, William O.; Brewer, Marin T.

2015-01-01

Emerging diseases caused by fungi are increasing at an alarming rate. Exobasidium leaf and fruit spot of blueberry, caused by the fungus Exobasidium maculosum, is an emerging disease that has rapidly increased in prevalence throughout the southeastern USA, severely reducing fruit quality in some plantings. The objectives of this study were to determine the genetic diversity of E. maculosum in the southeastern USA to elucidate the basis of disease emergence and to investigate if populations of E. maculosum are structured by geography, host species, or tissue type. We sequenced three conserved loci from 82 isolates collected from leaves and fruit of rabbiteye blueberry (Vaccinium virgatum), highbush blueberry (V. corymbosum), and southern highbush blueberry (V. corymbosum hybrids) from commercial fields in Georgia and North Carolina, USA, and 6 isolates from lowbush blueberry (V. angustifolium) from Maine, USA, and Nova Scotia, Canada. Populations of E. maculosum from the southeastern USA and from lowbush blueberry in Maine and Nova Scotia are distinct, but do not represent unique species. No difference in genetic structure was detected between different host tissues or among different host species within the southeastern USA; however, differentiation was detected between populations in Georgia and North Carolina. Overall, E. maculosum showed extreme genetic diversity within the conserved loci with 286 segregating sites among the 1,775 sequenced nucleotides and each isolate representing a unique multilocus haplotype. However, 94% of the nucleotide substitutions were silent, so despite the high number of mutations, selective constraints have limited changes to the amino acid sequences of the housekeeping genes. Overall, these results suggest that the emergence of Exobasidium leaf and fruit spot is not due to a recent introduction or host shift, or the recent evolution of aggressive genotypes of E. 
maculosum, but is more likely the result of an increasing host population or an environmental change. PMID:26207812
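The counts quoted in the abstract above (286 segregating sites among 1,775 sequenced nucleotides) can be related to standard diversity summaries. The Python sketch below is illustrative only: it counts segregating sites in a toy alignment and computes Watterson's estimator per site, a common summary that the abstract does not say the authors used. The function names and the toy sequences are assumptions, and alignment gaps are not handled.

```python
def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct base."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * length)


# Hypothetical toy alignment (not data from the study).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC", "ACGTACGTAC"]

# Fraction of sequenced sites that were segregating, from the abstract's figures.
reported_fraction = 286 / 1775

if __name__ == "__main__":
    print(segregating_sites(toy_alignment))
    print(round(watterson_theta(toy_alignment), 4))
    print(round(reported_fraction, 3))
```

With the abstract's figures, roughly 16% of the sequenced sites were segregating, which is the sense in which the diversity is described as extreme.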
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.

PubMed

Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G

2012-10-02

Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
Actinobase: Database on molecular diversity, phylogeny and biocatalytic potential of salt tolerant alkaliphilic actinomycetes.

PubMed

Sharma, Amit K; Gohel, Sangeeta; Singh, Satya P

2012-01-01

Actinobase is a relational database of the molecular diversity, phylogeny and biocatalytic potential of haloalkaliphilic actinomycetes. The main objective of this database is to provide easy access to a range of information, together with data storage, comparison and analysis, while reducing data redundancy and the costs of data entry, storage and retrieval and improving data security. Information on habitat, cell morphology, Gram reaction, biochemical characterization and molecular features should help researchers identify existing and new salt-tolerant alkaliphilic actinomycetes and understand their stress adaptation. The user-friendly PHP front end allows nucleotide and protein sequences of reported entries to be added, which directly helps researchers obtain the required details. Analysis of the genus-wise status of the salt-tolerant alkaliphilic actinomycetes indicated 6 different genera among the 40 classified entries. The results indicate the widespread occurrence of salt-tolerant alkaliphilic actinomycetes belonging to diverse taxonomic positions. On ClustalW/X multiple sequence alignment of the alkaline protease gene sequences, different clusters emerged among the groups. The narrow-search and limit options of the constructed database provided comparable information. The database is freely and publicly accessible at http://www.actinobase.in.
Genetic and antigenic diversity of Theileria parva in cattle in Eastern and Southern zones of Tanzania. A study to support control of East Coast fever.

PubMed

Elisa, Mwega; Hasan, Salih Dia; Moses, Njahira; Elpidius, Rukambile; Skilton, Robert; Gwakisa, Paul

2015-04-01

This study investigated the genetic and antigenic diversity of Theileria parva in cattle from the Eastern and Southern zones of Tanzania. Thirty-nine (62%) positive samples were genotyped using 14 mini- and microsatellite markers with coverage of all four T. parva chromosomes. Wright's F index (F(ST) = 0.094) indicated a high level of panmixis. Linkage equilibrium was observed in the two zones studied, suggesting the existence of a panmictic population. In addition, sequence analysis of the CD8+ T-cell target antigen gene Tp1 revealed a single protein sequence in all samples analysed, which is also present in the T. parva Muguga strain, a component of the FAO1 vaccine. All Tp2 epitope sequences were identical to those in the T. parva Muguga strain, except for one variant of a Tp2 epitope, which is found in the T. parva Kiambu 5 strain, also a component of the FAO1 vaccine. A neighbour-joining tree of the Tp2 nucleotide sequences showed clustering according to geographical origin. Our results show low genetic and antigenic diversity of T. parva within the populations analysed. This has very important implications for the development of sustainable control measures for T. parva in the Eastern and Southern zones of Tanzania, where East Coast fever is endemic.
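To make the fixation index quoted above concrete, the following Python sketch computes Wright's F_ST for a single locus as (H_T - H_S) / H_T from allele frequencies in each subpopulation. It is a minimal illustration that assumes equally weighted subpopulations and known allele frequencies; it is not the estimator or software used in the study, and the function names and example frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst(subpop_freqs):
    """Wright's F_ST = (H_T - H_S) / H_T for one locus, equally weighted subpopulations."""
    k = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # Mean within-subpopulation heterozygosity.
    h_s = sum(heterozygosity(f) for f in subpop_freqs) / k
    # Heterozygosity of the pooled population, from mean allele frequencies.
    mean_freqs = [sum(f[a] for f in subpop_freqs) / k for a in range(n_alleles)]
    h_t = heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical allele frequencies at one biallelic locus in two zones.
    print(fst([[0.5, 0.5], [0.5, 0.5]]))  # identical frequencies
    print(fst([[0.2, 0.8], [0.8, 0.2]]))  # divergent frequencies
```

Values near zero, such as the 0.094 reported above, correspond to subpopulations whose allele frequencies are similar, which is consistent with the panmixis the authors describe.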
Genetic Variations in Two Seahorse Species (Hippocampus mohnikei and Hippocampus trimaculatus): Evidence for Middle Pleistocene Population Expansion

PubMed Central

Zhang, Yanhong; Pham, Nancy Kim; Zhang, Huixian; Lin, Junda; Lin, Qiang

2014-01-01

The population genetics of seahorses is certainly influenced by their species-specific ecological requirements and life-history traits. In the present study, partial sequences of mitochondrial cytochrome b (cytb) and control region (CR) were obtained from 50 Hippocampus mohnikei and 92 H. trimaculatus from four zoogeographical zones. A total of 780 base pairs of the cytb gene were sequenced to characterize mitochondrial DNA (mtDNA) diversity. The mtDNA marker revealed high haplotype diversity, low nucleotide diversity, and a lack of population structure across both populations of H. mohnikei and H. trimaculatus. A neighbour-joining (NJ) tree of cytb gene sequences showed that H. mohnikei haplotypes formed one cluster. A maximum likelihood (ML) tree of cytb gene sequences showed that H. trimaculatus belonged to one lineage. The star-like pattern of the median-joining network of cytb and CR markers indicated a previous demographic expansion of H. mohnikei and H. trimaculatus. The cytb and CR data sets exhibited a unimodal mismatch distribution, which may have resulted from population expansion. Mismatch analysis suggested that the expansion was initiated about 276,000 years ago for H. mohnikei and about 230,000 years ago for H. trimaculatus during the middle Pleistocene period. This study indicates a possible signature of genetic variation and population expansion in two seahorses under complex marine environments. PMID:25144384
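The pattern reported above (high haplotype diversity with low nucleotide diversity) depends on two different summaries of the same alignment. The Python sketch below shows the textbook definitions on a toy data set: Nei's haplotype diversity and the average pairwise proportion of differing sites. It assumes equal-length, gap-free aligned sequences; the function names and toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): mean proportion of differing sites over all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment: three haplotypes among four sequences.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC", "ACGTACGTAC"]

if __name__ == "__main__":
    print(round(haplotype_diversity(toy), 4))
    print(round(nucleotide_diversity(toy), 4))
```

Many distinct haplotypes that differ from one another at only one or two sites give a high first value and a low second value, the combination that is often read as a sign of recent population expansion.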

Nucleic acid analysis using terminal-phosphate-labeled nucleotides

DOEpatents

Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY

2008-04-22

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

PubMed Central

Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

1987-01-01

The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486
Genetic diversity of transmission-blocking vaccine candidate Pvs48/45 in Plasmodium vivax populations in China.

PubMed

Feng, Hui; Gupta, Bhavna; Wang, Meilian; Zheng, Wenqi; Zheng, Li; Zhu, Xiaotong; Yang, Yimei; Fang, Qiang; Luo, Enjie; Fan, Qi; Tsuboi, Takafumi; Cao, Yaming; Cui, Liwang

2015-12-01

The male gamete fertilization factor P48/45 in malaria parasites is a prime transmission-blocking vaccine (TBV) candidate. Efforts to develop antimalarial vaccines are often thwarted by genetic diversity of the target antigens. Here we evaluated the genetic diversity of the Pvs48/45 gene in global Plasmodium vivax populations. We determined 200 Pvs48/45 sequences collected from temperate and subtropical parasite populations in China. Population genetic and evolutionary analyses were performed to determine the levels of genetic diversity, potential signature of selection, and population differentiation. Analysis of the Pvs48/45 sequences from 200 P. vivax parasites collected in a temperate and a tropical region revealed a low level of genetic diversity (π = 0.0012) with 14 single nucleotide polymorphisms, of which 11 were nonsynonymous. Analysis of 344 Pvs48/45 sequences from nine worldwide P. vivax populations detected a total of 38 haplotypes, of which 13 haplotypes were present only once. Multiple tests for selection confirmed a signature of positive selection on Pvs48/45 with selection skewed to the second cysteine domain. Haplotype network analysis and Wright's fixation index showed large geographical differentiation with the presence of continent- or region-specific mutations in this gene. Pvs48/45 displays low levels of genetic diversity with the presence of region-specific mutations. Some of the mutations may be potential epitope targets based on their positions in the predicted structure, highlighting the need for future evaluation of these mutations in designing Pvs48/45-based TBV.
High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

PubMed Central

2011-01-01

Background High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and inter specific breeding applications in less domesticated and less funded plant genera. Results We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. 
The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434
Effect of malaria transmission reduction by insecticide-treated bed nets (ITNs) on the genetic diversity of Plasmodium falciparum merozoite surface protein (MSP-1) and circumsporozoite (CSP) in western Kenya.

PubMed

Kariuki, Simon K; Njunge, James; Muia, Ann; Muluvi, Geofrey; Gatei, Wangeci; Ter Kuile, Feiko; Terlouw, Dianne J; Hawley, William A; Phillips-Howard, Penelope A; Nahlen, Bernard L; Lindblade, Kim A; Hamel, Mary J; Slutsker, Laurence; Shi, Ya Ping

2013-08-27

Although several studies have investigated the impact of reduced malaria transmission due to insecticide-treated bed nets (ITNs) on the patterns of morbidity and mortality, there is limited information on their effect on parasite diversity. Sequencing was used to investigate the effect of ITNs on polymorphisms in two genes encoding leading Plasmodium falciparum vaccine candidate antigens, the 19 kilodalton blood stage merozoite surface protein-1 (MSP-1(19kDa)) and the Th2R and Th3R T-cell epitopes of the pre-erythrocytic stage circumsporozoite protein (CSP) in a large community-based ITN trial site in western Kenya. The number and frequency of haplotypes as well as nucleotide and haplotype diversity were compared among parasites obtained from children <5 years old prior to the introduction of ITNs (1996) and after 5 years of high coverage ITN use (2001). A total of 12 MSP-1(19kDa) haplotypes were detected in 1996 and 2001. The Q-KSNG-L and E-KSNG-L haplotypes corresponding to the FVO and FUP strains of P. falciparum were the most prevalent (range 32-37%), with an overall haplotype diversity of > 0.7. No MSP-1(19kDa) 3D7 sequence-types were detected in 1996 and the frequency was less than 4% in 2001. The CSP Th2R and Th3R domains were highly polymorphic, with a total of 26 and 14 haplotypes, respectively, detected in 1996 and 34 and 13 haplotypes in 2001, with an overall haplotype diversity of > 0.9 and 0.75, respectively. The frequency of the most predominant Th2R and Th3R haplotypes was 14 and 36%, respectively. The frequency of Th2R and Th3R haplotypes corresponding to the 3D7 parasite strain was less than 4% at both time points. There was no significant difference in nucleotide and haplotype diversity between parasite isolates collected at the two time points. High diversity in these two genes has been maintained over time despite marked reductions in malaria transmission due to ITN use. The frequency of 3D7 sequence-types was very low in this area. 
These findings provide information that could be useful in the design of future malaria vaccines for deployment in endemic areas with high ITN coverage and in interpretation of efficacy data for malaria vaccines based on 3D7 parasite strains.
Characterization of polyploid wheat genomic diversity using a high-density 90 000 single nucleotide polymorphism array

PubMed Central

Wang, Shichen; Wong, Debbie; Forrest, Kerrie; Allen, Alexandra; Chao, Shiaoman; Huang, Bevan E; Maccaferri, Marco; Salvi, Silvio; Milner, Sara G; Cattivelli, Luigi; Mastrangelo, Anna M; Whan, Alex; Stephen, Stuart; Barker, Gary; Wieseke, Ralf; Plieske, Joerg; International Wheat Genome Sequencing Consortium; Lillemo, Morten; Mather, Diane; Appels, Rudi; Dolferus, Rudy; Brown-Guedira, Gina; Korol, Abraham; Akhunova, Alina R; Feuillet, Catherine; Salse, Jerome; Morgante, Michele; Pozniak, Curtis; Luo, Ming-Cheng; Dvorak, Jan; Morell, Matthew; Dubcovsky, Jorge; Ganal, Martin; Tuberosa, Roberto; Lawley, Cindy; Mikoulitch, Ivan; Cavanagh, Colin; Edwards, Keith J; Hayden, Matthew; Akhunov, Eduard

2014-01-01

High-density single nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships between individuals in populations and studying marker–trait associations in mapping experiments. We developed a genotyping array including about 90 000 gene-associated SNPs and used it to characterize genetic variation in allohexaploid and allotetraploid wheat populations. The array includes a significant fraction of common genome-wide distributed SNPs that are represented in populations of diverse geographical origin. We used density-based spatial clustering algorithms to enable high-throughput genotype calling in complex data sets obtained for polyploid wheat. We show that these model-free clustering algorithms provide accurate genotype calling in the presence of multiple clusters including clusters with low signal intensity resulting from significant sequence divergence at the target SNP site or gene deletions. Assays that detect low-intensity clusters can provide insight into the distribution of presence–absence variation (PAV) in wheat populations. A total of 46 977 SNPs from the wheat 90K array were genetically mapped using a combination of eight mapping populations. The developed array and cluster identification algorithms provide an opportunity to infer detailed haplotype structure in polyploid wheat and will serve as an invaluable resource for diversity studies and investigating the genetic basis of trait variation in wheat. PMID:24646323
Population genetic structure of the mantis shrimp Oratosquilla oratoria (Crustacea: Squillidae) in the Yellow Sea and East China Sea

NASA Astrophysics Data System (ADS)

Yang, Mei; Li, Xinzheng

2017-09-01

The mantis shrimp Oratosquilla oratoria is an ecologically and economically important species in the Western Pacific. In the present study, the population genetic structure of Oratosquilla oratoria from the Yellow Sea and East China Sea was examined with mitochondrial DNA control region sequences. In total, 394 samples were collected from 18 locations and 102 haplotypes were obtained. For the Yellow Sea, the overall nucleotide diversity and haplotype diversity were 0.0069 and 0.9468, respectively, while across all the East China Sea locations, the overall nucleotide diversity and haplotype diversity were 0.02794 and 0.9790, respectively. The results of AMOVA and pairwise FST (0.1452, P < 0.001) revealed moderate differentiation between the Yellow Sea and East China Sea populations of O. oratoria. However, neither the neighbor-joining tree nor the haplotype network showed clades with a geographic pattern, which indicated that considerable gene flow existed between the Yellow Sea and East China Sea and supported the high larval dispersal ability of this species. Mismatch distribution analysis and neutrality tests suggested that O. oratoria has undergone a population expansion event, and the Pleistocene glacial cycles might have had an impact on the historical demography of O. oratoria. The genetic information obtained in this study can support sustainable improvements to capture fisheries management strategies.
Single nucleotide polymorphism analysis of Korean native chickens using next generation sequencing data.

PubMed

Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon

2015-02-01

There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.
Molecular characterization of Giardia psittaci by multilocus sequence analysis.

PubMed

Abe, Niichiro; Makino, Ikuko; Kojima, Atsushi

2012-12-01

Multilocus sequence analyses targeting small subunit ribosomal DNA (SSU rDNA), elongation factor 1 alpha (ef1α), glutamate dehydrogenase (gdh), and beta giardin (β-giardin) were performed on Giardia psittaci isolates from three Budgerigars (Melopsittacus undulatus) and four Barred parakeets (Bolborhynchus lineola) kept in individual households or imported from overseas. Nucleotide differences and phylogenetic analyses at four loci indicate the distinction of G. psittaci from the other known Giardia species: Giardia muris, Giardia microti, Giardia ardeae, and Giardia duodenalis assemblages. Furthermore, G. psittaci was related more closely to G. duodenalis than to the other known Giardia species, except for G. microti. Conflicting signals regarded as "double peaks" were found at the same nucleotide positions of the ef1α in all isolates. However, the sequences of the other three loci, including gdh and β-giardin, which are known to be highly variable, from all isolates were also mutually identical at every locus. They showed no double peaks. These results suggest that double peaks found in the ef1α sequences are caused not by mixed infection with genetically different G. psittaci isolates but by allelic sequence heterogeneity (ASH), which is observed in diplomonad lineages including G. duodenalis. No sequence difference was found in any G. psittaci isolates at the gdh and β-giardin, suggesting that G. psittaci is indeed not more diverse genetically than other Giardia species. This report is the first to provide evidence related to the genetic characteristics of G. psittaci obtained using multilocus sequence analysis.
Mango (Mangifera indica L.) germplasm diversity based on single nucleotide polymorphisms derived from the transcriptome.

PubMed

Sherman, Amir; Rubinstein, Mor; Eshed, Ravit; Benita, Miri; Ish-Shalom, Mazal; Sharabi-Schwager, Michal; Rozen, Ada; Saada, David; Cohen, Yuval; Ophir, Ron

2015-11-14

Germplasm collections are an important source for plant breeding, especially in fruit trees, which have a long juvenile period. Thus, efforts have been made to study the diversity of fruit tree collections. Even though mango is an economically important crop, most of the studies on diversity in mango collections have been conducted with a small number of genetic markers. We describe a de novo transcriptome assembly from mango cultivar 'Keitt'. Variation discovery using Illumina resequencing of the 'Keitt' and 'Tommy Atkins' cultivars identified 332,016 single-nucleotide polymorphisms (SNPs) and 1903 simple-sequence repeats (SSRs). Most of the SSRs (70.1%) were trinucleotide repeats, predominantly with the motif (GGA/AAG)n, and only 23.5% were dinucleotide SSRs, mostly with the motif (AT/AT)n. Further investigation of the diversity in the Israeli mango collection was performed based on a subset of 293 SNPs. Those markers divided the Israeli mango collection into two major groups: one group included mostly mango accessions from Southeast Asia (Malaysia, Thailand, Indonesia) and India, and the other mainly Floridian and Israeli mango cultivars. The latter group was more polymorphic (FS=-0.1 on the average) and was more of an admixture than the former group. A slight population differentiation was detected (FST=0.03), suggesting that if the mango accessions of the western world originated from Southeast Asia, as has been previously suggested, the duration of cultivation was not long enough to develop a distinct genetic background. Whole-transcriptome reconstruction was used to significantly broaden the mango's genetic variation resources, i.e., SNPs and SSRs. The set of SNP markers described in this study is novel. A subset of SNPs was sampled to explore the Israeli mango collection and most of them were polymorphic in many mango accessions. 
Therefore, we believe that these SNPs will be valuable as they recapitulate and strengthen the history of mango diversity.
The genomes and comparative genomics of Lactobacillus delbrueckii phages.

PubMed

Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani

2011-07-01

Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.
A New Primer to Amplify pmoA Gene From NC10 Bacteria in the Sediments of Dongchang Lake and Dongping Lake.

PubMed

Wang, Shenghui; Liu, Yanjun; Liu, Guofu; Huang, Yaru; Zhou, Yu

2017-08-01

Nitrite-dependent anaerobic methane oxidation (n-damo) is catalyzed by the NC10 phylum bacterium "Candidatus Methylomirabilis oxyfera" (M. oxyfera). Generally, the pmoA gene is applied as a functional marker to test and identify NC10-like bacteria. However, it is difficult to detect the NC10 bacteria in sediments of freshwater lakes (Dongchang Lake and Dongping Lake) with the previous pmoA gene primer sets. In this work, a new primer, cmo208, was designed and used to amplify the pmoA gene of NC10-like bacteria. A new nested PCR approach was performed using the new primer cmo208 and the previous primers cmo182, cmo682, and cmo568 to detect the NC10 bacteria. The obtained pmoA gene sequences exhibited 85-92% nucleotide identity and 95-97% amino acid sequence identity to the pmoA gene of M. oxyfera. The obtained diversity of pmoA gene sequences coincided well with the diversity of 16S rRNA sequences. These results indicated that the newly designed pmoA primer cmo208 could give one more option to detect NC10 bacteria from different environmental samples.
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, consisting of tandem copies of specific sequences no longer than six bases, that are distributed throughout the genome. SSRs have been used as molecular markers because they are easy to detect, and they serve a range of applications, including genetic diversity, genome mapping, and marker-assisted selection. They are also very mutable because of DNA polymerase slippage during DNA replication. This mutation mechanism raises the insertion/deletion (INDEL) mutation frequency to a high level, higher than that of other types of molecular markers such as single nucleotide polymorphisms (SNPs). Across the genome as a whole, however, SNPs are more frequent than INDELs. Therefore, all algorithms designed for sequence alignment fit the vast majority of the genomic sequence without treating microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after first determining the repeated units and the positions of the last SSR nucleotides. This results in a shifting process according to the inserted repeated unit type. When the phylogenetic relations were studied before and after applying the new algorithm, many differences in the trees were obtained with increasing SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
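The first step the abstract describes, locating the repeated units of an SSR before alignment, can be illustrated with a short Python sketch. This is not the authors' PERL program; the function name `find_ssrs` and the `min_repeats` threshold are assumptions made for illustration, and only the six-base upper limit on the repeat unit comes from the abstract.

```python
import re

def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for each tandem repeat found in seq.

    max_unit follows the abstract's definition of an SSR (unit of at most
    six bases); min_repeats is an illustrative threshold, not a value
    taken from the source.
    """
    # Lazy unit match so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

if __name__ == "__main__":
    # A (CA)5 microsatellite flanked by non-repetitive bases.
    print(find_ssrs("GGCACACACACATT"))
```

Once the unit and its boundaries are known, an SSR-aware aligner could shift sequences by whole repeat units rather than by single bases, which is the idea the abstract attributes to the new algorithm.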
Molecular phylogeny and larval morphological diversity of the lanternfish genus Hygophum (Teleostei: Myctophidae).

PubMed

Yamaguchi, M; Miya, M; Okiyama, M; Nishida, M

2000-04-01

Larvae of the deep-sea lanternfish genus Hygophum (Myctophidae) exhibit a remarkable morphological diversity that is quite unexpected, considering their homogeneous adult morphology. In an attempt to elucidate the evolutionary patterns of such larval morphological diversity, nucleotide sequences of a portion of the mitochondrially encoded 16S ribosomal RNA gene were determined for seven Hygophum species and three outgroup taxa. Secondary structure-based alignment resulted in a character matrix consisting of 1172 bp of unambiguously aligned sequences, which were subjected to phylogenetic analyses using maximum-parsimony, maximum-likelihood, and neighbor-joining methods. The resultant tree topologies from the three methods were congruent, with most nodes, including that of the genus Hygophum, being strongly supported by various tree statistics. The most parsimonious reconstruction of the three previously recognized, distinct larval morphs onto the molecular phylogeny revealed that one of the morphs had originated as the common ancestor of the genus, the other two having diversified separately in two subsequent major clades. The patterns of such diversification are discussed in terms of the unusual larval eye morphology and geographic distribution. Copyright 2000 Academic Press.
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2014 CFR

2014-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2013 CFR

2013-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2012 CFR

2012-07-01

...” means those amino acids other than “Xaa” and those nucleotide bases other than “n” defined in accordance... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences...
Near East mtDNA haplotype variants in Roman cattle from Augusta Raurica, Switzerland, and in the Swiss Evolène breed.

PubMed

Schlumbaum, A; Turgay, M; Schibler, J

2006-08-01

Typical Near East mitochondrial haplotypes of the T2 lineage were found in one cattle metacarpus sample from the Roman period and in two present-day Evolène cattle in Switzerland. Sequences from eight additional Evolène and four Raetian Grey aligned to the European haplotype T3. Analysis of nucleotide diversity within the mitochondrial D-loop of both studied Swiss cattle breeds revealed high haplotype diversity and similar diversity to a European cattle reference group. Mitochondrial T3 haplotypes radiated star-like from two similarly frequent haplotypes, possibly indicating two different expansion routes. The breed structure of Evolène cattle can be explained either by an introduction of diverse female lineages from the domestication centre or by later admixture. The introduction of the Near East lineage to Switzerland must have happened during the Roman time or earlier.
Multilocus sequence typing (MLST) for lineage assignment and high resolution diversity studies in Trypanosoma cruzi.

PubMed

Yeo, Matthew; Mauricio, Isabel L; Messenger, Louisa A; Lewis, Michael D; Llewellyn, Martin S; Acosta, Nidia; Bhattacharyya, Tapan; Diosque, Patricio; Carrasco, Hernan J; Miles, Michael A

2011-06-01

Multilocus sequence typing (MLST) is a powerful and highly discriminatory method for analysing pathogen population structure and epidemiology. Trypanosoma cruzi, the protozoan agent of American trypanosomiasis (Chagas disease), has remarkable genetic and ecological diversity. A standardised MLST protocol that is suitable for assignment of T. cruzi isolates to genetic lineage and for higher resolution diversity studies has not been developed. We have sequenced and diplotyped nine single copy housekeeping genes and assessed their value as part of a systematic MLST scheme for T. cruzi. A minimum panel of four MLST targets (Met-III, RB19, TcGPXII, and DHFR-TS) was shown to provide unambiguous assignment of isolates to the six known T. cruzi lineages (Discrete Typing Units, DTUs TcI-TcVI). In addition, we recommend six MLST targets (Met-II, Met-III, RB19, TcMPX, DHFR-TS, and TR) for more in depth diversity studies on the basis that diploid sequence typing (DST) with this expanded panel distinguished 38 out of 39 reference isolates. Phylogenetic analysis implies a subdivision between North and South American TcIV isolates. Single Nucleotide Polymorphism (SNP) data revealed high levels of heterozygosity among DTUs TcI, TcIII, TcIV and, for three targets, putative corresponding homozygous and heterozygous loci within DTUs TcI and TcIII. Furthermore, individual gene trees gave incongruent topologies at inter- and intra-DTU levels, inconsistent with a model of strict clonality. We demonstrate the value of systematic MLST diplotyping for describing inter-DTU relationships and for higher resolution diversity studies of T. cruzi, including presence of recombination events. The high levels of heterozygosity will facilitate future population genetics analysis based on MLST haplotypes.
Molecular epidemiology of Plum pox virus in Japan.

PubMed

Maejima, Kensaku; Himeno, Misako; Komatsu, Ken; Takinami, Yusuke; Hashimoto, Masayoshi; Takahashi, Shuichiro; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou

2011-05-01

For a molecular epidemiological study based on complete genome sequences, 37 Plum pox virus (PPV) isolates were collected from the Kanto region in Japan. Pair-wise analyses revealed that all 37 Japanese isolates belong to the PPV-D strain, with low genetic diversity (less than 0.8%). In phylogenetic analysis of the PPV-D strain based on complete nucleotide sequences, the relationships of the PPV-D strain were reconstructed with high resolution: at the global level, the American, Canadian, and Japanese isolates formed their own distinct monophyletic clusters, suggesting that the routes of viral entry into these countries were independent; at the local level, the actual transmission histories of PPV were precisely reconstructed with high bootstrap support. This is the first description of the molecular epidemiology of PPV based on complete genome sequences.

Characterization of Chiton Ischnochiton hakodadensis Foot Based on Transcriptome Sequencing

NASA Astrophysics Data System (ADS)

Dou, Huaiqian; Miao, Yan; Li, Yuli; Li, Yangping; Dai, Xiaoting; Zhang, Xiaokang; Liang, Pengyu; Liu, Weizhi; Wang, Shi; Bao, Zhenmin

2018-06-01

The chiton Ischnochiton hakodadensis is a marine mollusk well known for its eight separate shell plates. I. hakodadensis plays a vital role in the ecosystems it inhabits. So far, genetic studies on the chiton are scarce, due in part to insufficient genomic resources available for this species. In this study, we investigated the transcriptome of the chiton foot using Illumina sequencing technology. The reads were assembled and clustered into 256461 unigenes, of which 42247 were divided into diverse functional categories by Gene Ontology (GO) annotation terms, and 17256 mapped onto 365 pathways by KEGG pathway mapping. Meanwhile, a set of differentially expressed genes (DEGs) between distal and proximal muscles was identified as associated with the adhesive locomotion of the foot, and these genes are thus useful for our future studies. Moreover, up to 679384 high-quality single nucleotide polymorphisms (SNPs) and 19814 simple sequence repeats (SSRs) were identified in this study, which are valuable for subsequent studies on genetic diversity and variation. The transcriptomic resource obtained in this study should aid future genetic and genomic studies of chiton.
Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

PubMed Central

Hall, L; Laird, J E; Craig, R K

1984-01-01

Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

PubMed Central

Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

2006-01-01

Background: At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximately full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results: Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Cloning, sequencing and comparison of the approximately full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France demonstrated nucleotide sequence differences at six loci in the 1,469-nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion: High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501, where substitutions and deletions were noted. PMID:16398935
Genetic diversity analysis of Gossypium arboreum germplasm accessions using genotyping-by-sequencing.

PubMed

Li, Ruijuan; Erpelding, John E

2016-10-01

The diploid cotton species Gossypium arboreum possesses many favorable agronomic traits such as drought tolerance and disease resistance, which can be utilized in the development of improved upland cotton cultivars. The USDA National Plant Germplasm System maintains more than 1600 G. arboreum accessions. Little information is available on the genetic diversity of the collection, thereby limiting the utilization of this cotton species. The genetic diversity and population structure of the G. arboreum germplasm collection were assessed by genotyping-by-sequencing of 375 accessions. Using genome-wide single nucleotide polymorphism sequence data, two major clusters were inferred with 302 accessions in Cluster 1, 64 accessions in Cluster 2, and nine accessions unassigned due to their nearly equal membership to each cluster. These two clusters were further evaluated independently, resulting in the identification of two sub-clusters for the 302 Cluster 1 accessions and three sub-clusters for the 64 Cluster 2 accessions. Low to moderate genetic diversity between clusters and sub-clusters was observed, indicating a narrow genetic base. Cluster 2 accessions were more genetically diverse and the majority of the accessions in this cluster were landraces. In contrast, Cluster 1 is composed of varieties or breeding lines more recently added to the collection. The majority of the accessions had kinship values ranging from 0.6 to 0.8. Eight pairs of accessions were identified as potential redundancies due to their high kinship relatedness. The genetic diversity and genotype data from this study are essential to enhance germplasm utilization to identify genetically diverse accessions for the detection of quantitative trait loci associated with important traits that would benefit upland cotton improvement.
Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

PubMed

Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

2014-06-01

The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced, from China, Italy, Spain and the USA. The nucleotide identity percentages ranged from 95.9 to 99.1% for the three RNAs and from 93.7 to 99% for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5%, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3% with a Chinese isolate, and the highest identity at the amino acid level was 98.6% with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host.
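Identity percentages such as those reported above are computed from pairwise alignments. The sketch below shows one common convention, in which alignment columns containing a gap are excluded from the denominator. The abstract does not say which tool or convention was used, so the function `percent_identity` and its gap handling are assumptions for illustration only.

```python
def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped; other
    tools may count gaps as mismatches, which gives lower values.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: excluded under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

if __name__ == "__main__":
    # Toy alignment: five ungapped columns, four of them identical.
    print(percent_identity("ACGT-A", "ACGAAA"))
```

Because the denominator differs between conventions, identity values from different studies are only directly comparable when the same definition was applied.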
Human ribosomal RNA gene: nucleotide sequence of the transcription initiation region and comparison of three mammalian genes.

PubMed Central

Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M

1982-01-01

The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. PMID:6954460
Genetic diversity and recombination of enterovirus G strains in Japanese pigs: High prevalence of strains carrying a papain-like cysteine protease sequence in the enterovirus G population.

PubMed

Tsuchiaka, Shinobu; Naoi, Yuki; Imai, Ryo; Masuda, Tsuneyuki; Ito, Mika; Akagami, Masataka; Ouchi, Yoshinao; Ishii, Kazuo; Sakaguchi, Shoichi; Omatsu, Tsutomu; Katayama, Yukie; Oba, Mami; Shirai, Junsuke; Satani, Yuki; Takashima, Yasuhiro; Taniguchi, Yuji; Takasu, Masaki; Madarame, Hiroo; Sunaga, Fujiko; Aoki, Hiroshi; Makino, Shinji; Mizutani, Tetsuya; Nagai, Makoto

2018-01-01

To study the genetic diversity of enterovirus G (EV-G) among Japanese pigs, metagenomics sequencing was performed on fecal samples from pigs with or without diarrhea, collected between 2014 and 2016. Fifty-nine EV-G sequences, which were >5,000 nucleotides long, were obtained. By complete VP1 sequence analysis, Japanese EV-G isolates were classified into G1 (17 strains), G2 (four strains), G3 (22 strains), G4 (two strains), G6 (two strains), G9 (six strains), G10 (five strains), and a new genotype (one strain). Remarkably, 16 G1 and one G2 strain identified in diarrheic (23.5%; four strains) or normal (76.5%; 13 strains) fecal samples possessed a papain-like cysteine protease (PL-CP) sequence, which was recently found in the USA and Belgium in the EV-G genome, at the 2C-3A junction site. This paper presents the first report of the high prevalence of viruses carrying PL-CP in the EV-G population. Furthermore, possible inter- and intragenotype recombination events were found among EV-G strains, including G1-PL-CP strains. Our findings may advance the understanding of the molecular epidemiology and genetic evolution of EV-Gs.
Genetic characterization and phylogenetic analysis of porcine circovirus type 2 (PCV2) in Serbia.

PubMed

Savic, Bozidar; Milicevic, Vesna; Jakic-Dimic, Dobrila; Bojkovski, Jovan; Prodanovic, Radisa; Kureljusic, Branislav; Potkonjak, Aleksandar; Savic, Borivoje

2012-01-01

Porcine circovirus type 2 (PCV2) is the main causative agent of postweaning multisystemic wasting syndrome (PMWS). To characterize and determine the genetic diversity of PCV2 in the porcine population of Serbia, nucleotide and deduced amino acid sequences of the open reading frame 2 (ORF2) of PCV2 collected from the tissues of pigs that either had died as a result of PMWS or did not exhibit disease symptoms were analyzed. Sequencing and phylogenetic analysis showed considerable diversity among PCV2 ORF2 sequences and the existence of two main PCV2 genotypes, PCV2b and PCV2a, with at least three clusters, 1A/B, 1C and 2D. In order to provide further proof that the 1C strain is circulating in the porcine population, the whole viral genome of one PCV2 isolate was sequenced. Genotyping and phylogenetic analysis using the entire viral genome sequences confirmed that there was a PMWS-associated 1C strain emerging in Serbia. Our analysis also showed that PCV2b is dominant in the porcine population, and that it is exclusively associated with PMWS occurrences in the country. These data constitute a useful basis for further epidemiological studies regarding the heterogeneity of PCV2 strains on the European continent.
MHC class II genes in European wolves: a comparison with dogs.

PubMed

Seddon, Jennifer M; Ellegren, Hans

2002-10-01

The genome of the grey wolf, one of the most widely distributed land mammal species, has been subjected to both stochastic factors, including biogeographical subdivision and population fragmentation, and strong selection during the domestication of the dog. To explore the effects of drift and selection on the partitioning of MHC variation in the diversification of species, we present nine DQA, 10 DQB, and 17 DRB1 sequences of the second exon for European wolves and compare them with sequences of North American wolves and dogs. The relatively large number of class II alleles present in both European and North American wolves attests to their large historical population sizes, yet there are few alleles shared between these regions at DQB and DRB1. Similarly, the dog has an extensive array of class II MHC alleles, a consequence of a genetically diverse origin, but allelic overlap with wolves only at DQA. Although we might expect a progression from shared alleles to shared allelic lineages during differentiation, the partitioning of diversity between wolves and dogs at DQB and DRB1 differs from that at DQA. Furthermore, an extensive region of nucleotide sequence shared between DRB1 and DQB alleles and a shared motif suggests intergenic recombination may have contributed to MHC diversity in the Canidae.
Bacterial diversity in Adélie penguin, Pygoscelis adeliae, guano: molecular and morpho-physiological approaches.

PubMed

Zdanowski, Marek K; Weglenski, Piotr; Golik, Pawel; Sasin, Joanna M; Borsuk, Piotr; Zmuda, Magdalena J; Stankovic, Anna

2004-11-01

The total number of bacteria and culturable bacteria in Adélie penguin (Pygoscelis adeliae) guano was determined during 42 days of decomposition in a location adjacent to the rookery in Admiralty Bay, King George Island, Antarctica. Of the culturable bacteria, 72 randomly selected colonies were described using 49 morpho-physiological tests, 27 of which were subsequently considered significant in characterizing and differentiating the isolates. On the basis of the nucleotide sequence of a fragment of the 16S rRNA gene in each of 72 pure isolates, three major phylogenetic groups were identified, namely the Moraxellaceae/Pseudomonadaceae (29 isolates), the Flavobacteriaceae (14), and the Micrococcaceae (29). Grouping of the isolates on the basis of morpho-physiological tests (whether 49 or 27 parameters) showed similar results to those based on 16S rRNA gene sequences. Clusters were characterized by considerable intra-cluster variation in both 16S rRNA gene sequences and morpho-physiological responses. High diversity in abundance and morphometry of total bacterial communities during penguin guano decomposition was supported by image analysis of epifluorescence micrographs. The results indicate that the bacterial community in penguin guano is not only one of the richest in Antarctica, but is extremely diverse, both phylogenetically and morpho-physiologically.
Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

PubMed

Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

2017-11-28

Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.
A population genetics analysis in clinical isolates of Sporothrix schenckii based on calmodulin and calcium/calmodulin-dependent kinase partial gene sequences.

PubMed

Rangel-Gamboa, Lucia; Martinez-Hernandez, Fernando; Maravilla, Pablo; Flisser, Ana

2018-02-02

Sporotrichosis is a subcutaneous mycosis that is caused by diverse species of Sporothrix. High levels of genetic diversity in Sporothrix isolates have been reported, but few population genetics analyses have been documented. The aim of this study was to analyse the genetic variability and population genetics relations of Sporothrix schenckii Mexican clinical isolates and to compare them with other reported isolates. We studied the partial sequences of the calmodulin and calcium/calmodulin-dependent kinase genes in 24 isolates: 22 from Mexico, one from Colombia, and one ATCC® 6331™; the latter was used as a positive control. Phylogenetic, haplotype and population genetic analyses were performed with the 24 sequences obtained by us and 345 sequences obtained from GenBank. The frequency of S. schenckii sensu stricto was 81% in the 22 Mexican isolates, while the remaining 19% were Sporothrix globosa. Mexican S. schenckii sensu stricto had high genetic diversity and was related to isolates from South America. In contrast, S. globosa showed one haplotype related to isolates from Asia, Brazil, Spain and the USA. In S. schenckii sensu stricto, S. brasiliensis and S. globosa, haplotype polymorphism (θ) values were higher than the nucleotide diversity data (π). In addition, Tajima's D and Fu and Li's tests yielded negative values, suggesting directional selection and arguing against the model of neutral evolution in these populations. The analyses also showed that calcium/calmodulin-dependent kinase was a suitable genetic marker to discriminate between common Sporothrix species. © 2018 Blackwell Verlag GmbH.
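The comparison of θ and π in the abstract above can be made concrete with a small Python sketch. It computes nucleotide diversity (π, the mean number of pairwise differences per site) and Watterson's estimator (θ_W, based on the number of segregating sites) from an alignment. The abstract does not state which θ estimator the authors used, so Watterson's form is an assumption here; the function name `pi_and_theta` and the toy data are hypothetical.

```python
from itertools import combinations

def pi_and_theta(seqs):
    """Return (pi, theta_w) per site for equal-length aligned sequences.

    pi      : mean pairwise differences divided by alignment length.
    theta_w : S / a1 / L, with S segregating sites and
              a1 = sum(1/i for i in 1..n-1) (Watterson's estimator).
    Gaps and ambiguity codes are not treated specially in this sketch.
    """
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two sequences of equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    pi = total_diffs / len(pairs) / length
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = segregating / a1 / length
    return pi, theta_w

if __name__ == "__main__":
    # Toy sample with one singleton variant: rare variants inflate
    # theta_w relative to pi.
    pi, theta_w = pi_and_theta(["AAAA", "AAAT", "AAAA", "AAAA"])
    print(pi, theta_w, pi < theta_w)
```

Tajima's D is proportional to the difference between the two estimates, so π smaller than θ_W corresponds to the negative D values the abstract reports. The full statistic also needs a variance term that this sketch omits.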
Array of nucleic acid probes on biological chips for diagnosis of HIV and methods of using the same

DOEpatents

Chee, Mark; Gingeras, Thomas R.; Fodor, Stephen P. A.; Hubble, Earl A.; Morris, MacDonald S.

1999-01-19

The invention provides an array of oligonucleotide probes immobilized on a solid support for analysis of a target sequence from a human immunodeficiency virus. The array comprises at least four sets of oligonucleotide probes 9 to 21 nucleotides in length. A first probe set has a probe corresponding to each nucleotide in a reference sequence from a human immunodeficiency virus. A probe is related to its corresponding nucleotide by being exactly complementary to a subsequence of the reference sequence that includes the corresponding nucleotide. Thus, each probe has a position, designated an interrogation position, that is occupied by a complementary nucleotide to the corresponding nucleotide. The three additional probe sets each have a corresponding probe for each probe in the first probe set. Thus, for each nucleotide in the reference sequence, there are four corresponding probes, one from each of the probe sets. The three corresponding probes in the three additional probe sets are identical to the corresponding probe from the first probe or a subsequence thereof that includes the interrogation position, except that the interrogation position is occupied by a different nucleotide in each of the four corresponding probes.
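The four-probe-set layout described above can be sketched in Python. For each reference position the code builds one probe that is exactly complementary to the surrounding window, plus three variants that differ only at the interrogation position. The centred interrogation position, the default probe length of 9, and the name `tile_probes` are illustrative assumptions. Probes are written as the position-by-position complement of the window (not reversed) so that they line up with the reference.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def tile_probes(reference, length=9):
    """Map each tiled reference index to its four probes.

    Returns {index: {base: probe}}, where `base` is the nucleotide placed
    at the interrogation position (the centre of the probe). Positions too
    close to either end to fit a full window are skipped in this sketch.
    """
    if length % 2 == 0:
        raise ValueError("length must be odd so the probe has a centre")
    half = length // 2
    tiles = {}
    for i in range(half, len(reference) - half):
        window = reference[i - half:i + half + 1]
        perfect = [COMPLEMENT[b] for b in window]
        four = {}
        for base in "ACGT":
            variant = list(perfect)
            variant[half] = base  # substitute only the interrogation position
            four[base] = "".join(variant)
        tiles[i] = four
    return tiles

if __name__ == "__main__":
    tiles = tile_probes("ACGTACGTACG", length=9)
    for index, probes in tiles.items():
        print(index, probes)
```

For a target identical to the reference, the probe carrying the complementary base at the interrogation position is the only perfect match among the four, which is what lets the array read out the base at that position.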
Phosphate-Modified Nucleotides for Monitoring Enzyme Activity.

PubMed

Ermert, Susanne; Marx, Andreas; Hacker, Stephan M

2017-04-01

Nucleotides modified at the terminal phosphate position have been proven to be interesting entities to study the activity of a variety of different protein classes. In this chapter, we present various types of modifications that were attached as reporter molecules to the phosphate chain of nucleotides and briefly describe the chemical reactions that are frequently used to synthesize them. Furthermore, we discuss a variety of applications of these molecules. Kinase activity, for instance, was studied by transfer of a phosphate modified with a reporter group to the target proteins. This allows not only studying the activity of kinases, but also identifying their target proteins. Moreover, kinases can also be directly labeled with a reporter at a conserved lysine using acyl-phosphate probes. Another important application for phosphate-modified nucleotides is the study of RNA and DNA polymerases. In this context, single-molecule sequencing is made possible using detection in zero-mode waveguides, nanopores or by a Förster resonance energy transfer (FRET)-based mechanism between the polymerase and a fluorophore-labeled nucleotide. Additionally, fluorogenic nucleotides that utilize an intramolecular interaction between a fluorophore and the nucleobase or an intramolecular FRET effect have been successfully developed to study a variety of different enzymes. Finally, also some novel techniques applying electron paramagnetic resonance (EPR)-based detection of nucleotide cleavage or the detection of the cleavage of fluorophosphates are discussed. Taken together, nucleotides modified at the terminal phosphate position have been applied to study the activity of a large diversity of proteins and are valuable tools to enhance the knowledge of biological systems.
The use of sequence-based SSR mining for the development of a vast collection of microsatellites in Aquilegia formosa

Treesearch

Brandon Schlautman; Vera Pfeiffer; Juan Zalapa; Johanne Brunet

2014-01-01

Numerous microsatellite markers were developed for Aquilegia formosa from sequences deposited within the Expressed Sequence Tag (EST), Genomic Survey Sequence (GSS), and Nucleotide databases in NCBI. Microsatellites (SSRs) were identified and primers were designed for 9 SSR containing sequences in the Nucleotide database, 3803 sequences in the EST...
Characterization of Foodborne Outbreaks of Salmonella enterica Serovar Enteritidis with Whole-Genome Sequencing Single Nucleotide Polymorphism-Based Analysis for Surveillance and Outbreak Detection.

PubMed

Taylor, Angela J; Lappi, Victoria; Wolfgang, William J; Lapierre, Pascal; Palumbo, Michael J; Medus, Carlota; Boxrud, David

2015-10-01

Salmonella enterica serovar Enteritidis is a significant cause of gastrointestinal illness in the United States; however, current molecular subtyping methods lack resolution for this highly clonal serovar. Advances in next-generation sequencing technologies have made it possible to examine whole-genome sequencing (WGS) as a potential molecular subtyping tool for outbreak detection and source trace back. Here, we conducted a retrospective analysis of S. Enteritidis isolates from seven epidemiologically confirmed foodborne outbreaks and sporadic isolates (not epidemiologically linked) to determine the utility of WGS to identify outbreaks. A collection of 55 epidemiologically characterized clinical and environmental S. Enteritidis isolates were sequenced. Single nucleotide polymorphism (SNP)-based cluster analysis of the S. Enteritidis genomes revealed well supported clades, with less than four-SNP pairwise diversity, that were concordant with epidemiologically defined outbreaks. Sporadic isolates were an average of 42.5 SNPs distant from the outbreak clusters. Isolates collected from the same patient over several weeks differed by only two SNPs. Our findings show that WGS provided greater resolution between outbreak, sporadic, and suspect isolates than the current gold standard subtyping method, pulsed-field gel electrophoresis (PFGE). Furthermore, results could be obtained in a time frame suitable for surveillance activities, supporting the use of WGS as an outbreak detection and characterization method for S. Enteritidis. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
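The SNP-based clustering idea in this record can be sketched as follows. The sketch assumes isolates are already represented as equal-length aligned strings of SNP-site bases. The function names and the single-linkage rule with a threshold of 3 SNPs (reflecting the reported "less than four-SNP" within-outbreak diversity) are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: pairwise SNP distances and threshold-based clustering.
from itertools import combinations


def snp_distance(a, b):
    """Count positions where both sequences have an unambiguous base and differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x in "ACGT" and y in "ACGT")


def pairwise_snp_distances(aligned):
    """Map each unordered pair of isolate names to its SNP distance."""
    return {
        (n1, n2): snp_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


def single_linkage_clusters(aligned, max_snps=3):
    """Group isolates connected by any chain of pairs within `max_snps`."""
    names = sorted(aligned)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for (a, b), dist in pairwise_snp_distances(aligned).items():
        if dist <= max_snps:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


isolates = {"iso1": "ACGTACGTAC", "iso2": "ACGTACGTAT", "iso3": "TGCAACGTAC"}
clusters = single_linkage_clusters(isolates)
```

Single linkage is only one possible rule; a phylogeny-based clade definition, as used in the study, can give different groupings when isolates form chains.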
Isolation and characterization of NBS–LRR resistance gene analogues from mango

PubMed Central

Lei, Xintao; Yao, Quansheng; Xu, Xuerong; Liu, Yang

2014-01-01

The nucleotide-binding site (NBS)–leucine-rich repeat (LRR) gene family is a class of R genes in plants. NBS genes play a very important role in disease defence. To further study the variation and homology of mango NBS–LRR genes, 16 resistance gene analogues (RGAs) (GenBank accession number HM446507-22) were isolated from the polymerase chain reaction fragments and sequenced by using two degenerate primer sets. The total nucleotide diversity index Pi was 0.362, and 236 variation sites were found among 16 RGAs. The degree of homology between the RGAs varied from 44.4% to 98.5%. Sixteen RGAs could be translated into amino acid sequences. The high level of homology in the protein sequences of the P-loop and kinase-2 of the NBS domain between the RGAs isolated in this study and previously characterized R genes indicated that these cloned sequences belonged to the NBS–LRR gene family. Moreover, these 16 RGAs could be classified into the non-TIR–NBS–LRR gene family because only tryptophan (W) could be claimed as the final residue of the kinase-2 domain of all RGAs isolated here. From our results, we concluded that our mango NBS–LRR genes possessed a high level of variation from the mango genome, which may allow mango to recognize many different pathogenic virulence factors. PMID:26740762
Identification and characterization of RAPD-SCAR markers linked to glyphosate-susceptible and -resistant biotypes of Eleusine indica (L.) Gaertn.

PubMed

Cha, Thye San; Anne-Marie, Kaben; Chuah, Tse Seng

2014-02-01

Eleusine indica is one of the most common weed species found in agricultural land worldwide. Although the herbicide glyphosate provides good control of the weed, its frequent use has led to many reported cases of resistance. Hence, the development of genetic markers for quick detection of glyphosate resistance in E. indica populations is imperative for the control and management of the weed. In this study, a total of 14 specific random amplified polymorphic DNA (RAPD) markers were identified, and two of the markers, namely S4R727 and S26R6976, were further sequence characterized. Sequence alignment revealed that marker S4R727 showed a 12-bp deletion in resistant biotypes, while marker S26R6976 contained a 167-bp insertion in the resistant biotypes. Based on these sequence differences, three pairs of new sequence characterized amplified region (SCAR) primers were developed. The specificity of these primer pairs was further validated with genomic DNA extracted from ten individual plants of one glyphosate-susceptible and five glyphosate-resistant (R2, R4, R6, R8 and R11) populations. The resulting RAPD-SCAR markers provided the basis for assessing genetic diversity between glyphosate-susceptible and -resistant E. indica biotypes, as well as for the identification of genetic loci linked to glyphosate resistance in the species.
Sequencing the cap-snatching repertoire of H1N1 influenza provides insight into the mechanism of viral transcription initiation

PubMed Central

Koppstein, David; Ashour, Joseph; Bartel, David P.

2015-01-01

The influenza polymerase cleaves host RNAs ∼10–13 nucleotides downstream of their 5′ ends and uses this capped fragment to prime viral mRNA synthesis. To better understand this process of cap snatching, we used high-throughput sequencing to determine the 5′ ends of A/WSN/33 (H1N1) influenza mRNAs. The sequences provided clear evidence for nascent-chain realignment during transcription initiation and revealed a strong influence of the viral template on the frequency of realignment. After accounting for the extra nucleotides inserted through realignment, analysis of the capped fragments indicated that the different viral mRNAs were each prepended with a common set of sequences and that the polymerase often cleaved host RNAs after a purine and often primed transcription on a single base pair to either the terminal or penultimate residue of the viral template. We also developed a bioinformatic approach to identify the targeted host transcripts despite limited information content within snatched fragments and found that small nuclear RNAs and small nucleolar RNAs contributed the most abundant capped leaders. These results provide insight into the mechanism of viral transcription initiation and reveal the diversity of the cap-snatched repertoire, showing that noncoding transcripts as well as mRNAs are used to make influenza mRNAs. PMID:25901029
A MicroRNA Superfamily Regulates Nucleotide Binding Site–Leucine-Rich Repeats and Other mRNAs

PubMed Central

Shivaprasad, Padubidri V.; Chen, Ho-Ming; Patel, Kanu; Bond, Donna M.; Santos, Bruno A.C.M.; Baulcombe, David C.

2012-01-01

Analysis of tomato (Solanum lycopersicum) small RNA data sets revealed the presence of a regulatory cascade affecting disease resistance. The initiators of the cascade are microRNA members of an unusually diverse superfamily in which miR482 and miR2118 are prominent members. Members of this superfamily are variable in sequence and abundance in different species, but all variants target the coding sequence for the P-loop motif in the mRNA sequences for disease resistance proteins with nucleotide binding site (NBS) and leucine-rich repeat (LRR) motifs. We confirm, using transient expression in Nicotiana benthamiana, that miR482 targets mRNAs for NBS-LRR disease resistance proteins with coiled-coil domains at their N terminus. The targeting causes mRNA decay and production of secondary siRNAs in a manner that depends on RNA-dependent RNA polymerase 6. At least one of these secondary siRNAs targets other mRNAs of a defense-related protein. The miR482-mediated silencing cascade is suppressed in plants infected with viruses or bacteria so that expression of mRNAs with miR482 or secondary siRNA target sequences is increased. We propose that this process allows pathogen-inducible expression of NBS-LRR proteins and that it contributes to a novel layer of defense against pathogen attack. PMID:22408077

Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana)

PubMed Central

2010-01-01

Background Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
A study of lactose metabolism in Lactococcus garvieae reveals a genetic marker for distinguishing between dairy and fish biotypes.

PubMed

Fortina, Maria Grazia; Ricci, Giovanni; Borgo, Francesca

2009-06-01

Dairy and fish isolates of Lactococcus garvieae were tested for their ability to utilize lactose and to grow in milk. Fish isolates were unable to assimilate lactose, but unexpectedly, they possessed the ability to grow in milk. Genetic studies, carried out constructing different vectorette libraries, provided evidence that in fish isolates, no genes involved in lactose utilization were present. For L. garvieae dairy isolates, a single system for the catabolism of lactose was found. It consists of a lactose transport and hydrolysis depending on a phosphoenolpyruvate-dependent phosphotransferase system combined with a phospho-beta-galactosidase. The genes involved were highly similar at the nucleotide sequence level to their counterparts in Lactococcus lactis; however, while in many L. lactis strains these genes are plasmid encoded, in L. garvieae they are chromosomally located. Thus, in the species L. garvieae, the phospho-beta-galactosidase gene, detectable in all strains of dairy origin but lacking in fish isolates, can be considered a reliable genetic marker for distinguishing biotypes in the two diverse ecological niches. Moreover, we obtained information regarding the complete nucleotide sequence of the gal operon in L. garvieae, consisting of a galactose permease and the Leloir pathway enzymes. This is one of the first reports concerning the determination of the nucleotide sequences of genes (other than the 16S rDNA gene) in L. garvieae and should be considered a step in a continuous effort to explore the genome of this species, with the aim of determining the real relationship between the presence of L. garvieae in dairy products and food safety.
Isolation and Genomic Characterization of a Duck-Origin GPV-Related Parvovirus from Cherry Valley Ducklings in China

PubMed Central

Chen, Hao; Dou, Yanguo; Tang, Yi; Zhang, Zhenjie; Zheng, Xiaoqiang; Niu, Xiaoyu; Yang, Jing; Yu, Xianglong; Diao, Youxiang

2015-01-01

A newly emerged duck parvovirus, which causes beak atrophy and dwarfism syndrome (BADS) in Cherry Valley ducks, has appeared in Northern China since March 2015. To explore the genetic diversity among waterfowl parvovirus isolates, the complete genome of an identified isolate designated SDLC01 was sequenced and analyzed in the present study. Genomic sequence analysis showed that SDLC01 shared 90.8%–94.6% of nucleotide identity with goose parvovirus (GPV) isolates and 78.6%–81.6% of nucleotide identity with classical Muscovy duck parvovirus (MDPV) isolates. Phylogenetic analysis of 443 nucleotides (nt) of the fragment A showed that SDLC01 was highly similar to a mule duck isolate (strain D146/02) and close to European GPV isolates but separate from Asian GPV isolates. Analysis of the left inverted terminal repeat regions revealed that SDLC01 had two major segments deleted between positions 160–176 and 306–322 nt compared with field GPV and MDPV isolates. Phylogenetic analysis of Rep and VP1 encoded by two major open reading frames of parvoviruses revealed that SDLC01 was distinct from all GPV and MDPV isolates. The viral pathogenicity and genome characterization of SDLC01 suggest that the novel GPV (N-GPV) is the causative agent of BADS and belongs to a distinct GPV-related subgroup. Furthermore, N-GPV sequences were detected in diseased ducks by polymerase chain reaction and viral proliferation was demonstrated in duck embryos and duck embryo fibroblast cells. PMID:26465143
[Molecular epidemiological analysis of HIV-1 variants circulating in Russia in 1987-2015].

PubMed

Lapovok, I A; Lopatukhin, A E; Kireev, D E; Kazennova, E V; Lebedev, A V; Bobkova, M R; Kolomeets, A N; Turbina, G I; Shipulin, G A; Ladnaya, N N; Pokrovsky, V V

The aim was to simultaneously analyze HIV-1 samples from all Russian regions in order to characterize the epidemiology of HIV infection in the country as a whole. The most extensive study to date was conducted to examine nucleotide sequences of the pol gene of HIV-1 samples isolated from HIV-positive persons in different regions of Russia whose diagnoses were recorded during 1987-2015. The nucleotide sequences of the HIV-1 genome were analyzed using computer programs and on-line applications to identify virus subtypes and new recombinant forms. The nucleotide sequences of the pol gene were analyzed in 1697 HIV-1 samples, and the genetic variant subtype A1 (IDU-A) was found to be dominant throughout the entire territory of Russia (in more than 80% of all infection cases). Other virus variants circulating in Russia were analyzed; the wider distribution of the recombinant form CRF63/02A in Siberia, which had been previously described in the literature, was also confirmed. Four new recombinant forms generated by the virus subtypes A1 (IDU-A) and B and two AG recombinant forms were found. There was a larger genetic distance between the viruses of the IDU-A variant circulating among injecting drug users and those infected through heterosexual contact, as well as a change over time in the viruses of subtype G that caused the outbreak in the south of the country in 1988-1989. The findings demonstrate continuous HIV-1 genetic variability and recombination over time in Russia, as well as increased genetic diversity with higher HIV infection rates in the population.
Molecular characterization of faba bean necrotic yellows viruses in Tunisia.

PubMed

Kraberger, Simona; Kumari, Safaa G; Najar, Asma; Stainton, Daisy; Martin, Darren P; Varsani, Arvind

2018-03-01

Faba bean necrotic yellows virus (FBNYV) (genus Nanovirus; family Nanoviridae) has a genome comprising eight individually encapsidated circular single-stranded DNA components. It has frequently been found infecting faba bean (Vicia faba L.) and chickpea (Cicer arietinum L.) in association with satellite molecules (alphasatellites). Genome sequences of FBNYV from Azerbaijan, Egypt, Iran, Morocco, Spain and Syria have been determined previously and we now report the first five genome sequences of FBNYV and associated alphasatellites from faba bean sampled in Tunisia. In addition, we have determined the genome sequences of two additional FBNYV isolates from chickpea plants sampled in Syria and Iran. All individual FBNYV genome component sequences that were determined here share > 84% nucleotide sequence identity with FBNYV sequences available in public databases, with the DNA-M component displaying the highest degree of diversity. As with other studied nanoviruses, recombination and genome component reassortment occurs frequently both between FBNYV genomes and between genomes of nanoviruses belonging to other species.
Partial bisulfite conversion for unique template sequencing

PubMed Central

Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael

2018-01-01

We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. PMID:29161423
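The idea of grouping reads by a shared conversion pattern can be shown with a toy sketch. It assumes every read covers exactly the same reference interval, considers only C-to-T conversions on one strand, and requires exact signature matches. The function names are hypothetical, and the published protocol necessarily handles partial overlaps, the G-to-A strand and sequencing errors, none of which is modelled here.

```python
# Hypothetical sketch: group reads by their C-to-T conversion signature.
from collections import defaultdict


def conversion_signature(reference, read):
    """Positions where the reference has C and the read shows T."""
    return tuple(
        i for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "T"
    )


def group_by_signature(reference, reads):
    """Reads sharing a signature are treated as copies of one template."""
    groups = defaultdict(list)
    for read in reads:
        groups[conversion_signature(reference, read)].append(read)
    return dict(groups)


reference = "ACCGTCAC"
reads = ["ATCGTCAC", "ATCGTCAC", "ACCGTTAT"]
groups = group_by_signature(reference, reads)
template_count = len(groups)  # distinct signatures stand in for templates
```

Counting distinct signatures rather than reads is what makes the estimate insensitive to amplification duplicates in this simplified setting.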
Fungal endophyte diversity in Sarracenia.

PubMed

Glenn, Anthony; Bodri, Michael S

2012-01-01

Fungal endophytes were isolated from 4 species of the carnivorous pitcher plant genus Sarracenia: S. minor, S. oreophila, S. purpurea, and S. psittacina. Twelve taxa of fungi, 8 within the Ascomycota and 4 within the Basidiomycota, were identified based on PCR amplification and sequencing of the internal transcribed spacer sequences of nuclear ribosomal DNA (ITS rDNA) with taxonomic identity assigned using the NCBI nucleotide megablast search tool. Endophytes are known to produce a large number of metabolites, some of which may contribute to the protection and survival of the host. We speculate that endophyte-infected Sarracenia may benefit from their fungal associates by their influence on nutrient availability from within pitchers and, possibly, by directly influencing the biota within pitchers.
The complete sequence of the mitochondrial genome of Arctic fox (Alopex lagopus).

PubMed

Yan, Shou-Qing; Guo, Peng-Cheng; Yue, Yuan; Li, Wan-Hong; Bai, Chun-Yan; Li, Yu-Mei; Sun, Jin-Hai; Zhao, Zhi-Hui

2016-11-01

In the present study, the complete mitochondrial genome sequence of Arctic fox (Alopex lagopus) was determined for the first time. It has a total length of 16,656 bp, and contains 13 protein-coding genes, 22 tRNA genes, 2 ribosomal RNA genes and 1 control region. The nucleotide composition is 31.3% for A, 26.2% for C, 14.8% for G and 27.7% for T. The D-loop region located between tRNA-Pro and tRNA-Phe contains a (ACACGTACACGCAT)18 tandem repeat array. The data will be useful for the investigation of the genetic structure and diversity in the natural and farmed population of Arctic foxes.
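Nucleotide composition percentages like those reported above can be computed with a short sketch. The function name `base_composition` is an assumption, and the example input is the 14-nt repeat unit quoted in the abstract rather than the full genome, so its percentages differ from the genome-wide values.

```python
# Hypothetical sketch: percentage base composition of a DNA sequence.
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: round(100 * counts[base] / total, 1) for base in "ACGT"}


repeat_unit = "ACACGTACACGCAT"          # repeat unit quoted in the abstract
repeat_array = repeat_unit * 18         # 18 tandem copies, 252 nt in total
composition = base_composition(repeat_unit)
```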
Nucleotide diversity, natural variation, and evolution of Flexible culm-1 and Strong culm-2 lodging resistance genes in rice.

PubMed

Rashid, Muhammad Abdul Rehman; Zhao, Yan; Zhang, Hongliang; Li, Jinjie; Li, Zichao

2016-07-01

Lodging resistance is one of the vital traits in yield improvement and sustainability. Culm wall thickness, diameter, and strength are different traits that can govern the lodging resistance in rice. The genes SCM2 and FC1 have been isolated for culm thickness, strength, and flexibility, but their functional nucleotide variations were still unknown. We used a 13× deep sequence of 795 diverse genotypes to present the functional variation and SNP diversity in SCM2 and FC1. The major functional variant for the SCM2 gene was at position 27480181 and for the FC1 gene at position 31072992. Haplotype analysis of both genes provided their various allelic differences among haplotypes. SCM2 alleles further presented the evolution of Oryza sativa L. subsp. indica and subsp. japonica genomes from common parent in different geographical zones, while the haplotypes of FC1 suggested their evolution from different strains of the common parent Oryza rufipogon. SCM2 showed purifying selection and functional associations with rare alleles, while FC1 displayed balanced selection favored by multiple heterozygous alleles. Genotypes with an allelic combination of SCM2-3 and FC1-2 in japonica background exhibited striking resistance against lodging, which can be used in further breeding programs.
Identification of common, unique and polymorphic microsatellites among 73 cyanobacterial genomes.

PubMed

Kabra, Ritika; Kapil, Aditi; Attarwala, Kherunnisa; Rai, Piyush Kant; Shanker, Asheesh

2016-04-01

Microsatellites, also known as simple sequence repeats (SSRs), are short tandem repeats of 1-6 nucleotides. These repeats are found in coding as well as non-coding regions of both prokaryotic and eukaryotic genomes and play a significant role in the study of gene regulation, genetic mapping, DNA fingerprinting and evolutionary studies. The availability of 73 complete genome sequences of cyanobacteria enabled us to mine and statistically analyze microsatellites in these genomes. The cyanobacterial microsatellites identified through bioinformatics analysis were stored in a user-friendly database named CyanoSat, which is an efficient data representation and query system designed using ASP.net. The information in CyanoSat comprises perfect, imperfect and compound microsatellites found in coding, non-coding and coding-non-coding regions. Moreover, it contains PCR primers with 200-nucleotide-long flanking regions. The mined cyanobacterial microsatellites can be freely accessed at www.compubio.in/CyanoSat/home.aspx. In addition, 82 polymorphic, 13,866 unique and 2390 common microsatellites were detected. These microsatellites will be useful in strain identification and genetic diversity studies of cyanobacteria.
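Mining perfect microsatellites with 1-6 nt motifs can be sketched with a regular expression. The function name and the minimum of three repeats are assumptions, since the abstract does not state the thresholds used for CyanoSat; imperfect and compound repeats are not handled.

```python
# Hypothetical sketch: find perfect SSRs with motifs of 1-6 nucleotides.
import re


def find_perfect_ssrs(sequence, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest motif first, so
    "AAAAAA" is reported as motif "A" rather than "AA" or "AAA".
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


ssrs = find_perfect_ssrs("GGTACACACACGTTTTTTCA")
```

Real SSR miners usually apply motif-length-specific minimum repeat counts (for example, more repeats required for mononucleotide runs), which this sketch collapses into a single parameter.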
A HIV-1 heterosexual transmission chain in Guangzhou, China: a molecular epidemiological study.

PubMed

Han, Zhigang; Leung, Tommy W C; Zhao, Jinkou; Wang, Ming; Fan, Lirui; Li, Kai; Pang, Xinli; Liang, Zhenbo; Lim, Wilina W L; Xu, Huifang

2009-09-25

We conducted molecular analyses to confirm four clustered HIV-1 infections (Patients A, B, C and D) in Guangzhou, China. These cases were identified by epidemiological investigation and suspected to have acquired the infection through a common heterosexual transmission chain. The env C2V3V4 region, gag p17/p24 junction and partial pol gene of the HIV-1 genome from serum specimens of these infected cases were amplified by reverse transcription polymerase chain reaction (RT-PCR) and sequenced. Phylogenetic analyses indicated that their viral nucleotide sequences were significantly clustered together (bootstrap values of 99%, 98% and 100% in the env, gag and pol trees, respectively). Evolutionary distance analysis indicated that the genetic diversities of their env, gag and pol genes were significantly lower than those of non-clustered controls, as measured by unpaired t-test (env gene comparison: p < 0.005; gag gene comparison: p < 0.005; pol gene comparison: p < 0.005). Epidemiological results and molecular analyses consistently indicated that these four cases represented a transmission chain which dispersed in the locality through heterosexual contact involving a commercial sex worker.
Angiostrongylus cantonensis: identification and characterization of microRNAs in male and female adults.

PubMed

Chen, Mu-Xin; Ai, Lin; Xu, Min-Jun; Zhang, Ren-Li; Chen, Shao-Hong; Zhang, Yong-Nian; Guo, Jian; Cai, Yu-Chun; Tian, Li-Guang; Zhang, Ling-Ling; Zhu, Xing-Quan; Chen, Jia-Xu

2011-06-01

Angiostrongylus cantonensis causes eosinophilic meningitis and eosinophilic pleocytosis in humans and is of significant socio-economic importance globally. microRNAs (miRNAs) are endogenous small non-coding RNAs that play crucial roles in gene expression regulation, cellular function and defense, homeostasis and pathogenesis. They have been identified in a diverse range of organisms. The objective of this study was to determine and characterize miRNAs of female and male adults of A. cantonensis by Solexa deep sequencing. A total of 8,861,260 and 10,957,957 high quality reads with 20 and 23 conserved miRNAs were obtained in females and males, respectively. No new miRNA sequence was found. Nucleotide bias analysis showed that uracil was the prominent nucleotide, particularly at positions of 1, 10, 14, 17 and 22, approximately at the beginning, middle and the end of the conserved miRNAs. To our knowledge, this is the first report of miRNA profiles in A. cantonensis, which may represent a new platform for studying regulation of genes and their networks in A. cantonensis. Copyright © 2011 Elsevier Inc. All rights reserved.
Switchgrass ubiquitin promoter (PVUBI2) and uses thereof

DOEpatents

Stewart, C. Neal; Mann, David George James

2013-12-10

The subject application provides polynucleotides, compositions thereof and methods for regulating gene expression in a plant. Polynucleotides disclosed herein comprise novel sequences for a promoter isolated from Panicum virgatum (switchgrass) that initiates transcription of an operably linked nucleotide sequence. Thus, various embodiments of the invention comprise the nucleotide sequence of SEQ ID NO: 2 or fragments thereof comprising nucleotides 1 to 692 of SEQ ID NO: 2 that are capable of driving the expression of an operably linked nucleic acid sequence.
A genomewide catalogue of single nucleotide polymorphisms in white-beaked and Atlantic white-sided dolphins.

PubMed

Fernández, R; Schubert, M; Vargas-Velázquez, A M; Brownlow, A; Víkingsson, G A; Siebert, U; Jensen, L F; Øien, N; Wall, D; Rogan, E; Mikkelsen, B; Dabin, W; Alfarhan, A H; Alquraishi, S A; Al-Rasheid, K A S; Guillot, G; Orlando, L

2016-01-01

The field of population genetics is rapidly moving into population genomics as the quantity of data generated by high-throughput sequencing platforms increases. In this study, we used restriction-site-associated DNA sequencing (RADSeq) to recover genomewide genotypes from 70 white-beaked (Lagenorhynchus albirostris) and 43 Atlantic white-sided dolphins (L. acutus) gathered throughout their north-east Atlantic distribution range. Both species are at a high risk of being negatively affected by climate change. Here, we provide a resource of 38,240 RAD-tags and 52,981 nuclear SNPs shared between both species. We have estimated overall higher levels of nucleotide diversity in white-sided (π = 0.0492 ± 0.0006%) than in white-beaked dolphins (π = 0.0300 ± 0.0004%). White-sided dolphins sampled in the Faroe Islands, belonging to two pods (N = 7 and N = 11), showed similar levels of diversity (π = 0.0317 ± 0.0007% and 0.0267 ± 0.0006%, respectively) compared to unrelated individuals of the same species sampled elsewhere (e.g. π = 0.0285 ± 0.0007% for 11 Scottish individuals). No evidence of higher levels of kinship within pods can be derived from our analyses. When identifying the most likely number of genetic clusters among our sample set, we obtained an estimate of two to four clusters, corresponding to both species and possibly, two further clusters within each species. A higher diversity and lower population structuring was encountered in white-sided dolphins from the north-east Atlantic, in line with their preference for pelagic waters, as opposed to white-beaked dolphins that have a more patchy distribution, mainly across continental shelves. © 2015 John Wiley & Sons Ltd.
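Nucleotide diversity (π), as reported in this record, is the average number of pairwise differences per site among sampled sequences. A minimal sketch follows, assuming equal-length aligned haploid sequences without gaps; the function name is hypothetical, and the RADSeq-based estimates in the study involve genotype calling and filtering steps not represented here.

```python
# Hypothetical sketch: nucleotide diversity (pi) from aligned sequences.
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site across all sequence pairs."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


pi = nucleotide_diversity(["ACGTACGT", "ACGTACGA", "ACGAACGT"])
pi_percent = 100 * pi  # the record above reports pi as a percentage
```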
Genetic Variation of the Endangered Neotropical Catfish Steindachneridion scriptum (Siluriformes: Pimelodidae)

PubMed Central

Paixão, Rômulo V.; Ribolli, Josiane; Zaniboni-Filho, Evoy

2018-01-01

Steindachneridion scriptum is an important species as a resource for fisheries and aquaculture; it is currently threatened and has a reduced occurrence in South America. The damming of rivers, overfishing, and contamination of freshwater environments are the main impacts on the maintenance of this species. We assessed the genetic diversity and structure of S. scriptum using the DNA barcode and control region (D-loop) sequences of 43 individuals from the Upper Uruguay River Basin (UUR) and 10 sequences from the Upper Paraná River Basin (UPR), which were obtained from GenBank. S. scriptum from the UUR and the UPR were assigned to two distinct molecular operational taxonomic units (MOTUs) with higher inter-specific K2P distance than the optimum threshold (OT = 0.0079). The COI Intra-MOTU distances of S. scriptum specimens from the UUR ranged from 0.0000 to 0.0100. The control region indicated a high number of haplotypes and low nucleotide diversity, compatible with a new population undergoing recent expansion. Genetic structure was observed, with high differentiation between UUR and UPR basins, identified by BAPS, haplotype network, AMOVA (FST = 0.78, p < 0.05) and Mantel test. S. scriptum from the UUR showed a slight differentiation (FST = 0.068, p < 0.05), but not isolation-by-distance. Negative values of Tajima’s D and Fu’s Fs suggest recent demographic oscillations. The Bayesian skyline plot analysis indicated possible population expansion beginning 2,500 years ago and a recent reduction in the population size. Low nucleotide diversity, spatial population structure, and the reduction of effective population size should be considered for the planning of strategies aimed at the conservation and rehabilitation of this important fisheries resource. PMID:29520295
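The K2P (Kimura two-parameter) distance used for the MOTU comparison above can be computed from the proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -½ ln((1 - 2P - Q)·√(1 - 2Q)). The sketch below assumes gap-free pairwise comparison that skips ambiguous sites; the function name is hypothetical, and the threshold comparison at the end only reuses the OT value quoted in the abstract for illustration.

```python
# Hypothetical sketch: Kimura two-parameter (K2P) distance between two sequences.
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """K2P distance; raises ValueError if nothing comparable or if saturated."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")  # one transition in 10 sites
exceeds_threshold = distance > 0.0079               # OT value quoted above
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same site.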
Whole-genome analyses of DS-1-like human G2P[4] and G8P[4] rotavirus strains from Eastern, Western and Southern Africa

PubMed Central

Nyaga, Martin M.; Stucker, Karla M.; Esona, Mathew D.; Jere, Khuzwayo C.; Mwinyi, Bakari; Shonhai, Annie; Tsolenyanu, Enyonam; Mulindwa, Augustine; Chibumbya, Julia N.; Adolfine, Hokororo; Halpin, Rebecca A.; Roy, Sunando; Stockwell, Timothy B.; Berejena, Chipo; Seheri, Mapaseka L.; Mwenda, Jason M.; Steele, A. Duncan; Wentworth, David E.

2018-01-01

Group A rotaviruses (RVAs) with distinct G and P genotype combinations have been reported globally. We report the genome composition and possible origin of seven G8P[4] and five G2P[4] human RVA strains based on the genetic evolution of all 11 genome segments at the nucleotide level. Twelve RVA ELISA positive stool samples collected in the representative countries of Eastern, Southern and West Africa during the 2007–2012 surveillance seasons were subjected to sequencing using the Ion Torrent PGM and Illumina MiSeq platforms. A reference-based assembly was performed using CLC Bio’s clc_ref_assemble_long program, and full-genome consensus sequences were obtained. With the exception of the neutralising antigen, VP7, all study strains exhibited the DS-1-like genome constellation (P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2) and clustered phylogenetically with reference strains having a DS-1-like genetic backbone. Comparison of the nucleotide and amino acid sequences with selected global cognate genome segments revealed nucleotide and amino acid sequence identities of 81.7–100 % and 90.6–100 %, respectively, with NSP4 gene segment showing the most diversity among the strains. Bayesian analyses of all gene sequences to estimate the time of divergence of the lineage indicated that divergence times ranged from 16 to 44 years, except for the NSP4 gene where the lineage seemed to arise in the more distant past at an estimated 203 years ago. However, the long-term effects of changes found within the NSP4 genome segment should be further explored, and thus we recommend continued whole-genome analyses from larger sample sets to determine the evolutionary mechanisms of the DS-1-like strains collected in Africa. PMID:24952422
Cloning and sequence analysis of sucrose phosphate synthase gene from varieties of Pennisetum species.

PubMed

Li, H C; Lu, H B; Yang, F Y; Liu, S J; Bai, C J; Zhang, Y W

2015-03-31

Sucrose phosphate synthase (SPS) is an enzyme used by higher plants for sucrose synthesis. In this study, three primer sets were designed on the basis of known SPS sequences from maize (GenBank: NM_001112224.1) and sugarcane (GenBank: JN584485.1), and five novel SPS genes were identified by RT-PCR from the genomes of Pennisetum spp (the hybrid P. americanum x P. purpureum, P. purpureum Schum., P. purpureum Schum. cv. Red, P. purpureum Schum. cv. Taiwan, and P. purpureum Schum. cv. Mott). The cloned sequences showed 99.9% identity and 80-88% similarity to the SPS sequences of other plants. The SPS gene of hybrid Pennisetum had one nucleotide and four amino acid polymorphisms compared to the other four germplasms, and cluster analysis was performed to assess genetic diversity in this species. Additional characterization of the SPS gene product can potentially allow Pennisetum to be exploited as a biofuel source.
k-merSNP discovery: Software for alignment- and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
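The abstract states only that the algorithm is based on k-mer analysis. One common way to find SNPs without alignment or a reference, shown here as a hedged Python sketch and not necessarily the authors' method, is to look for odd-length k-mers that share both flanks across genomes but differ at the central base. The function name, the default k, and the two toy genomes are hypothetical; reverse complements are ignored for brevity.

```python
from collections import defaultdict


def kmer_snps(genomes, k=7):
    """Find candidate SNPs as k-mers with identical flanks but different centers.

    genomes: dict mapping genome name -> sequence.
    Returns a dict mapping (left_flank, right_flank) -> {name: central base}.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    half = k // 2
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            flank = (kmer[:half], kmer[half + 1:])
            loci[flank][name].add(kmer[half])
    snps = {}
    for flank, per_genome in loci.items():
        # Skip flanks that are ambiguous (several central bases) within a genome.
        if any(len(alleles) != 1 for alleles in per_genome.values()):
            continue
        calls = {name: next(iter(alleles)) for name, alleles in per_genome.items()}
        if len(set(calls.values())) > 1:
            snps[flank] = calls
    return snps


# Hypothetical toy genomes differing at a single position (G versus T).
toy = {"g1": "ACGTACGGATCCA", "g2": "ACGTACTGATCCA"}
print(kmer_snps(toy))
```

Because the comparison is a dictionary lookup on flanking sequence, the work grows roughly linearly with total input size, which is the property that makes k-mer approaches attractive for hundreds of genomes.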
Association analysis of single nucleotide polymorphisms in candidate genes with root traits in maize (Zea mays L.) seedlings.

PubMed

Kumar, Bharath; Abdel-Ghani, Adel H; Pace, Jordon; Reyes-Matamoros, Jenaro; Hochholdinger, Frank; Lübberstedt, Thomas

2014-07-01

Several genes involved in maize root development have been isolated. Identification of SNPs associated with root traits would enable the selection of maize lines with better root architecture that might help to improve N uptake, and consequently plant growth particularly under N deficient conditions. In the present study, an association study (AS) panel consisting of 74 maize inbred lines was screened for seedling root traits in 6, 10, and 14-day-old seedlings. Allele re-sequencing of candidate root genes Rtcl, Rth3, Rum1, and Rul1 was also carried out in the same AS panel lines. All four candidate genes displayed different levels of nucleotide diversity, haplotype diversity and linkage disequilibrium. Gene based association analyses were carried out between individual polymorphisms in candidate genes, and root traits measured in 6, 10, and 14-day-old maize seedlings. Association analyses revealed several polymorphisms within the Rtcl, Rth3, Rum1, and Rul1 genes associated with seedling root traits. Several nucleotide polymorphisms in Rtcl, Rth3, Rum1, and Rul1 were significantly (P<0.05) associated with seedling root traits in maize suggesting that all four tested genes are involved in the maize root development. Thus considerable allelic variation present in these root genes can be exploited for improving maize root characteristics. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Genomic variation among populations of threatened coral: Acropora cervicornis.

PubMed

Drury, C; Dale, K E; Panlilio, J M; Miller, S V; Lirman, D; Larson, E A; Bartels, E; Crawford, D L; Oleksiak, M F

2016-04-13

Acropora cervicornis, a threatened, keystone reef-building coral has undergone severe declines (>90 %) throughout the Caribbean. These declines could reduce genetic variation and thus hamper the species' ability to adapt. Active restoration strategies are a common conservation approach to mitigate species' declines and require genetic data on surviving populations to efficiently respond to declines while maintaining the genetic diversity needed to adapt to changing conditions. To evaluate active restoration strategies for the staghorn coral, the genetic diversity of A. cervicornis within and among populations was assessed in 77 individuals collected from 68 locations along the Florida Reef Tract (FRT) and in the Dominican Republic. Genotyping by Sequencing (GBS) identified 4,764 single nucleotide polymorphisms (SNPs). Pairwise nucleotide differences (π) within a population are large (~37 %) and similar to π across all individuals. This high level of genetic diversity along the FRT is similar to the diversity within a small, isolated reef. Much of the genetic diversity (>90 %) exists within a population, yet GBS analysis shows significant variation along the FRT, including 300 SNPs with significant FST values and significant divergence relative to distance. There are also significant differences in SNP allele frequencies over small spatial scales, exemplified by the large FST values among corals collected within Miami-Dade county. Large standing diversity was found within each population even after recent declines in abundance, including significant, potentially adaptive divergence over short distances. The data here inform conservation and management actions by uncovering population structure and high levels of diversity maintained within coral collections among sites previously shown to have little genetic divergence. 
More broadly, this approach demonstrates the power of GBS to resolve differences among individuals and identify subtle genetic structure, informing conservation goals with evolutionary implications.

Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

USDA-ARS's Scientific Manuscript database

Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...
Extension of the COG and arCOG databases by amino acid and nucleotide sequences

PubMed Central

Meereis, Florian; Kaufmann, Michael

2008-01-01

Background The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries. Results Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases) as an extended version including all nucleotide sequences and in addition the amino acid sequences originally utilized to construct the current COG and arCOG databases. We make available three comprehensive single XML files containing the complete databases including all sequence information. In addition, we provide a web interface as a utility suitable to browse the NUCOCOG database for sequence retrieval. The database is accessible at . Conclusion NUCOCOG offers the possibility to analyze any sequence related property in the context of the COG and arCOG framework simply by using script languages such as PERL applied to a large but single XML document. PMID:19014535
DNA Nucleotide Sequence Restricted by the RI Endonuclease

PubMed Central

Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.

1972-01-01

The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974
Balancing Selection on a Regulatory Region Exhibiting Ancient Variation That Predates Human–Neandertal Divergence

PubMed Central

Iskow, Rebecca C.; Austermann, Christian; Scharer, Christopher D.; Raj, Towfique; Boss, Jeremy M.; Sunyaev, Shamil; Price, Alkes; Stranger, Barbara; Simon, Viviana; Lee, Charles

2013-01-01

Ancient population structure shaping contemporary genetic variation has been recently appreciated and has important implications regarding our understanding of the structure of modern human genomes. We identified a ∼36-kb DNA segment in the human genome that displays an ancient substructure. The variation at this locus exists primarily as two highly divergent haplogroups. One of these haplogroups (the NE1 haplogroup) aligns with the Neandertal haplotype and contains a 4.6-kb deletion polymorphism in perfect linkage disequilibrium with 12 single nucleotide polymorphisms (SNPs) across diverse populations. The other haplogroup, which does not contain the 4.6-kb deletion, aligns with the chimpanzee haplotype and is likely ancestral. Africans have higher overall pairwise differences with the Neandertal haplotype than Eurasians do for this NE1 locus (p < 10⁻¹⁵). Moreover, the nucleotide diversity at this locus is higher in Eurasians than in Africans. These results mimic signatures of recent Neandertal admixture contributing to this locus. However, an in-depth assessment of the variation in this region across multiple populations reveals that African NE1 haplotypes, albeit rare, harbor more sequence variation than NE1 haplotypes found in Europeans, indicating an ancient African origin of this haplogroup and refuting recent Neandertal admixture. Population genetic analyses of the SNPs within each of these haplogroups, along with genome-wide comparisons revealed significant FST (p = 0.00003) and positive Tajima's D (p = 0.00285) statistics, pointing to non-neutral evolution of this locus. The NE1 locus harbors no protein-coding genes, but contains transcribed sequences as well as sequences with putative regulatory function based on bioinformatic predictions and in vitro experiments. We postulate that the variation observed at this locus predates Human–Neandertal divergence and is evolving under balancing selection, especially among European populations. PMID:23593015
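To make the link between two divergent haplogroups and a positive Tajima's D concrete, the following Python sketch computes D from aligned sequences using Tajima's (1989) standard constants. The function name and the four-sequence toy samples are hypothetical and far smaller than any real data set; the sketch does not reproduce the study's analysis or its p-values.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences over the whole locus.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Two divergent haplogroups at intermediate frequency: D is positive.
print(tajimas_d(["AAAA", "AAAA", "TTTT", "TTTT"]) > 0)
# Only rare (singleton) variants, as after an expansion: D is negative.
print(tajimas_d(["TAAA", "ATAA", "AATA", "AAAT"]) < 0)
```

Both toy samples have four segregating sites, so the sign of D is driven entirely by whether the variants sit at intermediate or low frequency.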
Assessment of snake DNA barcodes based on mitochondrial COI and Cytb genes revealed multiple putative cryptic species in Thailand.

PubMed

Laopichienpong, Nararat; Muangmai, Narongrit; Supikamolseni, Arrjaree; Twilprawat, Panupon; Chanhome, Lawan; Suntrarachun, Sunutcha; Peyachoknagul, Surin; Srikulnath, Kornsorn

2016-12-15

DNA barcodes of mitochondrial cytochrome c oxidase I (COI), cytochrome b (Cytb) genes, and their combined data sets were constructed from 35 snake species in Thailand. No barcoding gap was detected in either of the two genes from the observed intra- and interspecific sequence divergences. Intra- and interspecific sequence divergences of the COI gene differed 14 times, with barcode cut-off scores ranging over 2%-4% for threshold values differentiated among most of the different species; the Cytb gene differed 6 times with cut-off scores ranging over 2%-6%. Thirty-five specific nucleotide mutations were also found at interspecific level in the COI gene, identifying 18 snake species, but no specific nucleotide mutation was observed for Cytb in any single species. This suggests that COI barcoding was a better marker than Cytb. Phylogenetic clustering analysis indicated that most species were represented by monophyletic clusters, suggesting that these snake species could be clearly differentiated using COI barcodes. However, the two-marker combination of both COI and Cytb was more effective, differentiating snake species by over 2%-4%, and reducing species numbers in the overlap value between intra- and interspecific divergences. Three species delimitation algorithms (general mixed Yule-coalescent, automatic barcoding gap detection, and statistical parsimony network analysis) were extensively applied to a wide range of snakes based on both barcodes. This revealed cryptic diversity for eleven snake species in Thailand. In addition, eleven accessions from the database previously grouped under the same species were represented at different species level, suggesting either high genetic diversity, or the misidentification of these sequences in the database as a consequence of cryptic species. Copyright © 2016 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahn, Anne-Catherine; Meier-Kolthoff, Jan P.; Overmars, Lex

Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitats are soda lakes, which are dual-extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described so far. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.
Population genetic structure and phylogeographical pattern of a relict tree fern, Alsophila spinulosa (Cyatheaceae), inferred from cpDNA atpB-rbcL intergenic spacers.

PubMed

Su, Yingjuan; Wang, Ting; Zheng, Bo; Jiang, Yu; Chen, Guopei; Gu, Hongya

2004-11-01

Sequences of chloroplast DNA (cpDNA) atpB-rbcL intergenic spacers of individuals of a tree fern species, Alsophila spinulosa, collected from ten relict populations distributed in the Hainan and Guangdong provinces, and the Guangxi Zhuang region in southern China, were determined. Sequence length varied from 724 bp to 731 bp, showing length polymorphism, and base composition showed a high A+T content, between 63.17% and 63.95%. Sequences were neutral in terms of evolution (Tajima's criterion D=-1.01899, P>0.10 and Fu and Li's test D*=-1.39008, P>0.10; F*=-1.49775, P>0.10). A total of 19 haplotypes were identified based on nucleotide variation. High levels of haplotype diversity (h=0.744) and nucleotide diversity (Dij=0.01130) were detected in A. spinulosa, probably associated with its long evolutionary history, which has allowed the accumulation of genetic variation within lineages. Both the minimum spanning network and neighbor-joining trees generated for haplotypes demonstrated that current populations of A. spinulosa existing in Hainan, Guangdong, and Guangxi were subdivided into two geographical groups. An analysis of molecular variance indicated that most of the genetic variation (93.49%, P<0.001) was partitioned among regions. Wright's isolation by distance model was not supported across extant populations. Reduced gene flow by the Qiongzhou Strait and inbreeding may result in the geographical subdivision between the Hainan and Guangdong + Guangxi populations (FST=0.95, Nm=0.03). Within each region, the star-like pattern of phylogeography of haplotypes implied a population expansion process during evolutionary history. Gene genealogies together with coalescent theory provided significant information for uncovering phylogeography of A. spinulosa.
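The abstract reports FST=0.95 alongside Nm=0.03 without stating how Nm was derived. A plausible reading, assumed here and not confirmed by the source, is the island-model approximation in its haploid form, Nm = (1/FST - 1)/2, which suits a uniparentally inherited chloroplast marker and reproduces the reported value. The function name is hypothetical.

```python
def nm_from_fst(fst, haploid=True):
    """Estimate gene flow (Nm) from FST under the island-model approximation.

    Assumption: Nm = (1/FST - 1) / 2 for haploid organelle markers and
    (1/FST - 1) / 4 for diploid nuclear markers.
    """
    if not 0 < fst <= 1:
        raise ValueError("FST must lie in (0, 1]")
    divisor = 2 if haploid else 4
    return (1 / fst - 1) / divisor


print(round(nm_from_fst(0.95), 2))                 # haploid form
print(round(nm_from_fst(0.95, haploid=False), 2))  # diploid form, for contrast
```

Under either form, a value far below one migrant per generation is consistent with the strong subdivision across the Qiongzhou Strait described in the abstract.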
Genetic variation in Pythium myriotylum based on SNP typing and development of a PCR-RFLP detection of isolates recovered from Pythium soft rot ginger.

PubMed

Le, D P; Smith, M K; Aitken, E A B

2017-10-01

Pythium myriotylum is responsible for severe losses in both capsicum and ginger crops in Australia under different regimes. Intraspecific genomic variation within the pathogen might explain the differences in aggressiveness and pathogenicity on diverse hosts. In this study, whole genome data of four P. myriotylum isolates recovered from three hosts and one Pythium zingiberis isolate were derived and analysed for sequence diversity based on single nucleotide polymorphisms (SNPs). A higher number of true and unique SNPs occurred in P. myriotylum isolates obtained from ginger with symptoms of Pythium soft rot (PSR) in Australia compared to other P. myriotylum isolates. Overall, more SNPs were discovered in the mitochondrial genome than in the nuclear genome. Among the SNPs, a single substitution from cytosine (C) to thymine (T) in the partially sequenced CoxII gene of 14 representatives of PSR P. myriotylum isolates was within a restriction site of the HinP1I enzyme, which was used in the PCR-RFLP for detection and identification of the isolates without sequencing. The PCR-RFLP was also sensitive enough to detect PSR P. myriotylum strains from artificially infected ginger without the need for isolation of pure cultures. This is the first study of intraspecific variants of Pythium myriotylum isolates recovered from different hosts and origins based on single nucleotide polymorphism (SNP) genotyping of multiple genes. The SNPs discovered provide valuable markers for detection and identification of P. myriotylum strains initially isolated from Pythium soft rot (PSR) ginger by using PCR-RFLP of the CoxII locus. The PCR-RFLP was also sensitive enough to detect P. myriotylum directly from PSR ginger sampled from pot trials without the need for isolation of pure cultures. © 2017 The Society for Applied Microbiology.
Comparative genomic analyses reveal a vast, novel network of nucleotide-centric systems in biological conflicts, immunity and signaling

PubMed Central

Burroughs, A. Maxwell; Zhang, Dapeng; Schäffer, Daniel E.; Iyer, Lakshminarayan M.; Aravind, L.

2015-01-01

Cyclic di- and linear oligo-nucleotide signals activate defenses against invasive nucleic acids in animal immunity; however, their evolutionary antecedents are poorly understood. Using comparative genomics, sequence and structure analysis, we uncovered a vast network of systems defined by conserved prokaryotic gene-neighborhoods, which encode enzymes generating such nucleotides or alternatively processing them to yield potential signaling molecules. The nucleotide-generating enzymes include several clades of the DNA-polymerase β-like superfamily (including Vibrio cholerae DncV), a minimal version of the CRISPR polymerase and DisA-like cyclic-di-AMP synthetases. Nucleotide-binding/processing domains include TIR domains and members of a superfamily prototyped by Smf/DprA proteins and base (cytokinin)-releasing LOG enzymes. They are combined in conserved gene-neighborhoods with genes for a plethora of protein superfamilies, which we predict to function as nucleotide-sensors and effectors targeting nucleic acids, proteins or membranes (pore-forming agents). These systems are sometimes combined with other biological conflict-systems such as restriction-modification and CRISPR/Cas. Interestingly, several are coupled in mutually exclusive neighborhoods with either a prokaryotic ubiquitin-system or a HORMA domain-PCH2-like AAA+ ATPase dyad. The latter are potential precursors of equivalent proteins in eukaryotic chromosome dynamics. Further, components from these nucleotide-centric systems have been utilized in several other systems including a novel diversity-generating system with a reverse transcriptase. We also found the Smf/DprA/LOG domain from these systems to be recruited as a predicted nucleotide-binding domain in eukaryotic TRPM channels. These findings point to evolutionary and mechanistic links, which bring together CRISPR/Cas, animal interferon-induced immunity, and several other systems that combine nucleic-acid-sensing and nucleotide-dependent signaling. 
PMID:26590262
Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification.

PubMed

Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc'h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

2017-01-01

Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate their functional properties in vitro, it is necessary to define their quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. The detection of non-clonal amplicons is necessary for excluding them to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies.
Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification

PubMed Central

Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc’h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

2017-01-01

Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate their functional properties in vitro, it is necessary to define their quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. The detection of non-clonal amplicons is necessary for excluding them to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies. PMID:28362878
Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples.

PubMed

Pettengill, James B; Pightling, Arthur W; Baugher, Joseph D; Rand, Hugh; Strain, Errol

2016-01-01

The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
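The point that k-mer profile distances treat absent data as informative can be illustrated with a short Python sketch. It computes an exact Jaccard distance on k-mer sets and the Mash distance formula D = -(1/k) ln(2j/(1+j)). This is only an approximation of the tools named in the abstract: the real Mash program estimates the Jaccard index from MinHash sketches instead of full k-mer sets. The function names and the toy sequences are hypothetical.

```python
import math


def kmer_set(seq, k):
    """All distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=5):
    """1 - |A ∩ B| / |A ∪ B| over the two k-mer sets."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = a | b
    if not union:
        raise ValueError("sequences are shorter than k")
    return 1 - len(a & b) / len(union)


def mash_distance(jaccard_index, k):
    """Mash distance computed from a Jaccard index j (not a Jaccard distance)."""
    if jaccard_index <= 0:
        return 1.0
    return min(1.0, -math.log(2 * jaccard_index / (1 + jaccard_index)) / k)


# Hypothetical example: the second "assembly" is a truncated copy of the first.
full = "ACGTACGTTAGCCGATAGGCTA"
partial = full[:12]  # missing data, but no nucleotide differences
print(jaccard_distance(full, partial))  # well above zero despite zero SNPs
```

The truncated copy has no site differences from the full sequence, yet the k-mer distance is large, which mirrors the abstract's explanation of why assembly and contamination artifacts distort k-mer-based distances.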
Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

DOE PAGES

Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; ...

2016-11-10

The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.

The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
Sequence of a cDNA encoding pancreatic preprosomatostatin-22.

PubMed Central

Magazin, M; Minth, C D; Funckes, C L; Deschenes, R; Tavianini, M A; Dixon, J E

1982-01-01

We report the nucleotide sequence of a precursor to somatostatin that upon proteolytic processing may give rise to a hormone of 22 amino acids. The nucleotide sequence of a cDNA from the channel catfish (Ictalurus punctatus) encodes a precursor to somatostatin that is 105 amino acids (Mr, 11,500). The cDNA coding for somatostatin-22 consists of 36 nucleotides in the 5' untranslated region, 315 nucleotides that code for the precursor to somatostatin-22, 269 nucleotides at the 3' untranslated region, and a variable length of poly(A). The putative preprohormone contains a sequence of hydrophobic amino acids at the amino terminus that has the properties of a "signal" peptide. A connecting sequence of approximately 57 amino acids is followed by a single Arg-Arg sequence, which immediately precedes the hormone. Somatostatin-22 is homologous to somatostatin-14 in 7 of the 14 amino acids, including the Phe-Trp-Lys sequence. Hybridization selection of mRNA, followed by its translation in a wheat germ cell-free system, resulted in the synthesis of a single polypeptide having a molecular weight of approximately 10,000 as estimated on Na-DodSO4/polyacrylamide gels. PMID:6127673
Assessing genetic diversity of wild and hatchery samples of the Chinese sucker (Myxocyprinus asiaticus) by the mitochondrial DNA control region.

PubMed

Wu, Jiayun; Wu, Bo; Hou, Feixia; Chen, Yongbai; Li, Chong; Song, Zhaobin

2016-01-01

To restore the natural populations of Chinese sucker (Myxocyprinus asiaticus), a hatchery release program has been underway for nearly 10 years. Using DNA sequences of the mitochondrial control region, we assessed the genetic diversity and genetic structure among samples collected from three sites of the wild population as well as from three hatcheries. The haplotype diversity of the wild samples (h = 0.899-0.975) was significantly higher than that of the hatchery ones (h = 0.296-0.666), but the nucleotide diversity was almost identical between them (π = 0.0170-0.0280). Relatively high gene flow was detected between the hatchery and wild samples. Analysis of effective population size indicated that M. asiaticus living in the Yangtze River has been expanding following a bottleneck in the recent past. Our results suggest the hatchery release programs for M. asiaticus have not reduced the genetic diversity, but have influenced the genetic structure of the species in the upper Yangtze River.
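The two summary statistics reported in this abstract can be sketched as follows, assuming the standard Nei estimators (the abstract does not state which estimators were used): haplotype diversity h = n/(n-1) × (1 − Σ p_i²), and nucleotide diversity π as the mean number of pairwise differences per site. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all pairs of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    total_diffs = sum(
        sum(x != y for x, y in zip(a, b))
        for a, b in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)
```

For the toy alignment `["AAAA", "AAAT", "AAAT", "TTTT"]` this gives h = 5/6 and π = 0.5. Because h depends only on how many distinct haplotypes there are and how evenly they are represented, while π depends on how different they are, the two measures can move independently, as in the wild versus hatchery comparison above.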
Plant nitrogen regulatory P-PII genes

DOEpatents

Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun

2001-01-01

The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the
Occurrence and genetic diversity of the Plasmopara halstedii virus in sunflower downy mildew populations of the world.

PubMed

Grasse, Wolfgang; Spring, Otmar

2015-03-01

Plasmopara halstedii virus (PhV) is a ss(+)RNA virus that exclusively occurs in the sunflower downy mildew pathogen Plasmopara halstedii, a biotrophic oomycete of severe economic impact. The virus origin and its genomic variability are unknown. A PCR-based screening of 128 samples of P. halstedii from five continents and up to 40 y old was conducted. PhV RNA was found in over 90 % of the isolates with no correlation to geographic origin or pathotype of its host. Sequence analyses of the two open reading frames (ORFs) revealed only 18 single nucleotide polymorphisms (SNPs) in 3873 nucleotides. The SNPs had no recognizable effect on the two encoded virus proteins. In 398 nucleotides of the untranslated regions (UTRs) of the RNA 2 strand, eight additional SNPs and one short deletion were found. Modelling experiments revealed no effects of these variations on the secondary structure of the RNA. The results showed the presence of PhV in P. halstedii isolates of global origin and the existence of the virus for more than 40 y. The virus genome revealed a surprisingly low variation in both coding and noncoding parts. No sequence differences were correlated with host pathotype or geographic populations of the oomycete. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Genetic Diversity of Sheep Breeds from Albania, Greece, and Italy Assessed by Mitochondrial DNA and Nuclear Polymorphisms (SNPs)

PubMed Central

Pariset, Lorraine; Mariotti, Marco; Gargani, Maria; Joost, Stephane; Negrini, Riccardo; Perez, Trinidad; Bruford, Michael; Ajmone Marsan, Paolo; Valentini, Alessio

2011-01-01

We employed mtDNA and nuclear SNPs to investigate the genetic diversity of sheep breeds of three countries of the Mediterranean basin: Albania, Greece, and Italy. In total, 154 unique mtDNA haplotypes were detected by means of D-loop sequence analysis. The highest nucleotide diversity was observed in Albania. We identified haplogroups A, B, and C in Albanian and Greek samples, while Italian individuals clustered in groups A and B. In general, the data show a pattern reflecting old migrations that occurred in postneolithic and historical times. PCA analysis on SNP data differentiated breeds with good correspondence to geographical locations. This could reflect geographical isolation, selection operated by local sheep farmers, and different flock management and breed admixture that occurred in the last centuries. PMID:22125424
Systematic and Evolutionary Insights Derived from mtDNA COI Barcode Diversity in the Decapoda (Crustacea: Malacostraca)

PubMed Central

Matzen da Silva, Joana; Creer, Simon; dos Santos, Antonina; Costa, Ana C.; Cunha, Marina R.; Costa, Filipe O.; Carvalho, Gary R.

2011-01-01

Background Decapods are the most recognizable of all crustaceans and comprise a dominant group of benthic invertebrates of the continental shelf and slope, including many species of economic importance. Of the 17635 morphologically described Decapoda species, only 5.4% are represented by COI barcode region sequences. It therefore remains a challenge to compile regional databases that identify and analyse the extent and patterns of decapod diversity throughout the world. Methodology/Principal Findings We contributed 101 decapod species from the North East Atlantic, the Gulf of Cadiz and the Mediterranean Sea, of which 81 species represent novel COI records. Within the newly-generated dataset, 3.6% of the species barcodes conflicted with the assigned morphological taxonomic identification, highlighting both the apparent taxonomic ambiguity among certain groups, and the need for an accelerated and independent taxonomic approach. Using the combined COI barcode projects from the Barcode of Life Database, we provide the most comprehensive COI data set so far examined for the Order (1572 sequences of 528 species, 213 genera, and 67 families). Patterns within families show a general predicted molecular hierarchy, but the scale of divergence at each taxonomic level appears to vary extensively between families. The range values of mean K2P distance observed were: within species 0.285% to 1.375%, within genus 6.376% to 20.924% and within family 11.392% to 25.617%. Nucleotide composition varied greatly across decapods, ranging from 30.8 % to 49.4 % GC content. Conclusions/Significance Decapod biological diversity was quantified by identifying putative cryptic species allowing a rapid assessment of taxon diversity in groups that have until now received limited morphological and systematic examination. We highlight taxonomic groups or species with unusual nucleotide composition or evolutionary rates. 
Such data are relevant to strategies for conservation of existing decapod biodiversity, as well as elucidating the mechanisms and constraints shaping the patterns observed. PMID:21589909
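The K2P distances and GC content quoted in this abstract can be illustrated with a short Python sketch. It assumes the standard Kimura two-parameter formula, d = −½ ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of transitions and transversions among compared sites; the function names are hypothetical and this is not the Barcode of Life Database implementation.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a non-ACGT character are skipped.
    Raises ValueError (from math.log or math.sqrt) if the pair is too
    divergent for the formula to be defined.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def gc_content(seq):
    """Fraction of G and C among the unambiguous (ACGT) bases of seq."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / called
```

Multiplying the returned distance by 100 gives a percentage comparable in form to the within-species, within-genus, and within-family ranges reported above.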

Synthesis and evaluations of an acid-cleavable, fluorescently labeled nucleotide as a reversible terminator for DNA sequencing.

PubMed

Tan, Lianjiang; Liu, Yazhi; Li, Xiaowei; Wu, Xin-Yan; Gong, Bing; Shen, Yu-Mei; Shao, Zhifeng

2016-02-11

An acid-cleavable linker based on a dimethylketal moiety was synthesized and used to connect a nucleotide with a fluorophore to produce a 3'-OH unblocked nucleotide analogue as an excellent reversible terminator for DNA sequencing by synthesis.
Rumen Bacterial Diversity of 80 to 110-Day-Old Goats Using 16S rRNA Sequencing

PubMed Central

Han, Xufeng; Yang, Yuxin; Yan, Hailong; Wang, Xiaolong; Qu, Lei; Chen, Yulin

2015-01-01

The ability of rumen microorganisms to use fibrous plant matter plays an important role in ruminant animals; however, little information about rumen colonization by microbial populations after weaning has been reported. In this study, high-throughput sequencing was used to investigate the establishment of this microbial population in 80 to 110-day-old goats. Illumina sequencing of goat rumen samples yielded 101,356,610 nucleotides that were assembled into 256,868 reads with an average read length of 394 nucleotides. Taxonomic analysis of metagenomic reads indicated that the predominant phyla were distinct at different growth stages. The phyla Firmicutes and Synergistetes were predominant in samples taken from 80 to 100-day-old goats, but Bacteroidetes and Firmicutes became the most abundant phyla in samples from 110-day-old animals. There was a remarkable variation in the microbial populations with age; Firmicutes and Synergistetes decreased after weaning, but Bacteroidetes and Proteobacteria increased from 80 to 110 days of age. These findings suggested that colonization of the rumen by microorganisms is related to their function in the rumen digestive system. These results give a better understanding of the role of rumen microbes and the establishment of the microbial population, which help to maintain the host’s health and improve animal performance. PMID:25700157
The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus

PubMed Central

Matsuda, Fumihiko; Ishii, Kazuo; Bourvagnet, Patrice; Kuma, Kei-ichi; Hayashida, Hidenori; Miyata, Takashi; Honjo, Tasuku

1998-01-01

The complete nucleotide sequence of the 957-kb DNA of the human immunoglobulin heavy chain variable (VH) region locus was determined and 43 novel VH segments were identified. The region contains 123 VH segments classifiable into seven different families, of which 79 are pseudogenes. Of the 44 VH segments with an open reading frame, 39 are expressed as heavy chain proteins and 1 as mRNA, while the remaining 4 are not found in immunoglobulin cDNAs. Combinatorial diversity of VH region was calculated to be ∼6,000. Conservation of the promoter and recombination signal sequences was observed to be higher in functional VH segments than in pseudogenes. Phylogenetic analysis of 114 VH segments clearly showed clustering of the VH segments of each family. However, an independent branch in the tree contained a single VH, V4-44.1P, sharing similar levels of homology to human VH families and to those of other vertebrates. Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago. One nonimmunoglobulin gene of unknown function was identified in the intergenic region. PMID:9841928
Prevalence of Tobacco mosaic virus in Iran and Evolutionary Analyses of the Coat Protein Gene

PubMed Central

Alishiri, Athar; Rakhshandehroo, Farshad; Zamanizadeh, Hamid-Reza; Palukaitis, Peter

2013-01-01

The incidence and distribution of Tobacco mosaic virus (TMV) and related tobamoviruses was determined using an enzyme-linked immunosorbent assay on 1,926 symptomatic horticultural crops and 107 asymptomatic weed samples collected from 78 highly infected fields in the major horticultural crop-producing areas in 17 provinces throughout Iran. The results were confirmed by host range studies and reverse transcription-polymerase chain reaction. The overall incidence of infection by these viruses in symptomatic plants was 11.3%. The coat protein (CP) gene sequences of a number of isolates were determined and showed high identity (up to 100%) among the Iranian isolates. Phylogenetic analysis of all known TMV CP genes showed three clades on the basis of nucleotide sequences with all Iranian isolates distinctly clustered in clade II. Analysis using the complete CP amino acid sequence showed one clade with two subgroups, IA and IB, with Iranian isolates in both subgroups. The nucleotide diversity within each sub-group was very low, but higher between the two clades. No correlation was found between genetic distance and geographical origin or host species of isolation. Statistical analyses suggested a negative selection and demonstrated the occurrence of gene flow from the isolates in other clades to the Iranian population. PMID:25288953
Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

PubMed Central

Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

2016-01-01

DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962
Molecular Cloning and Sequencing of Hemoglobin-Beta Gene of Channel Catfish, Ictalurus Punctatus Rafinesque

USDA-ARS's Scientific Manuscript database

Hemoglobin-β gene of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from head kidneys was isolated, reverse transcribed and amplified. The sequence of the channel catfish hemoglobin-β gene consists of 600 nucleotides. Analysis of the nucleotide sequence reveals one o...
Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.

PubMed

Zhang, Wei; Qi, Weihong; Albert, Thomas J; Motiwala, Alifiya S; Alland, David; Hyytia-Trees, Eija K; Ribot, Efrain M; Fields, Patricia I; Whittam, Thomas S; Swaminathan, Bala

2006-06-01

Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary beta-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens.
Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms

PubMed Central

Zhang, Wei; Qi, Weihong; Albert, Thomas J.; Motiwala, Alifiya S.; Alland, David; Hyytia-Trees, Eija K.; Ribot, Efrain M.; Fields, Patricia I.; Whittam, Thomas S.; Swaminathan, Bala

2006-01-01

Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary β-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens. PMID:16606700
Novel methodologies for spectral classification of exon and intron sequences

NASA Astrophysics Data System (ADS)

Kwan, Hon Keung; Kwan, Benjamin Y. M.; Kwan, Jennifer Y. Y.

2012-12-01

Digital processing of a nucleotide sequence requires it to be mapped to a numerical sequence in which the choice of nucleotide to numeric mapping affects how well its biological properties can be preserved and reflected from nucleotide domain to numerical domain. Digital spectral analysis of nucleotide sequences unfolds a period-3 power spectral value which is more prominent in an exon sequence as compared to that of an intron sequence. The success of a period-3 based exon and intron classification depends on the choice of a threshold value. The main purposes of this article are to introduce novel codes for 1-sequence numerical representations for spectral analysis and compare them to existing codes to determine appropriate representation, and to introduce novel thresholding methods for more accurate period-3 based exon and intron classification of an unknown sequence. The main findings of this study are summarized as follows: Among sixteen 1-sequence numerical representations, the K-Quaternary Code I offers an attractive performance. A windowed 1-sequence numerical representation (with window length of 9, 15, and 24 bases) offers a possible speed gain over non-windowed 4-sequence Voss representation which increases as sequence length increases. A winner threshold value (chosen from the best among two defined threshold values and one other threshold value) offers a top precision for classifying an unknown sequence of specified fixed lengths. An interpolated winner threshold value applicable to an unknown and arbitrary length sequence can be estimated from the winner threshold values of fixed length sequences with a comparable performance. In general, precision increases as sequence length increases. The study contributes an effective spectral analysis of nucleotide sequences to better reveal embedded properties, and has potential applications in improved genome annotation.
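The period-3 property described in this abstract can be sketched with the 4-sequence Voss representation that the abstract names as the baseline: each base gets a binary indicator sequence, and the power of the discrete Fourier transform at frequency index N/3 is summed over the four indicators. This is a minimal illustration, not the authors' method: the function names are hypothetical, the threshold is a caller-supplied assumption, and the K-Quaternary codes and winner-threshold rules of the article are not reproduced because the abstract does not define them.

```python
import cmath


def voss_indicators(seq):
    """Voss representation: one 0/1 indicator sequence per base A, C, G, T."""
    seq = seq.upper()
    return {base: [1.0 if s == base else 0.0 for s in seq] for base in "ACGT"}


def dft_power(x, k):
    """Squared magnitude of the k-th DFT coefficient of the real sequence x."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(x))
    return abs(coeff) ** 2


def period3_power(seq):
    """Total spectral power at frequency index N/3, summed over the four indicators."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    k = n // 3
    return sum(dft_power(x, k) for x in voss_indicators(seq).values())


def classify(seq, threshold):
    """Label a sequence exon-like if its length-normalised period-3 power exceeds threshold.

    The threshold is hypothetical here; choosing it well is the subject of the article.
    """
    return "exon-like" if period3_power(seq) / len(seq) > threshold else "intron-like"
```

A perfectly codon-periodic toy sequence such as `"ATG" * 10` concentrates its power at N/3 (a total of 300 for the 30-base example), whereas a period-2 sequence such as `"AC" * 15` has essentially none there.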
Complete nucleotide sequence of a novel Hibiscus-infecting Cilevirus from Florida and its relationship with closely associated Cileviruses

USDA-ARS's Scientific Manuscript database

The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...
40 CFR 174.3 - Definitions.

Code of Federal Regulations, 2010 CFR

2010-07-01

..., flowers, and pollen. Noncoding, nonexpressed nucleotide sequences means the nucleotide sequences are not... surgical alteration of the plant pistil, bud pollination, mentor pollen, immunosuppressants, in vitro...
40 CFR 174.3 - Definitions.

Code of Federal Regulations, 2012 CFR

2012-07-01

..., flowers, and pollen. Noncoding, nonexpressed nucleotide sequences means the nucleotide sequences are not... surgical alteration of the plant pistil, bud pollination, mentor pollen, immunosuppressants, in vitro...
40 CFR 174.3 - Definitions.

Code of Federal Regulations, 2013 CFR

2013-07-01

..., flowers, and pollen. Noncoding, nonexpressed nucleotide sequences means the nucleotide sequences are not... surgical alteration of the plant pistil, bud pollination, mentor pollen, immunosuppressants, in vitro...
40 CFR 174.3 - Definitions.

Code of Federal Regulations, 2011 CFR

2011-07-01

..., flowers, and pollen. Noncoding, nonexpressed nucleotide sequences means the nucleotide sequences are not... surgical alteration of the plant pistil, bud pollination, mentor pollen, immunosuppressants, in vitro...
40 CFR 174.3 - Definitions.

Code of Federal Regulations, 2014 CFR

2014-07-01

..., flowers, and pollen. Noncoding, nonexpressed nucleotide sequences means the nucleotide sequences are not... surgical alteration of the plant pistil, bud pollination, mentor pollen, immunosuppressants, in vitro...
Association of candidate genes with drought tolerance traits in diverse perennial ryegrass accessions

PubMed Central

Jiang, Yiwei

2013-01-01

Drought is a major environmental stress limiting growth of perennial grasses in temperate regions. Plant drought tolerance is a complex trait that is controlled by multiple genes. Candidate gene association mapping provides a powerful tool for dissection of complex traits. Candidate gene association mapping of drought tolerance traits was conducted in 192 diverse perennial ryegrass (Lolium perenne L.) accessions from 43 countries. The panel showed significant variations in leaf wilting, leaf water content, canopy and air temperature difference, and chlorophyll fluorescence under well-watered and drought conditions across six environments. Analysis of 109 simple sequence repeat markers revealed five population structures in the mapping panel. A total of 2520 expression-based sequence readings were obtained for a set of candidate genes involved in antioxidant metabolism, dehydration, water movement across membranes, and signal transduction, from which 346 single nucleotide polymorphisms were identified. Significant associations were identified between a putative LpLEA3 encoding late embryogenesis abundant group 3 protein and a putative LpFeSOD encoding iron superoxide dismutase and leaf water content, as well as between a putative LpCyt Cu-ZnSOD encoding cytosolic copper-zinc superoxide dismutase and chlorophyll fluorescence under drought conditions. Four of these identified significantly associated single nucleotide polymorphisms from these three genes were also translated to amino acid substitutions in different genotypes. These results indicate that allelic variation in these genes may affect whole-plant response to drought stress in perennial ryegrass. PMID:23386684
Genome sequences of a mouse-avirulent and a mouse-virulent strain of Ross River virus.

PubMed

Faragher, S G; Meek, A D; Rice, C M; Dalgarno, L

1988-04-01

The nucleotide sequence of the genomic RNA of a mouse-avirulent strain of Ross River virus, RRV NB5092 (isolated in 1969), has been determined and the corresponding sequence for the prototype mouse-virulent strain, RRV T48 (isolated in 1959), has been completed. The RRV NB5092 genome is approximately 11,674 nucleotides in length, compared with 11,853 nucleotides for RRV T48. RRV NB5092 and RRV T48 have the same genome organization. For both viruses an untranslated region of 80 nucleotides at the 5' end of the genome is followed by a 7440-nucleotide open reading frame which is interrupted after 5586 nucleotides by a single opal termination codon. By homology with other alphaviruses, the 5586-nucleotide open reading frame encodes the nonstructural proteins nsP1, nsP2, and nsP3; a fourth nonstructural protein, nsP4, is produced by read-through of the opal codon. The RRV nonstructural proteins show strong homology with the corresponding proteins of Sindbis virus and Semliki Forest virus in terms of size, net charge, and hydropathy characteristics. However, homology is not uniform between or within the proteins; nsP1, nsP2, and nsP4 contain extended domains which are highly conserved between alphaviruses, while the C-terminal region of nsP3 shows little conservation in sequence or length between alphaviruses. An untranslated "junction" region of 44 nucleotides (for RRV NB5092) or 47 nucleotides (for RRV T48) separates the nonstructural and structural protein coding regions. The structural proteins (capsid-E3-E2-6K-E1) are translated from an open reading frame of 3762 nucleotides which is followed by a 3'-untranslated region of approximately 348 nucleotides (for RRV NB5092) or 524 nucleotides (for RRV T48). Excluding deletions and insertions, the genomes of RRV NB5092 and RRV T48 differ at 284 nucleotides, representing a sequence divergence of 2.38%. 
Sequence deletions or insertions were found only in the noncoding regions and include a 173-nucleotide deletion in the 3'-untranslated region of RRV NB5092, compared with RRV T48. In the coding regions, most of the nucleotide differences are silent; there are 36 amino acid differences in the nonstructural proteins and 12 in the structural proteins. The distribution of amino acid differences between the two RRV strains correlates with the location of domains which are poorly conserved in sequence between alphaviruses. The possible role of amino acid differences in envelope glycoproteins E1 and E2 in determining the different antigenic and biological properties of RRV NB5092 and RRV T48 is discussed.
Whole-Genome Sequencing and Variant Analysis of Human Papillomavirus 16 Infections.

PubMed

van der Weele, Pascal; Meijer, Chris J L M; King, Audrey J

2017-10-01

Human papillomavirus (HPV) is a strongly conserved DNA virus, high-risk types of which can cause cervical cancer in persistent infections. The most common type found in HPV-attributable cancer is HPV16, which can be subdivided into four lineages (A to D) with different carcinogenic properties. Studies have shown HPV16 sequence diversity in different geographical areas, but only limited information is available regarding HPV16 diversity within a population, especially at the whole-genome level. We analyzed HPV16 major variant diversity and conservation in persistent infections and performed a single nucleotide polymorphism (SNP) comparison between persistent and clearing infections. Materials were obtained in the Netherlands from a cohort study with longitudinal follow-up for up to 3 years. Our analysis shows a remarkably large variant diversity in the population. Whole-genome sequences were obtained for 57 persistent and 59 clearing HPV16 infections, resulting in 109 unique variants. Interestingly, persistent infections were completely conserved through time. One reinfection event was identified where the initial and follow-up samples clustered differently. Non-A1/A2 variants seemed to clear preferentially (P = 0.02). Our analysis shows that population-wide HPV16 sequence diversity is very large. In persistent infections, the HPV16 sequence was fully conserved. Sequencing can identify HPV16 reinfections, although occurrence is rare. SNP comparison identified no strongly acting effect of the viral genome affecting HPV16 infection clearance or persistence in up to 3 years of follow-up. These findings suggest the progression of an early HPV16 infection could be host related. IMPORTANCE Human papillomavirus 16 (HPV16) is the predominant type found in cervical cancer. 
Progression of initial infection to cervical cancer has been linked to sequence properties; however, knowledge of variants circulating in European populations, especially with longitudinal follow-up, is limited. By sequencing a number of infections with known follow-up for up to 3 years, we gained initial insights into the genetic diversity of HPV16 and the effects of the viral genome on the persistence of infections. A SNP comparison between sequences obtained from clearing and persistent infections did not identify strongly acting DNA variations responsible for these infection outcomes. In addition, we identified an HPV16 reinfection event where sequencing of initial and follow-up samples showed different HPV16 variants. Based on conventional genotyping, this infection would incorrectly be considered a persistent HPV16 infection. In the context of vaccine efficacy and monitoring studies, such infections could potentially cause reduced reported efficacy or efficiency. Copyright © 2017 van der Weele et al.
Extensive Variation and Sub-Structuring in Lineage A mtDNA in Indian Sheep: Genetic Evidence for Domestication of Sheep in India

PubMed Central

Singh, Sachin; Kumar Jr, Satish; Kolte, Atul P.; Kumar, Satish

2013-01-01

Previous studies on mitochondrial DNA analysis of sheep from different regions of the world have revealed two major maternal lineages (A and B) and three minor ones (C, D and E). Lineage A is more frequent in Asia and lineage B is more abundant in regions other than Asia. We have analyzed mitochondrial DNA sequences of 330 sheep from 12 different breeds of India. Neighbor-joining analysis revealed lineages A, B and C in Indian sheep. Surprisingly, a multidimensional scaling plot based on FST values of control region mtDNA sequences showed significant breed differentiation, in contrast to the poor geographical structuring reported earlier in this species. The breed differentiation in Indian sheep was essentially due to the variable contribution of the two major lineages to different breeds and to sub-structuring of lineage A, the latter possibly resulting from genetic drift. Nucleotide diversity of this lineage was higher in Indian sheep (0.014 ± 0.007) than in sheep from other regions of the world (0.009 ± 0.005 to 0.01 ± 0.005). Reduced median network analysis of control region and cytochrome b gene sequences of Indian sheep, analyzed along with available published sequences of sheep from other regions of the world, showed that several haplotypes of lineage A were exclusive to Indian sheep. Given the high nucleotide diversity in Indian sheep and the poor sharing of lineage A haplotypes between Indian and non-Indian sheep, we propose that lineage A sheep were also domesticated east of the Near East, possibly in the Indian subcontinent. Finally, our data support the view that lineage B and additional lineage A haplotypes of sheep might have been introduced to the Indian subcontinent from the Near East, probably via an ancient sea trade route. PMID:24244282
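The nucleotide diversity values quoted above (for example 0.014 for lineage A in Indian sheep) are averages of pairwise differences per site. As a rough illustration only, the following Python sketch computes that statistic for a small aligned sample. The function name and the toy sequences are assumptions made for this example and are not data or code from the study; gaps and ambiguous bases are treated as ordinary characters, which real analyses usually handle more carefully.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count sites at which this pair of sequences differs.
        diffs = sum(1 for x, y in zip(a, b) if x != y)
        total += diffs / length
        n_pairs += 1
    return total / n_pairs


# Toy alignment (invented for illustration, not mtDNA from the study).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(nucleotide_diversity(toy))
```

In this toy alignment the three pairs differ at 1, 1 and 2 of 10 sites, so the function averages 0.1, 0.1 and 0.2.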
Genetic Diversity Among Botulinum Neurotoxin Producing Clostridial Strains

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hill, K K; Smith, T J; Helma, C H

2006-07-06

Clostridium botulinum is a taxonomic designation for many diverse anaerobic spore-forming rod-shaped bacteria which have the common property of producing botulinum neurotoxins (BoNTs). The BoNTs are exoneurotoxins that can cause severe paralysis and even death in humans and various other animal species. A collection of 174 C. botulinum strains was examined by amplified fragment length polymorphism (AFLP) analysis and by sequencing of the 16S rRNA gene and BoNT genes to examine genetic diversity within this species. This collection contained representatives of each of the seven different serotypes of botulinum neurotoxins (BoNT A-G). Analysis of the 16S rRNA sequences confirmed earlier reports of at least four distinct genomic backgrounds (Groups I-IV), each of which has independently acquired one or more BoNT serotypes through horizontal gene transfer. AFLP analysis provided higher resolution, and can be used to further subdivide the four groups into sub-groups. Sequencing of the BoNT genes from serotypes A, B and E in multiple strains confirmed significant sequence variation within each serotype. Four distinct lineages within each of the BoNT A and B serotypes, and five distinct lineages of serotype E strains were identified. The nucleotide sequences of the seven serotypes of BoNT were compared and show varying degrees of interrelatedness and recombination, as has been previously noted for the NTNH gene, which is linked to BoNT. These analyses contribute to the understanding of the evolution and phylogeny within this species and assist in the development of improved diagnostics and therapeutics for treatment of botulism.

Isolating a functionally relevant guild of fungi from the root microbiome of Populus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bonito, Gregory; Hameed, Khalid; Ventura, Rafael

Plant roots interact with a bewilderingly complex community of microbes, including root-associated fungi that are essential for maintaining plant health. To improve understanding of the diversity of fungi in the rhizobiome of Populus deltoides, Populus trichocarpa and co-occurring plant hosts Quercus alba and Pinus taeda, we conducted field and greenhouse studies and sampled, isolated, and characterized the diversity of culturable root-associated fungi on these hosts. Using both general and selective isolation media we obtained more than 1800 fungal isolates from individual surface-sterilized root tips. Sequences from the ITS and/or D1–D2 regions of the LSU rDNA were obtained from 1042 of the >1800 pure culture isolates and were compared to accessions in the NCBI nucleotide database and analyzed through phylogenetics for preliminary taxonomic identification. Sequences from these isolates were also compared to 454 sequence datasets obtained directly from the Populus rhizosphere and endosphere. Although most of the ectomycorrhizal taxa known to associate with Populus evaded isolation, many of the abundant sequence types from rhizosphere and endosphere 454 datasets were isolated, including novel species belonging to the Atractiellales. Isolation and identification of key endorrhizal fungi will enable more targeted study of plant-fungal interactions. Genome sequencing is currently underway for a subset of our culture library with the aim of understanding the mechanisms involved in host-endophyte establishment and function. As a result, this diverse culture library of fungal root associates will be a valuable resource for metagenomic research, experimentation and further studies on plant-fungal interactions.
Isolating a functionally relevant guild of fungi from the root microbiome of Populus

DOE PAGES

Bonito, Gregory; Hameed, Khalid; Ventura, Rafael; ...

2016-05-27

Plant roots interact with a bewilderingly complex community of microbes, including root-associated fungi that are essential for maintaining plant health. To improve understanding of the diversity of fungi in the rhizobiome of Populus deltoides, Populus trichocarpa and co-occurring plant hosts Quercus alba and Pinus taeda, we conducted field and greenhouse studies and sampled, isolated, and characterized the diversity of culturable root-associated fungi on these hosts. Using both general and selective isolation media we obtained more than 1800 fungal isolates from individual surface-sterilized root tips. Sequences from the ITS and/or D1–D2 regions of the LSU rDNA were obtained from 1042 of the >1800 pure culture isolates and were compared to accessions in the NCBI nucleotide database and analyzed through phylogenetics for preliminary taxonomic identification. Sequences from these isolates were also compared to 454 sequence datasets obtained directly from the Populus rhizosphere and endosphere. Although most of the ectomycorrhizal taxa known to associate with Populus evaded isolation, many of the abundant sequence types from rhizosphere and endosphere 454 datasets were isolated, including novel species belonging to the Atractiellales. Isolation and identification of key endorrhizal fungi will enable more targeted study of plant-fungal interactions. Genome sequencing is currently underway for a subset of our culture library with the aim of understanding the mechanisms involved in host-endophyte establishment and function. As a result, this diverse culture library of fungal root associates will be a valuable resource for metagenomic research, experimentation and further studies on plant-fungal interactions.
Ploidy Variation in Kluyveromyces marxianus Separates Dairy and Non-dairy Isolates

PubMed Central

Ortiz-Merino, Raúl A.; Varela, Javier A.; Coughlan, Aisling Y.; Hoshida, Hisashi; da Silveira, Wendel B.; Wilde, Caroline; Kuijpers, Niels G. A.; Geertman, Jan-Maarten; Wolfe, Kenneth H.; Morrissey, John P.

2018-01-01

Kluyveromyces marxianus is traditionally associated with fermented dairy products, but can also be isolated from diverse non-dairy environments. Because of thermotolerance, rapid growth and other traits, many different strains are being developed for food and industrial applications but there is, as yet, little understanding of the genetic diversity or population genetics of this species. K. marxianus shows a high level of phenotypic variation but the only phenotype that has been clearly linked to a genetic polymorphism is lactose utilisation, which is controlled by variation in the LAC12 gene. The genomes of several strains have been sequenced in recent years and, in this study, we sequenced a further nine strains from different origins. Analysis of the Single Nucleotide Polymorphisms (SNPs) in 14 strains was carried out to examine genome structure and genetic diversity. SNP diversity in K. marxianus is relatively high, with up to 3% DNA sequence divergence between alleles. It was found that the isolates include haploid, diploid, and triploid strains, as shown by both SNP analysis and flow cytometry. Diploids and triploids contain long genomic tracts showing loss of heterozygosity (LOH). All six isolates from dairy environments were diploid or triploid, whereas 6 out of 7 isolates from non-dairy environments were haploid. This also correlated with the presence of functional LAC12 alleles only in dairy haplotypes. The diploids were hybrids between a non-dairy and a dairy haplotype, whereas triploids included three copies of a dairy haplotype. PMID:29619042
Exploring origins, invasion history and genetic diversity of Imperata cylindrica (L.) P. Beauv. (Cogongrass) in the United States using genotyping by sequencing.

PubMed

Burrell, A Millie; Pepper, Alan E; Hodnett, George; Goolsby, John A; Overholt, William A; Racelis, Alexis E; Diaz, Rodrigo; Klein, Patricia E

2015-05-01

Imperata cylindrica (Cogongrass, Speargrass) is a diploid C4 grass that is a noxious weed in 73 countries and constitutes a significant threat to global biodiversity and sustainable agriculture. We used a cost-effective genotyping-by-sequencing (GBS) approach to identify the reproductive system, genetic diversity and geographic origins of invasions in the south-eastern United States. In this work, we demonstrated the advantage of employing the closely related, fully sequenced crop species Sorghum bicolor (L.) Moench as a proxy reference genome to identify a set of 2320 informative single nucleotide and insertion-deletion polymorphisms. Genetic analyses identified four clonal lineages of cogongrass and one clonal lineage of Imperata brasiliensis Trin. in the United States. Each lineage was highly homogeneous, and we found no evidence of hybridization among the different lineages, despite geographical overlap. We found evidence that at least three of these lineages showed clonal reproduction prior to introduction to the United States. These results indicate that cogongrass has limited evolutionary potential to adapt to novel environments and further suggest that upon arrival to its invaded range, this species did not require local adaptation through hybridization/introgression or selection of favourable alleles from a broad genetic base. Thus, cogongrass presents a clear case of broad invasive success, across a diversity of environments, in a clonal organism with limited genetic diversity. © 2015 John Wiley & Sons Ltd.
Innate Immune Complexity in the Purple Sea Urchin: Diversity of the Sp185/333 System

PubMed Central

Smith, L. Courtney

2012-01-01

The California purple sea urchin, Strongylocentrotus purpuratus, is a long-lived echinoderm with a complex and sophisticated innate immune system. There are several large gene families that function in immunity in this species including the Sp185/333 gene family that has ∼50 (±10) members. The family shows intriguing sequence diversity and encodes a broad array of diverse yet similar proteins. The genes have two exons of which the second encodes the mature protein and has repeats and blocks of sequence called elements. Mosaics of element patterns plus single nucleotide polymorphism-based variants of the elements result in significant sequence diversity among the genes yet maintain a similar structure among the members of the family. Sequence of a bacterial artificial chromosome insert shows a cluster of six tightly linked Sp185/333 genes that are flanked by GA microsatellites. The sequences between the GA microsatellites, in which the Sp185/333 genes and flanking regions are located, are much more similar to each other than are the sequences outside the microsatellites, suggesting processes such as gene conversion, recombination, or duplication. However, close linkage does not correspond with greater sequence similarity compared to randomly cloned and sequenced genes that are unlikely to be linked. There are three segmental duplications that are bounded by GAT microsatellites and include three almost identical genes plus flanking regions. RNA editing is detectable throughout the mRNAs based on comparisons to the genes, which, in combination with putative post-translational modifications to the proteins, results in broad arrays of Sp185/333 proteins that differ among individuals. The mature proteins have an N-terminal glycine-rich region, a central RGD motif, and a C-terminal histidine-rich region. The Sp185/333 proteins are localized to the cell surface and are found within vesicles in subsets of polygonal and small phagocytes. The coelomocyte proteome shows full-length and truncated proteins, including some with missense sequence. Current results suggest that both native Sp185/333 proteins and a recombinant protein bind bacteria and are likely important in sea urchin innate immunity. PMID:22566951
The complete nucleotide sequence of the glnALG operon of Escherichia coli K12.

PubMed Central

Miranda-Ríos, J; Sánchez-Pescador, R; Urdea, M; Covarrubias, A A

1987-01-01

The nucleotide sequence of the E. coli glnALG operon has been determined. The glnL (ntrB) and glnG (ntrC) genes show high homology, at the nucleotide and amino acid levels, with the corresponding genes of Klebsiella pneumoniae. The predicted amino acid sequence for glutamine synthetase allowed us to locate some of the enzyme domains. The structure of this operon is discussed. PMID:2882477
The nucleotide sequences of 5S rRNAs from a rotifer, Brachionus plicatilis, and two nematodes, Rhabditis tokai and Caenorhabditis elegans.

PubMed

Kumazaki, T; Hori, H; Osawa, S; Ishii, N; Suzuki, K

1982-11-11

The nucleotide sequences of 5S rRNAs from a rotifer, Brachionus plicatilis, and two nematodes, Rhabditis tokai and Caenorhabditis elegans, have been determined. The rotifer has two 5S rRNA species that are composed of 120 and 121 nucleotides, respectively. The sequences of these two 5S rRNAs are the same except that the latter has an additional base at its 3'-terminus. The 5S rRNAs from the two nematode species are both 119 nucleotides long. The percent sequence similarities among these three species are 79% (Brachionus/Rhabditis), 80% (Brachionus/Caenorhabditis), and 95% (Rhabditis/Caenorhabditis). Brachionus revealed the highest similarity to Lingula (89%), but not to the nematodes (79%).
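Percent similarity figures such as those reported for the 5S rRNAs can be illustrated with a simple identity count over an alignment. The sketch below is a minimal Python example under stated assumptions: the two sequences are already aligned to equal length, every column (including any gap character) counts toward the denominator, and the function name and toy strings are invented here rather than taken from the paper.

```python
def percent_identity(a, b):
    """Percentage of aligned positions at which two sequences carry the same character."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Toy RNA fragments (hypothetical, not the published 5S rRNA sequences).
frag_1 = "GCCUACGGCC"
frag_2 = "GCCUACGACC"
print(round(percent_identity(frag_1, frag_2), 1))
```

Sequences of different lengths, such as the 119- and 120-nucleotide molecules described above, would first need to be aligned; how gaps are scored then changes the reported percentage.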
A global reference for human genetic variation

PubMed Central

2016-01-01

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. PMID:26432245
Development and characterization of novel EST-SSR markers and their application for genetic diversity analysis of Jerusalem artichoke (Helianthus tuberosus L.).

PubMed

Mornkham, T; Wangsomnuk, P P; Mo, X C; Francisco, F O; Gao, L Z; Kurzweil, H

2016-10-24

Jerusalem artichoke (Helianthus tuberosus L.) is a perennial tuberous plant and a traditional inulin-rich crop in Thailand. It has become the most important source of inulin and has great potential for use in chemical and food industries. In this study, expressed sequence tag (EST)-based simple sequence repeat (SSR) markers were developed from 40,362 Jerusalem artichoke ESTs retrieved from the NCBI database. Among 23,691 non-redundant identified ESTs, 1949 SSR motifs harboring 2 to 6 nucleotides with varied repeat motifs were discovered from 1676 assembled sequences. Seventy-nine primer pairs were generated from EST sequences harboring SSR motifs. Our results show that 43 primer pairs were polymorphic across the six studied populations, while the remaining 36 were either monomorphic or failed to amplify. These 43 SSR loci exhibited a high level of genetic diversity among populations, with allele numbers varying from 2 to 7, with an average of 3.95 alleles per locus. Heterozygosity ranged from 0.096 to 0.774, with an average of 0.536; polymorphism information content (PIC) ranged from 0.096 to 0.854, with an average of 0.568. Principal component analysis and neighbor-joining analysis revealed that the six populations could be divided into six clusters. Our results indicate that these newly characterized EST-SSR markers may be useful in the exploration of genetic diversity and range expansion of the Jerusalem artichoke, and in cross-species application for the genus Helianthus.
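Per-locus heterozygosity and PIC values of the kind summarized above are functions of allele frequencies. The following Python sketch shows the commonly used textbook definitions, expected heterozygosity as one minus the sum of squared allele frequencies and PIC in the Botstein et al. form. It assumes these are the definitions intended; the abstract does not say which estimators or software the authors used, and the function names and frequencies are illustrative only.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies at one SSR locus with four alleles.
locus = [0.4, 0.3, 0.2, 0.1]
print(expected_heterozygosity(locus))
print(polymorphism_information_content(locus))
```

For a locus with two equally frequent alleles these definitions give He = 0.5 and PIC = 0.375.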
Characterization of Sarcocystis from four species of hawks from Georgia, USA.

PubMed

Yabsley, Michael J; Ellis, Angela E; Stallknecht, David E; Howerth, Elizabeth W

2009-02-01

During 2001 to 2004, 4 species of hawks (Buteo and Accipiter spp.) from Georgia were surveyed for Sarcocystis spp. infections by examining intestinal sections. In total, 159 of 238 (66.8%) hawks examined were infected with Sarcocystis spp. Samples from 10 birds were characterized by sequence analysis of a portion of the 18S rRNA gene (783 base pairs). Only 3 of the 10 sequences from the hawks were identical; the remainder differed by at least 1 nucleotide. Phylogenetic analysis failed to resolve the position of the hawk Sarcocystis species, but they were closely related to several Sarcocystis species from raptors, rodents, and Sarcocystis neurona. The high genetic diversity of Sarcocystis suggests that more than 1 species infects these 4 hawk species; however, additional molecular or experimental work will be required to determine the speciation and diversity of parasites infecting these avian hosts. In addition to assisting with determining species richness of Sarcocystis in raptors, molecular analysis should be useful in the identification of potential intermediate hosts.
Haplotag: Software for Haplotype-Based Genotyping-by-Sequencing Analysis

PubMed Central

Tinker, Nicholas A.; Bekele, Wubishet A.; Hattori, Jiro

2016-01-01

Genotyping-by-sequencing (GBS), and related methods, are based on high-throughput short-read sequencing of genomic complexity reductions followed by discovery of single nucleotide polymorphisms (SNPs) within sequence tags. This provides a powerful and economical approach to whole-genome genotyping, facilitating applications in genomics, diversity analysis, and molecular breeding. However, due to the complexity of analyzing large data sets, applications of GBS may require substantial time, expertise, and computational resources. Haplotag, the novel GBS software described here, is freely available, and operates with minimal user-investment on widely available computer platforms. Haplotag is unique in fulfilling the following set of criteria: (1) operates without a reference genome; (2) can be used in a polyploid species; (3) provides a discovery mode, and a production mode; (4) discovers polymorphisms based on a model of tag-level haplotypes within sequenced tags; (5) reports SNPs as well as haplotype-based genotypes; and (6) provides an intuitive visual “passport” for each inferred locus. Haplotag is optimized for use in a self-pollinating plant species. PMID:26818073
Computational Analysis of Mouse piRNA Sequence and Biogenesis

PubMed Central

Betel, Doron; Sheridan, Robert; Marks, Debora S; Sander, Chris

2007-01-01

The recent discovery of a new class of 30-nucleotide long RNAs in mammalian testes, called PIWI-interacting RNA (piRNA), with similarities to microRNAs and repeat-associated small interfering RNAs (rasiRNAs), has raised puzzling questions regarding their biogenesis and function. We report a comparative analysis of currently available piRNA sequence data from the pachytene stage of mouse spermatogenesis that sheds light on their sequence diversity and mechanism of biogenesis. We conclude that (i) there are at least four times as many piRNAs in mouse testes as currently known; (ii) piRNAs, which originate from long precursor transcripts, are generated by quasi-random enzymatic processing that is guided by a weak sequence signature at the piRNA 5′ ends, resulting in a large number of distinct sequences; and (iii) many of the piRNA clusters contain inverted repeat segments capable of forming double-strand RNA fold-back segments that may initiate piRNA processing analogous to transposon silencing. PMID:17997596
Whole genome sequences of Japanese porcine species C rotaviruses reveal a high diversity of genotypes of individual genes and will contribute to a comprehensive, generally accepted classification system.

PubMed

Niira, Kazutaka; Ito, Mika; Masuda, Tsuneyuki; Saitou, Toshiya; Abe, Tadatsugu; Komoto, Satoshi; Sato, Mitsuo; Yamasato, Hiroshi; Kishimoto, Mai; Naoi, Yuki; Sano, Kaori; Tuchiaka, Shinobu; Okada, Takashi; Omatsu, Tsutomu; Furuya, Tetsuya; Aoki, Hiroshi; Katayama, Yukie; Oba, Mami; Shirai, Junsuke; Taniguchi, Koki; Mizutani, Tetsuya; Nagai, Makoto

2016-10-01

Porcine rotavirus C (RVC) is distributed throughout the world and is thought to be a pathogenic agent of diarrhea in piglets. Although the VP7, VP4, and VP6 gene sequences of Japanese porcine RVCs are currently available, there are no whole-genome sequence data for Japanese RVC. Furthermore, only one to three sequences are available for porcine RVC VP1-VP3 and NSP1-NSP3 genes. Therefore, we determined nearly full-length whole-genome sequences of nine Japanese porcine RVCs from seven piglets with diarrhea and two healthy pigs and compared them with published RVC sequences from a database. The VP7 genes of two Japanese RVCs from healthy pigs were highly divergent from other known RVC strains and were provisionally classified as G12 and G13 based on the 86% nucleotide identity cut-off value. Pairwise sequence identity calculations and phylogenetic analyses revealed candidate novel genotypes of Japanese porcine RVC in the NSP1-, NSP2- and NSP3-encoding genes. Furthermore, VP3 of Japanese porcine RVCs was shown to be closely related to human RVCs, suggesting a gene reassortment event between porcine and human RVCs and past interspecies transmission. The present study demonstrated that porcine RVCs show greater genetic diversity among strains than human and bovine RVCs. Copyright © 2016 Elsevier B.V. All rights reserved.
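The cut-off logic mentioned above (a VP7 sequence below 86% nucleotide identity to every known genotype is treated as a candidate new genotype) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: sequences are assumed to be pre-aligned to equal length, the reference names "G1" and "G2" and all sequences are invented, and real genotype assignment also relies on phylogenetic analysis.

```python
def identity(a, b):
    """Fraction of aligned positions that are identical."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be aligned to the same non-zero length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def assign_genotype(query, references, cutoff=0.86):
    """Return (best matching genotype or None, best identity).

    None means no reference reached the cut-off, i.e. a candidate novel genotype.
    """
    best_name, best_identity = None, 0.0
    for name, ref in references.items():
        value = identity(query, ref)
        if value > best_identity:
            best_name, best_identity = name, value
    if best_identity >= cutoff:
        return best_name, best_identity
    return None, best_identity


# Hypothetical 20-nt reference fragments and queries, for illustration only.
references = {"G1": "ACGTACGTACGTACGTACGT", "G2": "TTTTTTTTTTGGGGGGGGGG"}
close_query = "TCGTTCGTACGTACGTACGT"    # 2 mismatches to G1 (identity 0.90)
distant_query = "TCGTTCGTTCGTTCGTACGT"  # 4 mismatches to G1 (identity 0.80)
print(assign_genotype(close_query, references))
print(assign_genotype(distant_query, references))
```

The cut-off is passed as a parameter because such thresholds are gene-specific; only the 86% value for VP7 comes from the abstract.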
Diversity and Molecular Phylogeny of Mitochondrial DNA of Rhesus Macaques (Macaca mulatta) in Bangladesh

PubMed Central

Hasan, M. Kamrul; Feeroz, M. Mostafa; Jones-Engel, Lisa; Engel, Gregory A.; Kanthaswamy, Sree; Smith, David Glenn

2015-01-01

While studies of rhesus macaques (Macaca mulatta) in the eastern (e.g., China) and western (e.g., India) parts of their geographic range have revealed major genetic differences that warrant the recognition of two different subspecies, little is known about genetic characteristics of rhesus macaques in the transitional zone extending from eastern India and Bangladesh through the northern part of Indo-China, the probable original homeland of the species. We analyzed genetic variation of 762 base pairs of mitochondrial DNA from 86 fecal swab samples and 19 blood samples from 25 local populations of rhesus macaque in Bangladesh collected from January 2010 to August 2012. These sequences were compared with those of rhesus macaques from India, China, and Myanmar. Forty-six haplotypes defined by 200 (26%) polymorphic nucleotide sites were detected. Estimates of gene diversity, expected heterozygosity, and nucleotide diversity for the total population were 0.9599 ± 0.0097, 0.0193 ± 0.0582, and 0.0196 ± 0.0098, respectively. A mismatch distribution of paired nucleotide differences yielded a statistically significantly negative value of Tajima's D, reflecting a population that rapidly expanded after the terminal Pleistocene. Most haplotypes throughout regions of Bangladesh, including an isolated region in the southwestern area (Sundarbans), clustered with haplotypes assigned to the minor haplogroup Ind-2 from India reflecting an east to west dispersal of rhesus macaques to India. Haplotypes from the southeast region of Bangladesh formed a cluster with those from Myanmar, and represent the oldest rhesus macaque haplotypes of Bangladesh. These results are consistent with the hypothesis that rhesus macaques first entered Bangladesh from the southeast, probably from Indo-China, then dispersed westward throughout eastern and central India. PMID:24810278
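The gene (haplotype) diversity reported above, 0.9599 for 46 haplotypes, is conventionally Nei's unbiased estimator computed from haplotype counts. The Python sketch below shows that estimator on invented labels; it assumes this is the definition used, which the abstract does not state explicitly, and the function name and sample are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's estimator: H = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample of haplotype labels (not the macaque data).
sample = ["h1", "h1", "h2", "h3"]
print(haplotype_diversity(sample))
```

The statistic approaches 1 when nearly every sampled individual carries a different haplotype and is 0 when all individuals share one.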
Diversity and molecular phylogeny of mitochondrial DNA of rhesus macaques (Macaca mulatta) in Bangladesh.

PubMed

Hasan, M Kamrul; Feeroz, M Mostafa; Jones-Engel, Lisa; Engel, Gregory A; Kanthaswamy, Sree; Smith, David Glenn

2014-11-01

While studies of rhesus macaques (Macaca mulatta) in the eastern (e.g., China) and western (e.g., India) parts of their geographic range have revealed major genetic differences that warrant the recognition of two different subspecies, little is known about genetic characteristics of rhesus macaques in the transitional zone extending from eastern India and Bangladesh through the northern part of Indo-China, the probable original homeland of the species. We analyzed genetic variation of 762 base pairs of mitochondrial DNA from 86 fecal swab samples and 19 blood samples from 25 local populations of rhesus macaque in Bangladesh collected from January 2010 to August 2012. These sequences were compared with those of rhesus macaques from India, China, and Myanmar. Forty-six haplotypes defined by 200 (26%) polymorphic nucleotide sites were detected. Estimates of gene diversity, expected heterozygosity, and nucleotide diversity for the total population were 0.9599 ± 0.0097, 0.0193 ± 0.0582, and 0.0196 ± 0.0098, respectively. A mismatch distribution of paired nucleotide differences yielded a statistically significantly negative value of Tajima's D, reflecting a population that rapidly expanded after the terminal Pleistocene. Most haplotypes throughout regions of Bangladesh, including an isolated region in the southwestern area (Sundarbans), clustered with haplotypes assigned to the minor haplogroup Ind-2 from India reflecting an east to west dispersal of rhesus macaques to India. Haplotypes from the southeast region of Bangladesh formed a cluster with those from Myanmar, and represent the oldest rhesus macaque haplotypes of Bangladesh. These results are consistent with the hypothesis that rhesus macaques first entered Bangladesh from the southeast, probably from Indo-China, then dispersed westward throughout eastern and central India. © 2014 Wiley Periodicals, Inc.
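Tajima's D, cited above as evidence of population expansion, contrasts the mean number of pairwise differences with the number of segregating sites scaled by sample size. The following Python sketch implements the standard formula from Tajima (1989) for a small alignment. It is a teaching example on made-up sequences, not a reconstruction of the authors' analysis, and it assumes complete, gap-free aligned data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (n >= 4)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences for a non-zero variance")
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # k: mean number of pairwise differences (not divided by sequence length).
    pairs = list(combinations(seqs, 2))
    k = sum(sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    return (k - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Toy sample in which every variant is a singleton, the excess of rare
# variants that pushes D below zero (invented data).
singletons = ["AAAAAAAA", "TAAAAAAA", "ATAAAAAA", "AATAAAAA", "AAATAAAA"]
print(tajimas_d(singletons))
```

A negative value arises when rare variants are over-represented relative to neutral expectation, which is one signature of recent expansion; judging significance requires comparison with a null distribution, which this sketch does not attempt.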
The EMBL nucleotide sequence database

PubMed Central

Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann

2001-01-01

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039
Interactive computer programs for the graphic analysis of nucleotide sequence data.

PubMed Central

Luckow, V A; Littlewood, R K; Rownd, R H

1984-01-01

A group of interactive computer programs has been developed which aid in the collection and graphical analysis of nucleotide and protein sequence data. The programs perform the following basic functions: a) enter, edit, list, and rearrange sequence data; b) permit automatic entry of nucleotide sequence data directly from an autoradiograph into the computer; c) search for restriction sites or other specified patterns and plot a linear or circular restriction map, or print their locations; d) plot base composition; e) analyze homology between sequences by plotting a two-dimensional graphic matrix; and f) aid in plotting predicted secondary structures of RNA molecules. PMID:6546437
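Two of the functions listed above, searching for restriction sites (c) and reporting base composition (d), are easy to illustrate. The Python sketch below is a modern stand-in written for this purpose, not the 1984 programs themselves. The function names and the toy sequence are assumptions; GAATTC is the EcoRI recognition site.

```python
from collections import Counter


def find_sites(seq, pattern):
    """Return 0-based start positions of every match, including overlapping ones."""
    positions = []
    start = seq.find(pattern)
    while start != -1:
        positions.append(start)
        start = seq.find(pattern, start + 1)
    return positions


def base_composition(seq):
    """Fraction of A, C, G and T in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}


# Toy sequence containing two EcoRI sites (invented for illustration).
toy_dna = "GGAATTCCGAATTCAT"
print(find_sites(toy_dna, "GAATTC"))  # prints [1, 8]
print(base_composition(toy_dna))
```

Plotting a linear or circular map from these positions, as the original programs did, would be a separate step layered on the returned list.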
Determination of a Screening Metric for High Diversity DNA Libraries.

PubMed

Guido, Nicholas J; Handerson, Steven; Joseph, Elaine M; Leake, Devin; Kung, Li A

2016-01-01

The fields of antibody engineering, enzyme optimization and pathway construction rely increasingly on screening complex variant DNA libraries. These highly diverse libraries allow researchers to sample a maximized sequence space; and therefore, more rapidly identify proteins with significantly improved activity. The current state of the art in synthetic biology allows for libraries with billions of variants, pushing the limits of researchers' ability to qualify libraries for screening by measuring the traditional quality metrics of fidelity and diversity of variants. Instead, when screening variant libraries, researchers typically use a generic, and often insufficient, oversampling rate based on a common rule-of-thumb. We have developed methods to calculate a library-specific oversampling metric, based on fidelity, diversity, and representation of variants, which informs researchers, prior to screening the library, of the amount of oversampling required to ensure that the desired fraction of variant molecules will be sampled. To derive this oversampling metric, we developed a novel alignment tool to efficiently measure frequency counts of individual nucleotide variant positions using next-generation sequencing data. Next, we apply a method based on the "coupon collector" probability theory to construct a curve of upper bound estimates of the sampling size required for any desired variant coverage. The calculated oversampling metric will guide researchers to maximize their efficiency in using highly variant libraries.
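The "coupon collector" reasoning mentioned above can be made concrete for the simplest case. The Python sketch below computes the expected number of random draws needed to observe a target number of distinct variants when all variants are equally represented. That uniformity is an assumption made here for illustration; the published metric also accounts for fidelity and uneven representation, which this example ignores, and the function names are hypothetical.

```python
def expected_draws(n_variants, n_target):
    """Expected uniform random draws needed to see n_target distinct variants out of n_variants."""
    if not 0 < n_target <= n_variants:
        raise ValueError("n_target must be between 1 and n_variants")
    # With i distinct variants already seen, a new one appears with
    # probability (N - i) / N, so the expected wait is N / (N - i) draws.
    return sum(n_variants / (n_variants - i) for i in range(n_target))


def oversampling_factor(n_variants, n_target):
    """Expected draws expressed as a multiple of the library size."""
    return expected_draws(n_variants, n_target) / n_variants


# Hypothetical library of 1000 equally represented variants, 95% coverage target.
print(oversampling_factor(1000, 950))
```

Because the expected wait grows sharply for the last few unseen variants, the required oversampling rises steeply as the coverage target approaches 100%, which is why a single rule-of-thumb factor can be inadequate. The loop is linear in the target count, so libraries with billions of variants would call for a closed-form or approximate harmonic sum instead.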
Mitochondrial DNA Markers Reveal High Genetic Diversity but Low Genetic Differentiation in the Black Fly Simulium tani Takaoka & Davies along an Elevational Gradient in Malaysia

PubMed Central

Low, Van Lun; Adler, Peter H.; Takaoka, Hiroyuki; Ya’cob, Zubaidah; Lim, Phaik Eem; Tan, Tiong Kai; Lim, Yvonne A. L.; Chen, Chee Dhang; Norma-Rashid, Yusoff; Sofian-Azirun, Mohd

2014-01-01

The population genetic structure of Simulium tani was inferred from mitochondria-encoded sequences of cytochrome c oxidase subunits I (COI) and II (COII) along an elevational gradient in Cameron Highlands, Malaysia. A statistical parsimony network of 71 individuals revealed 71 haplotypes in the COI gene and 43 haplotypes in the COII gene; the concatenated sequences of the COI and COII genes revealed 71 haplotypes. High levels of genetic diversity but low levels of genetic differentiation were observed among populations of S. tani at five elevations. The degree of genetic diversity, however, was not in accordance with an altitudinal gradient, and a Mantel test indicated that elevation did not have a limiting effect on gene flow. No ancestral haplotype of S. tani was found among the populations. Pupae with unique structural characters at the highest elevation showed a tendency to form their own haplotype cluster, as revealed by the COII gene. Tajima’s D, Fu’s Fs, and mismatch distribution tests revealed population expansion of S. tani in Cameron Highlands. A strong correlation was found between nucleotide diversity and the levels of dissolved oxygen in the streams where S. tani was collected. PMID:24941043
Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands

USGS Publications Warehouse

Jarvi, S.I.; Farias, M.E.; Lapointe, D.A.; Belcaid, M.; Atkinson, C.T.

2013-01-01

Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency was higher in low elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.

Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands.

PubMed

Jarvi, S I; Farias, M E; Lapointe, D A; Belcaid, M; Atkinson, C T

2013-12-01

Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency was higher in low elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.
Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

PubMed Central

Khan, A S

1984-01-01

The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
Genetic Diversity Analysis of Highly Incomplete SNP Genotype Data with Imputations: An Empirical Assessment

PubMed Central

Fu, Yong-Bi

2014-01-01

Genotyping by sequencing (GBS) recently has emerged as a promising genomic approach for assessing genetic diversity on a genome-wide scale. However, concerns are not lacking about the uniquely large unbalance in GBS genotype data. Although some genotype imputation has been proposed to infer missing observations, little is known about the reliability of a genetic diversity analysis of GBS data, with up to 90% of observations missing. Here we performed an empirical assessment of accuracy in genetic diversity analysis of highly incomplete single nucleotide polymorphism genotypes with imputations. Three large single-nucleotide polymorphism genotype data sets for corn, wheat, and rice were acquired, and missing data with up to 90% of missing observations were randomly generated and then imputed for missing genotypes with three map-independent imputation methods. Estimating heterozygosity and inbreeding coefficient from original, missing, and imputed data revealed variable patterns of bias from assessed levels of missingness and genotype imputation, but the estimation biases were smaller for missing data without genotype imputation. The estimates of genetic differentiation were rather robust up to 90% of missing observations but became substantially biased when missing genotypes were imputed. The estimates of topology accuracy for four representative samples of interested groups generally were reduced with increased levels of missing genotypes. Probabilistic principal component analysis based imputation performed better in terms of topology accuracy than those analyses of missing data without genotype imputation. These findings are not only significant for understanding the reliability of the genetic diversity analysis with respect to large missing data and genotype imputation but also are instructive for performing a proper genetic diversity analysis of highly incomplete GBS or other genotype data. PMID:24626289
On the comparison of population-level estimates of haplotype and nucleotide diversity: a case study using the gene cox1 in animals.

PubMed

Goodall-Copestake, W P; Tarling, G A; Murphy, E J

2012-07-01

Estimates of genetic diversity represent a valuable resource for biodiversity assessments and are increasingly used to guide conservation and management programs. The most commonly reported estimates of DNA sequence diversity in animal populations are haplotype diversity (h) and nucleotide diversity (π) for the mitochondrial gene cytochrome c oxidase subunit I (cox1). However, several issues relevant to the comparison of h and π within and between studies remain to be assessed. We used population-level cox1 data from peer-reviewed publications to quantify the extent to which data sets can be re-assembled, to provide a standardized summary of h and π estimates, to explore the relationship between these metrics and to assess their sensitivity to under-sampling. Only 19 out of 42 selected publications had archived data that could be unambiguously re-assembled; this comprised 127 population-level data sets (n ≥ 15) from 23 animal species. Estimates of h and π were calculated using a 456-base region of cox1 that was common to all the data sets (median h = 0.70130, median π = 0.00356). Non-linear regression methods and Bayesian information criterion analysis revealed that the most parsimonious model describing the relationship between the estimates of h and π was π = 0.0081h². Deviations from this model can be used to detect outliers due to biological processes or methodological issues. Subsampling analyses indicated that samples of n > 5 were sufficient to discriminate extremes of high from low population-level cox1 diversity, but samples of n ≥ 25 are recommended for greater accuracy.
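For readers unfamiliar with the two statistics compared above, the following sketch computes them from a set of aligned sequences. It assumes the standard Nei definitions (h as the sample-size-corrected probability that two sampled haplotypes differ, π as the mean number of pairwise differences per site) and gap-free sequences of equal length; the abstract does not state which estimator variants the authors used. All function names are hypothetical, and `model_pi` simply evaluates the fitted relationship quoted in the abstract.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs: list[str]) -> float:
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


def model_pi(h: float) -> float:
    """Fitted relationship quoted in the abstract: pi = 0.0081 * h**2."""
    return 0.0081 * h ** 2


# Toy alignment (made-up data): three haplotypes among four sequences.
toy = ["AAAA", "AAAT", "AAAT", "AATT"]
h_toy = haplotype_diversity(toy)
pi_toy = nucleotide_diversity(toy)
```

Comparing an observed π with `model_pi(h)` for the same population is one simple way to flag the kind of outliers the abstract mentions.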
[Age structure and genetic diversity of Homatula pycnolepis in the Nujiang River basin].

PubMed

Yue, Xing-Jian; Liu, Shao-Ping; Liu, Ming-Dian; Duan, Xin-Bin; Wang, Deng-Qiang; Chen, Da-Qing

2013-08-01

This study examined the age structure of the loach Homatula pycnolepis through the otolith growth rings of 204 individual specimens collected from the Xiaomengtong River of the Nujiang River (Salween River) basin in April 2008. There were only two age classes, 1 and 2 years of age; no 3-year-olds were detected. The age structure of H. pycnolepis was simple. The complete mitochondrial DNA cytochrome b gene sequences (1140 bp) of 80 individuals from 4 populations collected in the Nujiang River drainage were sequenced and a total of 44 variable sites were found among 4 different haplotypes. The global haplotype diversity (Hd) and nucleotide diversity (Pi) were 0.7595 and 0.0151, respectively, and both were 0 within each population, indicating a consistent lack of genetic diversity in each small population. There was obvious geographic structure in both the Nujiang River basin (NJB) group and the Nanding River (NDR) group. The genetic distance between NJB and NDR was calculated at 0.0356, suggesting that genetic divergence resulted from long-term isolation of individual populations. Such a simple age structure and a lack of genetic diversity in H. pycnolepis may potentially be due to small populations and local fishing pressures. Accordingly, the results of this study prompt us to recommend that the NJB, NDR and Lancang River populations should be protected as three different evolutionarily significant units or separate management units.
Complete nucleotide sequence and genome organization of a novel allexivirus from alfalfa (Medicago sativa)

USDA-ARS's Scientific Manuscript database

A new species of the family Alphaflexiviridae provisionally named Alfalfa virus S (AVS) was diagnosed in alfalfa samples originating from Sudan. A complete nucleotide sequence of the viral genome consisting of 8,349 nucleotides excluding the 3’ poly(A) tail was determined by Illumina NGS technology ...
DNA sequence analysis of simian virus 40 mutants with deletions mapping in the leader region of the late viral mRNA's: mutants with deletions similar in size and position exhibit varied phenotypes.

PubMed

Barkan, A; Mertz, J E

1981-02-01

The nucleotide sequences of 10 viable yet partially defective deletion mutants of simian virus 40 were determined. The deletions mapped within, and, in many cases, 5' to, the predominant leader sequence of the late viral mRNA's. They ranged from 74 to 187 nucleotide pairs in length. Six of the mutants had lost the sequence that corresponds to the "cap" site (5' terminus) of the most abundant class of 16S mRNA's. One of these mutants had a deletion that extended 103 nucleotide pairs into the region preceding this primary cap site and, therefore, was missing many secondary cap sites as well. A seventh mutant lacked the entire major 16S leader sequence except for the first six nucleotides at its 5' end and the last nine at its 3' end. Although these mutants differed in the size and position of their deletions, we were unable to discover any simple correlations between their growth characteristics and their DNA sequences. This finding indicates that the secondary structures of the RNA transcripts may play a more important role than the exact nucleotide sequence of the RNAs in determining how they function within the cell.
DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L.

PubMed

González-Martínez, Santiago C; Ersoz, Elhan; Brown, Garth R; Wheeler, Nicholas C; Neale, David B

2006-03-01

Genetic association studies are rapidly becoming the experimental approach of choice to dissect complex traits, including tolerance to drought stress, which is the most common cause of mortality and yield losses in forest trees. Optimization of association mapping requires knowledge of the patterns of nucleotide diversity and linkage disequilibrium and the selection of suitable polymorphisms for genotyping. Moreover, standard neutrality tests applied to DNA sequence variation data can be used to select candidate genes or amino acid sites that are putatively under selection for association mapping. In this article, we study the pattern of polymorphism of 18 candidate genes for drought-stress response in Pinus taeda L., an important tree crop. Data analyses based on a set of 21 putatively neutral nuclear microsatellites did not show population genetic structure or genomewide departures from neutrality. Candidate genes had moderate average nucleotide diversity at silent sites (π_sil = 0.00853), varying 100-fold among single genes. The level of within-gene LD was low, with an average pairwise r² of 0.30, decaying rapidly from approximately 0.50 to approximately 0.20 at 800 bp. No apparent LD among genes was found. A selective sweep may have occurred at the early-response-to-drought-3 (erd3) gene, although population expansion can also explain our results and evidence for selection was not conclusive. One other gene, ccoaomt-1, a methylating enzyme involved in lignification, showed dimorphism (i.e., two highly divergent haplotype lineages at equal frequency), which is commonly associated with the long-term action of balancing selection. Finally, a set of haplotype-tagging SNPs (htSNPs) was selected. Using htSNPs, a reduction of genotyping effort of approximately 30-40%, while sampling most common allelic variants, can be gained in our ongoing association studies for drought tolerance in pine.
The Nematode Eukaryotic Translation Initiation Factor 4E/G Complex Works with a trans-Spliced Leader Stem-Loop To Enable Efficient Translation of Trimethylguanosine-Capped RNAs

PubMed Central

Wallace, Adam; Filbin, Megan E.; Veo, Bethany; McFarland, Craig; Stepinski, Janusz; Jankowska-Anyszka, Marzena; Darzynkiewicz, Edward; Davis, Richard E.

2010-01-01

Eukaryotic mRNA translation begins with recruitment of the 40S ribosome complex to the mRNA 5′ end through the eIF4F initiation complex binding to the 5′ m7G-mRNA cap. Spliced leader (SL) RNA trans splicing adds a trimethylguanosine (TMG) cap and a sequence, the SL, to the 5′ end of mRNAs. Efficient translation of TMG-capped mRNAs in nematodes requires the SL sequence. Here we define a core set of nucleotides and a stem-loop within the 22-nucleotide nematode SL that stimulate translation of mRNAs with a TMG cap. The structure and core nucleotides are conserved in other nematode SLs and correspond to regions of SL1 required for early Caenorhabditis elegans development. These SL elements do not facilitate translation of m7G-capped RNAs in nematodes or TMG-capped mRNAs in mammalian or plant translation systems. Similar stem-loop structures in phylogenetically diverse SLs are predicted. We show that the nematode eukaryotic translation initiation factor 4E/G (eIF4E/G) complex enables efficient translation of the TMG-SL RNAs in diverse in vitro translation systems. TMG-capped mRNA translation is determined by eIF4E/G interaction with the cap and the SL RNA, although the SL does not increase the affinity of eIF4E/G for capped RNA. These results suggest that the mRNA 5′ untranslated region (UTR) can play a positive and novel role in translation initiation through interaction with the eIF4E/G complex in nematodes and raise the issue of whether eIF4E/G-RNA interactions play a role in the translation of other eukaryotic mRNAs. PMID:20154140
The nematode eukaryotic translation initiation factor 4E/G complex works with a trans-spliced leader stem-loop to enable efficient translation of trimethylguanosine-capped RNAs.

PubMed

Wallace, Adam; Filbin, Megan E; Veo, Bethany; McFarland, Craig; Stepinski, Janusz; Jankowska-Anyszka, Marzena; Darzynkiewicz, Edward; Davis, Richard E

2010-04-01

Eukaryotic mRNA translation begins with recruitment of the 40S ribosome complex to the mRNA 5' end through the eIF4F initiation complex binding to the 5' m(7)G-mRNA cap. Spliced leader (SL) RNA trans splicing adds a trimethylguanosine (TMG) cap and a sequence, the SL, to the 5' end of mRNAs. Efficient translation of TMG-capped mRNAs in nematodes requires the SL sequence. Here we define a core set of nucleotides and a stem-loop within the 22-nucleotide nematode SL that stimulate translation of mRNAs with a TMG cap. The structure and core nucleotides are conserved in other nematode SLs and correspond to regions of SL1 required for early Caenorhabditis elegans development. These SL elements do not facilitate translation of m(7)G-capped RNAs in nematodes or TMG-capped mRNAs in mammalian or plant translation systems. Similar stem-loop structures in phylogenetically diverse SLs are predicted. We show that the nematode eukaryotic translation initiation factor 4E/G (eIF4E/G) complex enables efficient translation of the TMG-SL RNAs in diverse in vitro translation systems. TMG-capped mRNA translation is determined by eIF4E/G interaction with the cap and the SL RNA, although the SL does not increase the affinity of eIF4E/G for capped RNA. These results suggest that the mRNA 5' untranslated region (UTR) can play a positive and novel role in translation initiation through interaction with the eIF4E/G complex in nematodes and raise the issue of whether eIF4E/G-RNA interactions play a role in the translation of other eukaryotic mRNAs.
Antifungal polypeptides

DOEpatents

Altier, Daniel J.; Dahlbacka, Glen; Ellanskaya, legal representative, Natalia; Herrmann, Rafael; Hunter-Cevera, Jennie; McCutchen, Billy F.; Presnail, James K.; Rice, Janet A.; Schepers, Eric; Simmons, Carl R.; Torok, Tamas; Yalpani, Nasser; Ellanskaya, deceased, Irina

2007-12-11

Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Antifungal polypeptides

DOEpatents

Altier, Daniel J.; Dahlbacka, Glen; Elleskaya, Irina; Ellanskaya, legal representative, Natalia; Herrmann, Rafael; Hunter-Cevera, Jennie; McCutchen, Billy F.; Presnail, James K.; Rice, Janet A.; Schepers, Eric; Simmons, Carl R.; Torok, Tamas; Yalpani, Nasser

2010-08-10

Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Antifungal polypeptides

DOEpatents

Altier, Daniel J [Waukee, IA; Dahlbacka, Glen [Oakland, CA; Elleskaya, Irina [Kyiv, UA; Ellanskaya, legal representative, Natalia; Herrmann, Rafael [Wilmington, DE; Hunter-Cevera, Jennie [Elliott City, MD; McCutchen, Billy F [College Station, IA; Presnail, James K [Avondale, PA; Rice, Janet A [Wilmington, DE; Schepers, Eric [Port Deposit, MD; Simmons, Carl R [Des Moines, IA; Torok, Tamas [Richmond, CA; Yalpani, Nasser [Johnston, IA

2011-04-12

Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Antifungal polypeptides

DOEpatents

Altier, Daniel J [Granger, IA; Dahlbacka, Glen [Oakland, CA; Ellanskaya, Irina [Kyiv, UA; Ellanskaya, legal representative, Natalia; Herrmann, Rafael [Wilmington, DE; Hunter-Cevera, Jennie [Elliott City, MD; McCutchen, Billy F [College Station, TX; Presnail, James K [Avondale, PA; Rice, Janet A [Wilmington, DE; Schepers, Eric [Port Deposit, MD; Simmons, Carl R [Des Moines, IA; Torok, Tamas [Richmond, CA; Yalpani, Nasser [Johnston, IA

2012-04-03

Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms.

PubMed

Buschiazzo, Emmanuel; Ritland, Carol; Bohlmann, Jörg; Ritland, Kermit

2012-01-20

Comparative genomics can inform us about the processes of mutation and selection across diverse taxa. Among seed plants, gymnosperms have been lacking in genomic comparisons. Recent EST and full-length cDNA collections for two conifers, Sitka spruce (Picea sitchensis) and loblolly pine (Pinus taeda), together with full genome sequences for two angiosperms, Arabidopsis thaliana and poplar (Populus trichocarpa), offer an opportunity to infer the evolutionary processes underlying thousands of orthologous protein-coding genes in gymnosperms compared with an angiosperm orthologue set. Based upon pairwise comparisons of 3,723 spruce and pine orthologues, we found an average synonymous genetic distance (dS) of 0.191, and an average dN/dS ratio of 0.314. Using a fossil-established divergence time of 140 million years between spruce and pine, we extrapolated a nucleotide substitution rate of 0.68 × 10⁻⁹ synonymous substitutions per site per year. When compared to angiosperms, this indicates a dramatically slower rate of nucleotide substitution in conifers: on average 15-fold. Coincidentally, we found a three-fold higher dN/dS for the spruce-pine lineage compared to the poplar-Arabidopsis lineage. This joint occurrence of a slower evolutionary rate in conifers with higher dN/dS, and possibly positive selection, showcases the uniqueness of conifer genome evolution. Our results are in line with documented reduced nucleotide diversity, conservative genome evolution and low rates of diversification in conifers on the one hand and numerous examples of local adaptation in conifers on the other hand. We propose that reduced levels of nucleotide mutation in large and long-lived conifer trees, coupled with large effective population size, were the main factors leading to slow substitution rates but retention of beneficial mutations.
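The substitution rate quoted above can be checked with a short calculation. The sketch assumes the usual convention that a pairwise distance accumulates along both lineages since their split, so the per-lineage rate is dS / (2T); the abstract does not spell out the formula, but this convention gives a value consistent with the reported 0.68 × 10⁻⁹. The function name is hypothetical.

```python
def substitution_rate(ds: float, divergence_time_years: float) -> float:
    """Per-lineage synonymous substitution rate (per site per year).

    Assumes the pairwise distance `ds` accumulated along two lineages that
    have each evolved independently for `divergence_time_years`.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return ds / (2.0 * divergence_time_years)


# Values taken from the abstract: dS = 0.191, divergence 140 million years.
rate = substitution_rate(0.191, 140e6)
print(f"{rate:.2e} synonymous substitutions per site per year")
```

With the abstract's inputs the result is about 6.8 × 10⁻¹⁰, that is 0.68 × 10⁻⁹.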
Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

PubMed Central

Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

2011-01-01

DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
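The consensus ERE pattern given above (AGGTCAnnnTGACCT) lends itself to a simple sequence scan. The sketch below finds consensus EREs in a DNA string and reports the tri-nucleotide spacer of each hit; it is an illustration only, with a hypothetical function name, and it does not attempt to detect the natural non-consensus EREs also studied in the paper.

```python
import re

# Lookahead so that overlapping candidate sites are all reported;
# group 1 captures the tri-nucleotide spacer between the half-sites.
CONSENSUS_ERE = re.compile(r"(?=AGGTCA([ACGT]{3})TGACCT)")


def find_consensus_eres(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, spacer) for each consensus ERE in `seq`."""
    return [
        (match.start(), match.group(1))
        for match in CONSENSUS_ERE.finditer(seq.upper())
    ]


# Made-up test sequence containing one CGG-spaced and one TTT-spaced ERE.
hits = find_consensus_eres("ttAGGTCAcggTGACCTaaAGGTCATTTTGACCT")
```

Grouping hits by spacer (for example CGG versus TTT) would be a natural first step when asking which promoter EREs are likely to be more or less sensitive to XR5944, as the abstract suggests.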
Algorithms for optimizing cross-overs in DNA shuffling.

PubMed

He, Lu; Friedman, Alan M; Bailey-Kellogg, Chris

2012-03-21

DNA shuffling generates combinatorial libraries of chimeric genes by stochastically recombining parent genes. The resulting libraries are subjected to large-scale genetic selection or screening to identify those chimeras with favorable properties (e.g., enhanced stability or enzymatic activity). While DNA shuffling has been applied quite successfully, it is limited by its homology-dependent, stochastic nature. Consequently, it is used only with parents of sufficient overall sequence identity, and provides no control over the resulting chimeric library. This paper presents efficient methods to extend the scope of DNA shuffling to handle significantly more diverse parents and to generate more predictable, optimized libraries. Our CODNS (cross-over optimization for DNA shuffling) approach employs polynomial-time dynamic programming algorithms to select codons for the parental amino acids, allowing for zero or a fixed number of conservative substitutions. We first present efficient algorithms to optimize the local sequence identity or the nearest-neighbor approximation of the change in free energy upon annealing, objectives that were previously optimized by computationally-expensive integer programming methods. We then present efficient algorithms for more powerful objectives that seek to localize and enhance the frequency of recombination by producing "runs" of common nucleotides either overall or according to the sequence diversity of the resulting chimeras. We demonstrate the effectiveness of CODNS in choosing codons and allocating substitutions to promote recombination between parents targeted in earlier studies: two GAR transformylases (41% amino acid sequence identity), two very distantly related DNA polymerases, Pol X and β (15%), and beta-lactamases of varying identity (26-47%). Our methods provide the protein engineer with a new approach to DNA shuffling that supports substantially more diverse parents, is more deterministic, and generates more predictable and more diverse chimeric libraries.
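To make the idea of codon selection by dynamic programming concrete, here is a minimal Python sketch. It is not the CODNS algorithm: the scoring function (one point per identical nucleotide plus one point per adjacent pair of identical nucleotides, a crude stand-in for a nearest-neighbor objective) is an assumption chosen for illustration, substitutions are not modelled, and the name choose_codons is hypothetical. The parents are assumed to be pre-aligned, gap-free amino acid strings of equal length.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)

def choose_codons(parent_a, parent_b):
    """Pick codons for two aligned proteins to maximise an assumed score.

    DP state: whether the last nucleotide of the previous codon pair
    matched, since that is all the next codon's boundary term depends on.
    Returns (score, dna_a, dna_b).
    """
    assert len(parent_a) == len(parent_b)
    best = {False: (0, "", "")}  # state -> (score, dna_a, dna_b)
    for aa_a, aa_b in zip(parent_a, parent_b):
        nxt = {}
        for last, (score, dna_a, dna_b) in best.items():
            for ca, cb in product(CODONS[aa_a], CODONS[aa_b]):
                m = [x == y for x, y in zip(ca, cb)]
                # identical positions + adjacent identical pairs,
                # including the pair that spans the codon boundary
                gain = (sum(m) + (last and m[0])
                        + (m[0] and m[1]) + (m[1] and m[2]))
                cand = (score + gain, dna_a + ca, dna_b + cb)
                if m[2] not in nxt or cand[0] > nxt[m[2]][0]:
                    nxt[m[2]] = cand
        best = nxt
    return max(best.values(), key=lambda t: t[0])

if __name__ == "__main__":
    print(choose_codons("ML", "MF"))
```

Because only two states are kept per position, the running time grows linearly with protein length (times the number of codon pairs per position), which is the kind of polynomial behaviour the abstract contrasts with integer programming.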
Typing of canine parvovirus isolates using mini-sequencing based single nucleotide polymorphism analysis.

PubMed

Naidu, Hariprasad; Subramanian, B Mohana; Chinchkar, Shankar Ramchandra; Sriraman, Rajan; Rana, Samir Kumar; Srinivasan, V A

2012-05-01

The antigenic types of canine parvovirus (CPV) are defined based on differences in the amino acids of the major capsid protein VP2. Type specificity is conferred by a limited number of amino acid changes and in particular by a few nucleotide substitutions. PCR-based methods are not particularly suitable for typing circulating variants which differ in a few specific nucleotide substitutions. Assays for determining SNPs can efficiently detect nucleotide substitutions and can thus be adapted to identify CPV types. In the present study, CPV typing was performed by single nucleotide extension using the mini-sequencing technique. A mini-sequencing signature was established for all four CPV types (CPV2, 2a, 2b and 2c) and feline panleukopenia virus. CPV typing using the mini-sequencing reaction was performed for 13 CPV field isolates and the two vaccine strains available in our repository. All the isolates had been typed earlier by full-length sequencing of the VP2 gene. The typing results obtained from mini-sequencing matched those of sequencing completely. Typing could be achieved with fewer than 100 copies of standard plasmid DNA constructs or ≤10¹ FAID₅₀ of virus by the mini-sequencing technique. The technique was also efficient for detecting multiple types in mixed infections. Copyright © 2012 Elsevier B.V. All rights reserved.
Drought-induced gene expression in Atriplex canescens (salt bush): Transcriptional and post transcriptional response

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cairney, J.; Hays, D.; Stockand, J.D.

1991-05-01

The rangeland shrub Atriplex canescens (saltbush) is extremely drought-tolerant and is capable of growing at water potentials below −40 bar. To discover the molecular basis of this tolerance, the authors have isolated a number of cDNA clones of drought-stress induced genes. Analysis of the nucleotide sequence and expression of these genes in different tissues and in response to different stresses reveals the diversity of the stress response. Members of a drought-induced multi-gene family have been sequenced. Although the genes are 95% homologous, non-conservative substitutions result in proteins of different tertiary structure. Additionally, the genes are expressed through a number of mature forms of mRNA which may arise by alternative RNA processing.
Fungal Endophyte Diversity in Sarracenia

PubMed Central

Glenn, Anthony; Bodri, Michael S.

2012-01-01

Fungal endophytes were isolated from 4 species of the carnivorous pitcher plant genus Sarracenia: S. minor, S. oreophila, S. purpurea, and S. psittacina. Twelve taxa of fungi, 8 within the Ascomycota and 4 within the Basidiomycota, were identified based on PCR amplification and sequencing of the internal transcribed spacer sequences of nuclear ribosomal DNA (ITS rDNA) with taxonomic identity assigned using the NCBI nucleotide megablast search tool. Endophytes are known to produce a large number of metabolites, some of which may contribute to the protection and survival of the host. We speculate that endophyte-infected Sarracenia may benefit from their fungal associates by their influence on nutrient availability from within pitchers and, possibly, by directly influencing the biota within pitchers. PMID:22427921
