Sample records for genomic structure polymorphism

  1. Environmental Adaptation Contributes to Gene Polymorphism across the Arabidopsis thaliana Genome

    PubMed Central

    Lee, Cheng-Ruei

    2012-01-01

    The level of within-species polymorphism differs greatly among genes in a genome. Many genomic studies have investigated the relationship between gene polymorphism and factors such as recombination rate or expression pattern. However, the polymorphism of a gene is affected not only by its physical properties or functional constraints but also by natural selection on organisms in their environments. Specifically, if functionally divergent alleles enable adaptation to different environments, locus-specific polymorphism may be maintained by spatially heterogeneous natural selection. To test this hypothesis and estimate the extent to which environmental selection shapes the pattern of genome-wide polymorphism, we define the "environmental relevance" of a gene as the proportion of genetic variation explained by environmental factors, after controlling for population structure. We found substantial effects of environmental relevance on patterns of polymorphism among genes. In addition, the correlation between environmental relevance and gene polymorphism is positive, consistent with the expectation that balancing selection among heterogeneous environments maintains genetic variation at ecologically important genes. Comparison of the gene ontology annotations shows that genes with high environmental relevance are enriched in unknown function categories. These results suggest an important role for environmental factors in shaping genome-wide patterns of polymorphism and indicate another direction of genomic study. PMID:22798389

  2. Insertion and deletion polymorphisms of the ancient AluS family in the human genome.

    PubMed

    Kryatova, Maria S; Steranka, Jared P; Burns, Kathleen H; Payer, Lindsay M

    2017-01-01

    Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. 
Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.

  3. Genomic Hypomethylation in the Human Germline Associates with Selective Structural Mutability in the Human Genome

    PubMed Central

    Li, Jian; Harris, R. Alan; Cheung, Sau Wai; Coarfa, Cristian; Jeong, Mira; Goodell, Margaret A.; White, Lisa D.; Patel, Ankita; Kang, Sung-Hae; Shaw, Chad; Chinault, A. Craig; Gambin, Tomasz; Gambin, Anna; Lupski, James R.; Milosavljevic, Aleksandar

    2012-01-01

    The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR–mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease. PMID:22615578

  4. Effect of ancient population structure on the degree of polymorphism shared between modern human populations and ancient hominins.

    PubMed

    Eriksson, Anders; Manica, Andrea

    2012-08-28

    Recent comparisons between anatomically modern humans and ancient genomes of other hominins have raised the tantalizing, and hotly debated, possibility of hybridization. Although several tests of hybridization have been devised, they all rely on the degree to which different modern populations share genetic polymorphisms with the ancient genomes of other hominins. However, spatial population structure is expected to generate genetic patterns similar to those that might be attributed to hybridization. To investigate this problem, we take Neanderthals as a case study, and build a spatially explicit model of the shared history of anatomically modern humans and this hominin. We show that the excess polymorphism shared between Eurasians and Neanderthals is compatible with scenarios in which no hybridization occurred, and is strongly linked to the strength of population structure in ancient populations. Thus, we recommend caution in inferring admixture from geographic patterns of shared polymorphisms, and argue that future attempts to investigate ancient hybridization between humans and other hominins should explicitly account for population structure.

  5. Complete chloroplast genome sequence of a major allogamous forage species, perennial ryegrass (Lolium perenne L.).

    PubMed

    Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne

    2009-06-01

    Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.

  6. Genetic Diversity, Population Structure, and Linkage Disequilibrium in Bread Wheat (Triticum aestivum L.).

    PubMed

    Tascioglu, Tulin; Metin, Ozge Karakas; Aydin, Yildiz; Sakiroglu, Muhammet; Akan, Kadir; Uncuoglu, Ahu Altinkut

    2016-08-01

    The bread wheat (Triticum aestivum L.) gene pool was analyzed with 117 microsatellite markers scattered throughout the A, B, and D genomes. Ninety microsatellite markers yielded 1620 polymorphic alleles in 55 different bread wheat genotypes. These genotypes were divided into three subgroups based on a Bayesian model and principal component analysis. Among the markers residing on the A genome, the highest polymorphism information content value (0.960) was estimated for the wmc262 marker located on chromosome 4A. The highest polymorphism information content value (0.954) among the markers known to be located on the B genome was found for the wmc44 marker located on chromosome 1B. Among the markers specific to the D genome, the highest polymorphism information content value (0.948) was found for the gwm174 marker located on chromosome 5D. The presence of linkage disequilibrium between 81 pairs of SSR markers residing on the same chromosome was tested, and very limited linkage disequilibrium was observed. The results confirmed that the most distant genotype pairs were Ceyhan-99 and Behoth 6, Gerek 79 and Douma 40989, and Karahan-99 and Douma 48114.
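The polymorphism information content (PIC) values reported above can be illustrated with a minimal sketch. It assumes the widely used Botstein et al. (1980) definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2; the abstract does not state which formula the authors applied, and the function name `pic` is hypothetical.

```python
from itertools import combinations


def pic(allele_counts):
    """Polymorphism information content for one marker.

    allele_counts: counts (or frequencies) of each allele at the marker.
    Assumes the Botstein et al. (1980) definition; this is an
    illustrative choice, not necessarily the one used in the study.
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive number")
    freqs = [count / total for count in allele_counts]
    # Expected homozygosity term: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Two equally frequent alleles give PIC = 0.375.
    print(pic([0.5, 0.5]))
    # Many equally frequent alleles push PIC towards 1.
    print(pic([1] * 30))
```

Under this definition a marker only approaches values such as 0.96 when it carries many alleles at similar frequencies, which fits the large allele count reported for the microsatellite set.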

  7. Altools: a user friendly NGS data analyser.

    PubMed

    Camiolo, Salvatore; Sablok, Gaurav; Porceddu, Andrea

    2016-02-17

    Genotyping by re-sequencing has become a standard approach to estimate single nucleotide polymorphism (SNP) diversity, haplotype structure and the biodiversity and has been defined as an efficient approach to address geographical population genomics of several model species. To access core SNPs and insertion/deletion polymorphisms (indels), and to infer the phyletic patterns of speciation, most such approaches map short reads to the reference genome. Variant calling is important to establish patterns of genome-wide association studies (GWAS) for quantitative trait loci (QTLs), and to determine the population and haplotype structure based on SNPs, thus allowing content-dependent trait and evolutionary analysis. Several tools have been developed to investigate such polymorphisms as well as more complex genomic rearrangements such as copy number variations, presence/absence variations and large deletions. The programs available for this purpose have different strengths (e.g. accuracy, sensitivity and specificity) and weaknesses (e.g. low computation speed, complex installation procedure and absence of a user-friendly interface). Here we introduce Altools, a software package that is easy to install and use, which allows the precise detection of polymorphisms and structural variations. Altools uses the BWA/SAMtools/VarScan pipeline to call SNPs and indels, and the dnaCopy algorithm to achieve genome segmentation according to local coverage differences in order to identify copy number variations. It also uses insert size information from the alignment of paired-end reads and detects potential large deletions. A double mapping approach (BWA/BLASTn) identifies precise breakpoints while ensuring rapid elaboration. Finally, Altools implements several processes that yield deeper insight into the genes affected by the detected polymorphisms. 
Altools was used to analyse both simulated and real next-generation sequencing (NGS) data and performed satisfactorily in terms of positive predictive values, sensitivity, the identification of large deletion breakpoints and copy number detection. Altools is fast, reliable and easy to use for the mining of NGS data. The software package also attempts to link identified polymorphisms and structural variants to their biological functions thus providing more valuable information than similar tools.
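The idea of using paired-end insert sizes to flag potential large deletions can be sketched as follows. This is not Altools code: the function name, the input format and the three-standard-deviation threshold are all assumptions made for illustration.

```python
def flag_deletion_pairs(pairs, expected_mean, expected_sd, z=3.0):
    """Flag read pairs whose mapped span is much larger than the library insert size.

    pairs: list of (left_pos, right_pos) outer mapping coordinates of
           read pairs on the same chromosome (hypothetical input format).
    expected_mean, expected_sd: insert size statistics of the library.
    z: number of standard deviations above the mean that counts as
       discordant (an assumed threshold, not taken from Altools).

    Returns a list of (left_pos, right_pos, estimated_deletion_length).
    """
    threshold = expected_mean + z * expected_sd
    flagged = []
    for left, right in pairs:
        span = right - left
        if span > threshold:
            # The excess over the expected insert size approximates the
            # length of sequence missing from the sample relative to the
            # reference.
            flagged.append((left, right, span - expected_mean))
    return flagged


if __name__ == "__main__":
    mapped_pairs = [(100, 400), (1000, 1310), (5000, 7300)]
    print(flag_deletion_pairs(mapped_pairs, expected_mean=300, expected_sd=20))
```

A real caller would cluster several discordant pairs supporting the same interval before reporting a deletion; a single stretched pair is weak evidence on its own.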

  8. How and how much does RAD-seq bias genetic diversity estimates?

    PubMed

    Cariou, Marie; Duret, Laurent; Charlat, Sylvain

    2016-11-08

    RAD-seq is a powerful tool, increasingly used in population genomics. However, earlier studies have raised red flags regarding possible biases associated with this technique. In particular, polymorphism on restriction sites results in preferential sampling of closely related haplotypes, so that RAD data tends to underestimate genetic diversity. Here we (1) clarify the theoretical basis of this bias, highlighting the potential confounding effects of population structure and selection, (2) compare predictions with real data from in silico digestion of full genomes and (3) provide a proof of concept toward an ABC-based correction of the RAD-seq bias. Under a neutral and panmictic model, we confirm the previously established relationship between the true polymorphism and its RAD-based estimation, showing a more pronounced bias when polymorphism is high. Using more elaborate models, we show that selection, resulting in heterogeneous levels of polymorphism along the genome, exacerbates the bias and leads to a more pronounced underestimation. In contrast, spatial genetic structure tends to reduce the bias. We compare the neutral and panmictic model with "ideal" empirical data (in silico RAD-sequencing) using full genomes from natural populations of the fruit fly Drosophila melanogaster and the fungus Schizophyllum commune, harbouring respectively moderate and high genetic diversity. In D. melanogaster, predictions fit the model, but the small difference between the true and RAD polymorphism makes this comparison insensitive to deviations from the model. In the highly polymorphic fungus, the model captures a large part of the bias but makes inaccurate predictions. Accordingly, ABC corrections based on this model improve the estimations, albeit with some imprecision. The RAD-seq underestimation of genetic diversity associated with polymorphism in restriction sites becomes more pronounced when polymorphism is high. 
In practice, this means that in many systems where polymorphism does not exceed 2%, the bias is of minor importance in the face of other sources of uncertainty, such as heterogeneous base composition or technical artefacts. The neutral panmictic model provides a practical means to correct the bias through ABC, albeit with some imprecision. More elaborate ABC methods might integrate additional parameters, such as population structure and selection, but their opposite effects could hinder accurate corrections.
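The source of the bias, loss of a RAD locus when a haplotype carries a mutation in the restriction site, can be shown with a toy in silico digestion. This is a sketch under stated assumptions, not the authors' pipeline: the function name `rad_loci`, the toy haplotypes and the choice of the EcoRI recognition sequence GAATTC are illustrative.

```python
def rad_loci(sequence, site, flank=8):
    """Return {site_start: downstream_flank} for every occurrence of `site`.

    A toy in silico digestion: each restriction site yields one "RAD
    locus" made of the `flank` bases that follow it.
    """
    loci = {}
    start = sequence.find(site)
    while start != -1:
        end = start + len(site)
        loci[start] = sequence[end:end + flank]
        start = sequence.find(site, start + 1)
    return loci


ECORI = "GAATTC"  # EcoRI recognition sequence

# Two toy haplotypes (hypothetical data). hap_b carries a substitution
# inside the second restriction site, so that locus is not sampled.
hap_a = "ACGT" + ECORI + "TTGACCATGG" + ECORI + "AACCGGTT"
hap_b = "ACGT" + ECORI + "TTGACCATGG" + "GAGTTC" + "AACCGGTT"

if __name__ == "__main__":
    print(rad_loci(hap_a, ECORI))
    print(rad_loci(hap_b, ECORI))
    # Only loci recovered from both haplotypes can contribute to a
    # diversity estimate; the divergent haplotype drops out at the
    # second locus.
    shared = set(rad_loci(hap_a, ECORI)) & set(rad_loci(hap_b, ECORI))
    print(sorted(shared))
```

Because haplotypes that differ at the cut site are missing from the data for that locus, the remaining sample is enriched for closely related haplotypes, which is the underestimation the abstract describes.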

  9. Fine-scale population structure and the era of next-generation sequencing.

    PubMed

    Henn, Brenna M; Gravel, Simon; Moreno-Estrada, Andres; Acevedo-Acevedo, Suehelay; Bustamante, Carlos D

    2010-10-15

    Fine-scale population structure characterizes most continents and is especially pronounced in non-cosmopolitan populations. Roughly half of the world's population remains non-cosmopolitan and even populations within cities often assort along ethnic and linguistic categories. Barriers to random mating can be ecologically extreme, such as the Sahara Desert, or cultural, such as the Indian caste system. In either case, subpopulations accumulate genetic differences if the barrier is maintained over multiple generations. Genome-wide polymorphism data, initially with only a few hundred autosomal microsatellites, have clearly established differences in allele frequency not only among continental regions, but also within continents and within countries. We review recent evidence from the analysis of genome-wide polymorphism data for genetic boundaries delineating human population structure and the main demographic and genomic processes shaping variation, and discuss the implications of population structure for the distribution and discovery of disease-causing genetic variants, in the light of the imminent availability of sequencing data for a multitude of diverse human genomes.

  10. Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies.

    PubMed

    Feuk, Lars; MacDonald, Jeffrey R; Tang, Terence; Carson, Andrew R; Li, Martin; Rao, Girish; Khaja, Razi; Scherer, Stephen W

    2005-10-01

    With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.

  11. A sequence-based survey of the complex structural organization of tumor genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  12. Mobile Interspersed Repeats Are Major Structural Variants in the Human Genome

    PubMed Central

    Huang, Cheng Ran Lisa; Schneider, Anna M.; Lu, Yunqi; Niranjan, Tejasvi; Shen, Peilin; Robinson, Matoya A.; Steranka, Jared P.; Valle, David; Civin, Curt I.; Wang, Tao; Wheelan, Sarah J.; Ji, Hongkai; Boeke, Jef D.; Burns, Kathleen H.

    2010-01-01

    Characterizing structural variants in the human genome is of great importance, but a genome-wide analysis to detect interspersed repeats has not been done. Thus, the degree to which mobile DNAs contribute to genetic diversity, heritable disease, and oncogenesis remains speculative. We perform transposon insertion profiling by microarray (TIP-chip) to map human L1(Ta) retrotransposons (LINE-1s) genome-wide. This identified numerous novel human L1(Ta) insertional polymorphisms with highly variant allelic frequencies. We also explored TIP-chip's usefulness to identify candidate alleles associated with different phenotypes in clinical cohorts. Our data suggest that the occurrence of new insertions is twice as high as previously estimated, and that these repeats are under-recognized as sources of human genomic and phenotypic diversity. We have just begun to probe the universe of human L1(Ta) polymorphisms, and as TIP-chip is applied to other insertions such as Alu SINEs, it will expand the catalog of genomic variants even further. PMID:20602999

  13. Initiation of a pan-genomic research project for Xylella fastidiosa

    USDA-ARS's Scientific Manuscript database

    Differences in genomic structure and nucleotide polymorphism among strains form the genetic basis for adaptability of a bacterial species. This can be described by a bacterial pan-genome, which is defined as the full complement of genes in all strains of a species. The pan-genome is composed of a "c...

  14. Characterizing polymorphic inversions in human genomes by single-cell sequencing

    PubMed Central

    Sanders, Ashley D.; Hills, Mark; Porubský, David; Guryev, Victor; Falconer, Ester; Lansdorp, Peter M.

    2016-01-01

    Identifying genomic features that differ between individuals and cells can help uncover the functional variants that drive phenotypes and disease susceptibilities. For this, single-cell studies are paramount, as it becomes increasingly clear that the contribution of rare but functional cellular subpopulations is important for disease prognosis, management, and progression. Until now, studying these associations has been challenged by our inability to map structural rearrangements accurately and comprehensively. To overcome this, we coupled single-cell sequencing of DNA template strands (Strand-seq) with custom analysis software to rapidly discover, map, and genotype genomic rearrangements at high resolution. This allowed us to explore the distribution and frequency of inversions in a heterogeneous cell population, identify several polymorphic domains in complex regions of the genome, and locate rare alleles in the reference assembly. We then mapped the entire genomic complement of inversions within two unrelated individuals to characterize their distinct inversion profiles and built a nonredundant global reference of structural rearrangements in the human genome. The work described here provides a powerful new framework to study structural variation and genomic heterogeneity in single-cell samples, whether from individuals for population studies or tissue types for biomarker discovery. PMID:27472961

  15. Evidence for large inversion polymorphisms in the human genome from HapMap data

    PubMed Central

    Bansal, Vikas; Bashir, Ali; Bafna, Vineet

    2007-01-01

    Knowledge about structural variation in the human genome has grown tremendously in the past few years. However, inversions represent a class of structural variation that remains difficult to detect. We present a statistical method to identify large inversion polymorphisms using unusual Linkage Disequilibrium (LD) patterns from high-density SNP data. The method is designed to detect chromosomal segments that are inverted (in a majority of the chromosomes) in a population with respect to the reference human genome sequence. We demonstrate the power of this method to detect such inversion polymorphisms through simulations done using the HapMap data. Application of this method to the data from the first phase of the International HapMap project resulted in 176 candidate inversions ranging from 200 kb to several megabases in length. Our predicted inversions include an 800-kb polymorphic inversion at 7p22, a 1.1-Mb inversion at 16p12, and a novel 1.2-Mb inversion on chromosome 10 that is supported by the presence of two discordant fosmids. Analysis of the genomic sequence around inversion breakpoints showed that 11 predicted inversions are flanked by pairs of highly homologous repeats in the inverted orientation. In addition, for three candidate inversions, the inverted orientation is represented in the Celera genome assembly. Although the power of our method to detect inversions is restricted because of inherently noisy LD patterns in population data, inversions predicted by our method represent strong candidates for experimental validation and analysis. PMID:17185644
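The method above builds on linkage disequilibrium between SNP pairs. As background only, the standard pairwise r² statistic can be computed from phased haplotypes as sketched below; this is not the authors' inversion-detection statistic, and the function name and 0/1 allele coding are assumptions.

```python
def ld_r2(snp_a, snp_b):
    """Pairwise linkage disequilibrium (r squared) between two biallelic SNPs.

    snp_a, snp_b: equal-length lists of 0/1 alleles, one entry per
    phased haplotype (hypothetical input coding).
    """
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("need two non-empty allele lists of equal length")
    n = len(snp_a)
    p_a = sum(snp_a) / n  # frequency of allele 1 at SNP A
    p_b = sum(snp_b) / n  # frequency of allele 1 at SNP B
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


if __name__ == "__main__":
    print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # alleles always co-occur
    print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # alleles are independent
```

An inversion relative to the reference changes which SNPs are physically close in the population, so LD that is unusually strong between markers placed far apart on the reference map is the kind of signal such a scan looks for.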

  16. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720
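The relationship between nucleotide diversity and Tajima's D used above can be made concrete with a small sketch. It assumes the standard Tajima (1989) formulation applied to a gap-free alignment; the function name and toy sequences are hypothetical and unrelated to the pepper data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (k, S, D) for a list of aligned, equal-length sequences.

    k: mean number of pairwise differences (divide by the alignment
       length to obtain per-site nucleotide diversity, pi).
    S: number of segregating sites.
    D: Tajima's D, or None when there are no segregating sites.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    if seg_sites == 0:
        return k, seg_sites, None

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (k - seg_sites / a1) / sqrt(variance)
    return k, seg_sites, d


if __name__ == "__main__":
    # Every variant is a singleton: excess of rare alleles, D < 0.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
    # Every variant is at intermediate frequency: D > 0.
    print(tajimas_d(["TTTA", "TAAA", "ATAA", "AATA"]))
```

The statistic compares two estimates of diversity, one from pairwise differences and one from the count of segregating sites, so its sign reflects whether variants are skewed toward rare or intermediate frequencies.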

  17. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C V Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9-2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers.

  18. Genome-wide generation and use of informative intron-spanning and intron-length polymorphism markers for high-throughput genetic analysis in rice

    PubMed Central

    Badoni, Saurabh; Das, Sweta; Sayal, Yogesh K.; Gopalakrishnan, S.; Singh, Ashok K.; Rao, Atmakuri R.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    We developed genome-wide 84634 ISM (intron-spanning marker) and 16510 InDel-fragment length polymorphism-based ILP (intron-length polymorphism) markers from genes physically mapped on 12 rice chromosomes. These genic markers revealed much higher amplification-efficiency (80%) and polymorphic-potential (66%) among rice accessions even by a cost-effective agarose gel-based assay. A wider level of functional molecular diversity (17–79%) and well-defined precise admixed genetic structure was assayed by 3052 genome-wide markers in a structured population of indica, japonica, aromatic and wild rice. Six major grain weight QTLs (11.9–21.6% phenotypic variation explained) were mapped on five rice chromosomes of a high-density (inter-marker distance: 0.98 cM) genetic linkage map (IR 64 x Sonasal) anchored with 2785 known/candidate gene-derived ISM and ILP markers. The designing of multiple ISM and ILP markers (2 to 4 markers/gene) in an individual gene will broaden the user-preference to select suitable primer combination for efficient assaying of functional allelic variation/diversity and realistic estimation of differential gene expression profiles among rice accessions. The genomic information generated in our study is made publicly accessible through a user-friendly web-resource, “Oryza ISM-ILP marker” database. The known/candidate gene-derived ISM and ILP markers can be enormously deployed to identify functionally relevant trait-associated molecular tags by optimal-resource expenses, leading towards genomics-assisted crop improvement in rice. PMID:27032371

  19. Whole genome re-sequencing of date palms yields insights into diversification of a fruit tree crop.

    PubMed

    Hazzouri, Khaled M; Flowers, Jonathan M; Visser, Hendrik J; Khierallah, Hussam S M; Rosas, Ulises; Pham, Gina M; Meyer, Rachel S; Johansen, Caryn K; Fresquez, Zoë A; Masmoudi, Khaled; Haider, Nadia; El Kadri, Nabila; Idaghdour, Youssef; Malek, Joel A; Thirkhill, Deborah; Markhand, Ghulam S; Krueger, Robert R; Zaid, Abdelouahhab; Purugganan, Michael D

    2015-11-09

    Date palms (Phoenix dactylifera) are the most significant perennial crop in arid regions of the Middle East and North Africa. Here, we present a comprehensive catalogue of approximately seven million single nucleotide polymorphisms in date palms based on whole genome re-sequencing of a collection of 62 cultivars. Population structure analysis indicates a major genetic divide between North Africa and the Middle East/South Asian date palms, with evidence of admixture in cultivars from Egypt and Sudan. Genome-wide scans for selection suggest at least 56 genomic regions associated with selective sweeps that may underlie geographic adaptation. We report candidate mutations for trait variation, including nonsense polymorphisms and presence/absence variation in gene content in pathways for key agronomic traits. We also identify a copia-like retrotransposon insertion polymorphism in the R2R3 myb-like orthologue of the oil palm virescens gene associated with fruit colour variation. This analysis documents patterns of post-domestication diversification and provides a genomic resource for this economically important perennial tree crop.

  20. Whole genome re-sequencing of date palms yields insights into diversification of a fruit tree crop

    PubMed Central

    Hazzouri, Khaled M.; Flowers, Jonathan M.; Visser, Hendrik J.; Khierallah, Hussam S. M.; Rosas, Ulises; Pham, Gina M.; Meyer, Rachel S.; Johansen, Caryn K.; Fresquez, Zoë A.; Masmoudi, Khaled; Haider, Nadia; El Kadri, Nabila; Idaghdour, Youssef; Malek, Joel A.; Thirkhill, Deborah; Markhand, Ghulam S.; Krueger, Robert R.; Zaid, Abdelouahhab; Purugganan, Michael D.

    2015-01-01

    Date palms (Phoenix dactylifera) are the most significant perennial crop in arid regions of the Middle East and North Africa. Here, we present a comprehensive catalogue of approximately seven million single nucleotide polymorphisms in date palms based on whole genome re-sequencing of a collection of 62 cultivars. Population structure analysis indicates a major genetic divide between North Africa and the Middle East/South Asian date palms, with evidence of admixture in cultivars from Egypt and Sudan. Genome-wide scans for selection suggest at least 56 genomic regions associated with selective sweeps that may underlie geographic adaptation. We report candidate mutations for trait variation, including nonsense polymorphisms and presence/absence variation in gene content in pathways for key agronomic traits. We also identify a copia-like retrotransposon insertion polymorphism in the R2R3 myb-like orthologue of the oil palm virescens gene associated with fruit colour variation. This analysis documents patterns of post-domestication diversification and provides a genomic resource for this economically important perennial tree crop. PMID:26549859

  1. Genetic Diversity and Demographic History of Cajanus spp. Illustrated from Genome-Wide SNPs

    PubMed Central

    Saxena, Rachit K.; von Wettberg, Eric; Upadhyaya, Hari D.; Sanchez, Vanessa; Songok, Serah; Saxena, Kulbhushan; Kimurto, Paul; Varshney, Rajeev K.

    2014-01-01

    Understanding genetic structure of Cajanus spp. is essential for achieving genetic improvement by quantitative trait loci (QTL) mapping or association studies and use of selected markers through genomic assisted breeding and genomic selection. After developing a comprehensive set of 1,616 single nucleotide polymorphism (SNPs) and their conversion into cost effective KASPar assays for pigeonpea (Cajanus cajan), we studied levels of genetic variability both within and between diverse set of Cajanus lines including 56 breeding lines, 21 landraces and 107 accessions from 18 wild species. These results revealed a high frequency of polymorphic SNPs and relatively high level of cross-species transferability. Indeed, 75.8% of successful SNP assays revealed polymorphism, and more than 95% of these assays could be successfully transferred to related wild species. To show regional patterns of variation, we used STRUCTURE and Analysis of Molecular Variance (AMOVA) to partition variance among hierarchical sets of landraces and wild species at either the continental scale or within India. STRUCTURE separated most of the domesticated germplasm from wild ecotypes, and separates Australian and Asian wild species as has been found previously. Among Indian regions and states within regions, we found 36% of the variation between regions, and 64% within landraces or wilds within states. The highest level of polymorphism in wild relatives and landraces was found in Madhya Pradesh and Andhra Pradesh provinces of India representing the centre of origin and domestication of pigeonpea respectively. PMID:24533111

  2. Structure and polymorphism of the mouse myelin/oligodendrocyte glycoprotein gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daubas, P.; Pham-Dinh, D.; Dautigny, A.

    1994-09-01

    The authors have isolated and characterized genomic clones containing the mouse myelin/oligodendrocyte glycoprotein (MOG) gene. It spans a region of 12.5 kb and consists of eight exons. Its exon-intron structure differs from that of classical MHC-class I genes, with which it is linked in the mouse genome. Nucleotide sequencing of the 5′ flanking region reveals that it contains several putative protein-binding sites, some of them in common with other myelin gene promoters. One intragenic polymorphism has been identified: it consists of a GA repeat, defining at least three alleles in mouse inbred strains, and is easily detectable using the polymerase chain reaction method.

  3. Molecular genetics and genomics of the Rosoideae: state of the art and future perspectives

    PubMed Central

    Longhi, Sara; Giongo, Lara; Buti, Matteo; Surbanovski, Nada; Viola, Roberto; Velasco, Riccardo; Ward, Judson A; Sargent, Daniel J

    2014-01-01

    The Rosoideae is a subfamily of the Rosaceae that contains a number of species of economic importance, including the soft fruit species strawberry (Fragaria ×ananassa), red (Rubus idaeus) and black (Rubus occidentalis) raspberries, blackberries (Rubus spp.) and one of the most economically important cut flower genera, the roses (Rosa spp.). Molecular genetics and genomics resources for the Rosoideae have developed rapidly over the past two decades, beginning with the development and application of a number of molecular marker types including restriction fragment length polymorphisms, amplified fragment length polymorphisms and microsatellites, and culminating in the recent publication of the genome sequence of the woodland strawberry, Fragaria vesca, and the development of high throughput single nucleotide polymorphism (SNP)-genotyping resources for Fragaria, Rosa and Rubus. These tools have been used to identify genes and other functional elements that control traits of economic importance, to study the evolution of plant genome structure within the subfamily, and are beginning to facilitate genomic-assisted breeding through the development and deployment of markers linked to traits such as aspects of fruit quality, disease resistance and the timing of flowering. In this review, we report on the developments that have been made over the last 20 years in the field of molecular genetics and structural genomics within the Rosoideae, comment on how the knowledge gained will improve the efficiency of cultivar development and discuss how these advances will enhance our understanding of the biological processes determining agronomically important traits in all Rosoideae species. PMID:26504527

  4. Read count-based method for high-throughput allelic genotyping of transposable elements and structural variants.

    PubMed

    Kuhn, Alexandre; Ong, Yao Min; Quake, Stephen R; Burkholder, William F

    2015-07-08

    Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
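
The idea of a read count-based genotype call can be sketched as follows. This is only an illustration under stated assumptions and is not the authors' published procedure. The function name, the minimum depth, and the allele-fraction thresholds are all hypothetical values chosen for the example.

```python
def call_genotype(insertion_reads, empty_site_reads,
                  min_depth=20, het_low=0.2, het_high=0.8):
    """Call a biallelic insertion genotype from amplicon read counts.

    insertion_reads: reads supporting the insertion (breakpoint) amplicon.
    empty_site_reads: reads supporting the empty-site amplicon.
    The thresholds are illustrative assumptions, not values from the study.
    """
    depth = insertion_reads + empty_site_reads
    if depth < min_depth:
        return "no-call"  # too few reads to call with confidence
    fraction = insertion_reads / depth
    if fraction < het_low:
        return "0/0"  # homozygous for the empty site
    if fraction > het_high:
        return "1/1"  # homozygous for the insertion
    return "0/1"  # heterozygous
```

In practice the thresholds would need calibration against samples of known genotype, because amplification efficiency can differ between the two amplicons.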

  5. Extensive Copy-Number Variation of Young Genes across Stickleback Populations

    PubMed Central

    Eizaguirre, Christophe; Samonte, Irene E.; Kalbe, Martin; Lenz, Tobias L.; Stoll, Monika; Bornberg-Bauer, Erich; Milinski, Manfred; Reusch, Thorsten B. H.

    2014-01-01

    Duplicate genes emerge as copy-number variations (CNVs) at the population level, and remain copy-number polymorphic until they are fixed or lost. The successful establishment of such structural polymorphisms in the genome plays an important role in evolution by promoting genetic diversity, complexity and innovation. To characterize the early evolutionary stages of duplicate genes and their potential adaptive benefits, we combine comparative genomics with population genomics analyses to evaluate the distribution and impact of CNVs across natural populations of an eco-genomic model, the three-spined stickleback. With whole genome sequences of 66 individuals from populations inhabiting three distinct habitats, we find that CNVs generally occur at low frequencies and are often only found in one of the 11 populations surveyed. A subset of CNVs, however, displays copy-number differentiation between populations, showing elevated within-population frequencies consistent with local adaptation. By comparing teleost genomes to identify lineage-specific genes and duplications in sticklebacks, we highlight rampant gene content differences among individuals in which over 30% of young duplicate genes are CNVs. These CNV genes are evolving rapidly at the molecular level and are enriched with functional categories associated with environmental interactions, depicting the dynamic early copy-number polymorphic stage of genes during population differentiation. PMID:25474574

  6. Herbicide targets and detoxification proteins in sugarcane: from gene assembly to structure modelling.

    PubMed

    Lloyd Evans, Dyfed; Joshi, Shailesh Vinay

    2017-07-01

    In a genome context, sugarcane is a classic orphan crop, in that no genome and only very few genes have been assembled. We have devised a novel exome assembly methodology that has allowed us to assemble and characterize 49 genes that serve as herbicide targets, safener interacting proteins, and members of herbicide detoxification pathways within the sugarcane genome. We have structurally modelled the products of each of these genes, as well as determining allelic, genomic, and RNA-Seq based polymorphisms for each gene. This study provides the largest collection of sugarcane structures modelled to date. We demonstrate that sugarcane genes are highly polymorphic, revealing that each genotype is evolving both uniquely and independently. In addition, we present an exome assembly system for orphan crops that can be executed on commodity infrastructure, making exome assembly practical for any group. In terms of knowledge about herbicide modes of action and detoxification, we have advanced sugarcane from a crop where no information about any herbicide-associated gene was available to the situation where sugarcane is now a species with the single largest collection of known and annotated herbicide-associated genes.

  7. Rapid isolation of microsatellite DNAs and identification of polymorphic mitochondrial DNA regions in the fish rotan (Perccottus glenii) invading European Russia

    USGS Publications Warehouse

    King, Timothy L.; Eackles, Michael S.; Reshetnikov, Andrey N.

    2015-01-01

    Human-mediated translocations and subsequent large-scale colonization by the invasive fish rotan (Perccottus glenii Dybowski, 1877; Perciformes, Odontobutidae), also known as Amur or Chinese sleeper, have resulted in dramatic transformations of small lentic ecosystems. However, no detailed genetic information exists on population structure, levels of effective movement, or relatedness among geographic populations of P. glenii within the European part of the range. We used massively parallel genomic DNA shotgun sequencing on the semiconductor-based Ion Torrent Personal Genome Machine (PGM) sequencing platform to identify nuclear microsatellite and mitochondrial DNA sequences in P. glenii from European Russia. Here we describe the characterization of nine nuclear microsatellite loci, ascertain levels of allelic diversity, heterozygosity, and demographic status of P. glenii collected from Ilev, Russia, one of several initial introduction points in European Russia. In addition, we mapped sequence reads to the complete P. glenii mitochondrial DNA sequence to identify polymorphic regions. Nuclear microsatellite markers developed for P. glenii yielded sufficient genetic diversity to: (1) produce unique multilocus genotypes; (2) elucidate structure among geographic populations; and (3) provide unique perspectives for analysis of population sizes and historical demographics. Among 4.9 million filtered P. glenii Ion Torrent PGM sequence reads, 11,304 mapped to the mitochondrial genome (NC_020350). This resulted in 100% coverage of this genome to a mean coverage depth of 102X. A total of 130 variable sites were observed between the publicly available genome from China and the studied composite mitochondrial genome. Among these, 82 were diagnostic and monomorphic between the mitochondrial genomes and distributed among 15 genome regions. The polymorphic sites (N = 48) were distributed among 11 mitochondrial genome regions. Our results also indicate that sequence reads generated from two three-hour runs on the Ion Torrent PGM can generate a sufficient number of nuclear and mitochondrial markers to improve understanding of the evolutionary and ecological dynamics of non-model and in particular, invasive species.

  8. Genomic Variation in Natural Populations of Drosophila melanogaster

    PubMed Central

    Langley, Charles H.; Stevens, Kristian; Cardeno, Charis; Lee, Yuh Chwen G.; Schrider, Daniel R.; Pool, John E.; Langley, Sasha A.; Suarez, Charlyn; Corbett-Detig, Russell B.; Kolaczkowski, Bryan; Fang, Shu; Nista, Phillip M.; Holloway, Alisha K.; Kern, Andrew D.; Dewey, Colin N.; Song, Yun S.; Hahn, Matthew W.; Begun, David J.

    2012-01-01

    This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5′- and 3′-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species. PMID:22673804
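
Linkage disequilibrium between two biallelic sites is commonly summarized as r². The sketch below computes it from phased haplotypes and assumes alleles coded as 0/1 integers. The function name and the example vectors are hypothetical and do not come from the study.

```python
def ld_r2(site_a, site_b):
    """r^2 between two biallelic sites given phased 0/1 alleles per haplotype.

    Returns None when either site is monomorphic, where r^2 is undefined.
    """
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return None
    return d * d / denominator
```

Plotting r² for pairs of sites against the physical distance between them would give the kind of decay curve that the abstract summarizes as LD decaying within 1 kbp.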

  9. Construction of Pseudomolecule Sequences of the aus Rice Cultivar Kasalath for Comparative Genomics of Asian Cultivated Rice

    PubMed Central

    Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong

    2014-01-01

    Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372

  10. Mice, humans and haplotypes--the hunt for disease genes in SLE.

    PubMed

    Rigby, R J; Fernando, M M A; Vyse, T J

    2006-09-01

    Defining the polymorphisms that contribute to the development of complex genetic disease traits is a challenging, although increasingly tractable problem. Historically, the technical difficulties in conducting association studies across the entire human genome are such that murine models have been used to generate candidate genes for analysis in human complex diseases, such as SLE. In this article we discuss the advantages and disadvantages of this approach and specifically address some assumptions made in the transition from studying one species to another, using lupus as an example. These issues include differences in genetic structure and genetic organisation which are a reflection on the population history. Clearly there are major differences in the histories of the human population and inbred laboratory strains of mice. Both human and murine genomes do exhibit structure at the genetic level. That is to say, they comprise haplotypes which are genomic regions that carry runs of polymorphisms that are not independently inherited. Haplotypes therefore reduce the number of combinations of the polymorphisms in the DNA in that region and facilitate the identification of disease susceptibility genes in both mice and humans. There are now novel means of generating candidate genes in SLE using mutagenesis (with ENU) in mice and identifying mice that generate antinuclear autoimmunity. In addition, murine models still provide a valuable means of exploring the functional consequences of genetic variation. However, advances in technology are such that human geneticists can now screen large fractions of the human genome for disease associations using microchip technologies that provide information on upwards of 100,000 different polymorphisms. These approaches are aimed at identifying haplotypes that carry disease susceptibility mutations and rely less on the generation of candidate genes.

  11. Hybridization capture reveals evolution and conservation across the entire Koala retrovirus genome.

    PubMed

    Tsangaras, Kyriakos; Siracusa, Matthew C; Nikolaidis, Nikolas; Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M; Roca, Alfred L; Greenwood, Alex D

    2014-01-01

    The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin.

  12. Hybridization Capture Reveals Evolution and Conservation across the Entire Koala Retrovirus Genome

    PubMed Central

    Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M.; Roca, Alfred L.; Greenwood, Alex D.

    2014-01-01

    The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin. PMID:24752422

  13. In silico screening of the chicken genome for overlaps between genomic regions: microRNA genes, coding and non-coding transcriptional units, QTL, and genetic variations.

    PubMed

    Zorc, Minja; Kunej, Tanja

    2016-05-01

    MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5%) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region. Identified miRNA-related genomic hotspots in chicken can serve researchers as a starting point for further functional studies and association studies with poultry production and health traits and the basis for systematic screening of exonic miRNAs and missense/miRNA seed polymorphisms in other genomes.
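
Screening for overlaps between annotated genomic regions comes down to an interval-intersection test. The sketch below is a generic illustration and does not reproduce the tools listed in the abstract. The `Interval` type, the function names, and every coordinate and feature name in it are hypothetical placeholders.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def overlaps(a, b):
    """True when two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def find_overlaps(queries, targets):
    """Return (query name, target name) pairs for every overlapping pair."""
    return [(q.name, t.name) for q in queries for t in targets if overlaps(q, t)]


# Hypothetical placeholder features; not real chicken annotations.
mirna_genes = [Interval("chr1", 1000, 1090, "mirna_1"),
               Interval("chr2", 500, 580, "mirna_2")]
exons = [Interval("chr1", 950, 1200, "exon_A"),
         Interval("chr2", 600, 700, "exon_B")]
```

The nested loop is quadratic in the number of features, which is acceptable for a few hundred miRNA genes. Genome-scale feature sets are usually handled with sorted sweeps or interval trees instead.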

  14. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

    PubMed Central

    Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

    2015-01-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089

  15. Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

    PubMed Central

    Vembar, Shruthi Sridhar; Seetin, Matthew; Lambert, Christine; Nattestad, Maria; Schatz, Michael C.; Baybayan, Primo; Scherf, Artur; Smith, Melissa Laird

    2016-01-01

    The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (Ex: var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission. PMID:27345719

  16. Natural Allelic Diversity, Genetic Structure and Linkage Disequilibrium Pattern in Wild Chickpea

    PubMed Central

    Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

    2014-01-01

    Characterization of natural allelic diversity and understanding the genetic structure and linkage disequilibrium (LD) pattern in wild germplasm accessions by large-scale genotyping of informative microsatellite and single nucleotide polymorphism (SNP) markers is requisite to facilitate chickpea genetic improvement. Large-scale validation and high-throughput genotyping of genome-wide physically mapped 478 genic and genomic microsatellite markers and 380 transcription factor gene-derived SNP markers using gel-based assay, fluorescent dye-labelled automated fragment analyser and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass array have been performed. Outcome revealed their high genotyping success rate (97.5%) and existence of a high level of natural allelic diversity among 94 wild and cultivated Cicer accessions. High intra- and inter-specific polymorphic potential and wider molecular diversity (11–94%) along with a broader genetic base (13–78%) specifically in the functional genic regions of wild accessions was assayed by mapped markers. It suggested their utility in monitoring introgression and transferring target trait-specific genomic (gene) regions from wild to cultivated gene pool for the genetic enhancement. Distinct species/gene pool-wise differentiation, admixed domestication pattern, and differential genome-wide recombination and LD estimates/decay observed in a six structured population of wild and cultivated accessions using mapped markers further signifies their usefulness in chickpea genetics, genomics and breeding. PMID:25222488

  17. Gene Presence-Absence Polymorphism in Castrating Anther-Smut Fungi: Recent Gene Gains and Phylogeographic Structure.

    PubMed

    Hartmann, Fanny E; Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana

    2018-04-01

    Gene presence-absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence-absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence-absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence-absence polymorphism in the two species. Genes displaying presence-absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence-absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence-absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies.

  18. Gene Presence–Absence Polymorphism in Castrating Anther-Smut Fungi: Recent Gene Gains and Phylogeographic Structure

    PubMed Central

    Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana

    2018-01-01

    Gene presence–absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence–absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence–absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence–absence polymorphism in the two species. Genes displaying presence–absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence–absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence–absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies. PMID:29722826

  19. Mitochondrial pathogenic mutations are population-specific.

    PubMed

    Breen, Michael S; Kondrashov, Fyodor A

    2010-12-31

    Surveying deleterious variation in human populations is crucial for our understanding, diagnosis and potential treatment of human genetic pathologies. A number of recent genome-wide analyses focused on the prevalence of segregating deleterious alleles in the nuclear genome. However, such studies have not been conducted for the mitochondrial genome. We present a systematic survey of polymorphisms in the human mitochondrial genome, including those predicted to be deleterious and those that correspond to known pathogenic mutations. Analyzing 4458 completely sequenced mitochondrial genomes we characterize the genetic diversity of different types of single nucleotide polymorphisms (SNPs) in African (L haplotypes) and non-African (M and N haplotypes) populations. We find that the overall level of polymorphism is higher in the mitochondrial compared to the nuclear genome, although the mitochondrial genome appears to be under stronger selection as indicated by proportionally fewer nonsynonymous than synonymous substitutions. The African mitochondrial genomes show higher heterozygosity, a greater number of polymorphic sites and higher frequencies of polymorphisms for synonymous, benign and damaging polymorphism than non-African genomes. However, African genomes carry significantly fewer SNPs that have been previously characterized as pathogenic compared to non-African genomes. Finding SNPs classified as pathogenic to be the only category of polymorphisms that are more abundant in non-African genomes is best explained by a systematic ascertainment bias that favours the discovery of pathogenic polymorphisms segregating in non-African populations. This further suggests that, contrary to the common disease-common variant hypothesis, pathogenic mutations are largely population-specific and different SNPs may be associated with the same disease in different populations. Therefore, to obtain a comprehensive picture of the deleterious variability in the human population, as well as to improve the diagnostics of individuals carrying African mitochondrial haplotypes, it is necessary to survey different populations independently. This article was reviewed by Dr Mikhail Gelfand, Dr Vasily Ramensky (nominated by Dr Eugene Koonin) and Dr David Rand (nominated by Dr Laurence Hurst).

  20. Pyrosequencing of the northern red oak (Quercus rubra L.) chloroplast genome reveals high quality polymorphisms for population management

    Treesearch

    Lisa W. Alexander; Keith E. Woeste

    2014-01-01

    Given the low intraspecific chloroplast diversity detected in northern red oak (Quercus rubra L.), more powerful genetic tools are necessary to accurately characterize Q. rubra chloroplast diversity and structure. We report the sequencing, assembly, and annotation of the chloroplast genome of northern red oak via pyrosequencing and...

  1. Genome-wide divergence and linkage disequilibrium analyses for Capsicum baccatum revealed by genome-anchored single nucleotide polymorphisms

    USDA-ARS's Scientific Manuscript database

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...

  2. Characterization of six human disease-associated inversion polymorphisms.

    PubMed

    Antonacci, Francesca; Kidd, Jeffrey M; Marques-Bonet, Tomas; Ventura, Mario; Siswara, Priscillia; Jiang, Zhaoshi; Eichler, Evan E

    2009-07-15

    The human genome is a highly dynamic structure that shows a wide range of genetic polymorphic variation. Unlike other types of structural variation, little is known about inversion variants within normal individuals because such events are typically balanced and are difficult to detect and analyze by standard molecular approaches. Using sequence-based, cytogenetic and genotyping approaches, we characterized six large inversion polymorphisms that map to regions associated with genomic disorders with complex segmental duplications mapping at the breakpoints. We developed a metaphase FISH-based assay to genotype inversions and analyzed the chromosomes of 27 individuals from three HapMap populations. In this subset, we find that these inversions are less frequent or absent in Asians when compared with European and Yoruban populations. Analyzing multiple individuals from outgroup species of great apes, we show that most of these large inversion polymorphisms are specific to the human lineage with two exceptions, 17q21.31 and 8p23 inversions, which are found to be similarly polymorphic in other great ape species and where the inverted allele represents the ancestral state. Investigating linkage disequilibrium relationships with genotyped SNPs, we provide evidence that most of these inversions appear to have arisen on at least two different haplotype backgrounds. In these cases, discovery and genotyping methods based on SNPs may be confounded and molecular cytogenetics remains the only method to genotype these inversions.

  3. Genome comparison of two Magnaporthe oryzae field isolates reveals genome variations and potential virulence effectors

    PubMed Central

    2013-01-01

    Background: Rice blast, caused by the fungus Magnaporthe oryzae, is an important disease in virtually every rice growing region of the world, which leads to significant annual decreases of grain quality and yield. To prevent disease, resistance genes in rice have been cloned and introduced into susceptible cultivars. However, introduced resistance can often be broken within a few years of release, often due to mutation of cognate avirulence genes in fungal field populations. Results: To better understand the pattern of mutation of M. oryzae field isolates under natural selection forces, we used a next generation sequencing approach to analyze the genomes of two field isolates FJ81278 and HN19311, as well as the transcriptome of FJ81278. By comparing the de novo genome assemblies of the two isolates against the finished reference strain 70–15, we identified extensive polymorphisms including unique genes, SNPs (single nucleotide polymorphisms) and indels, structural variations, copy number variations, and loci under strong positive selection. We also identified 1.75 MB of isolate-specific genome content carrying 118 novel genes in FJ81278, and 0.83 MB in HN19311. By analyzing secreted proteins carrying polymorphisms, in total 256 candidate virulence effectors were found and 6 were chosen for functional characterization. Conclusions: We provide results from genome comparison analysis showing extensive genome variation, and generated a list of M. oryzae candidate virulence effectors for functional characterization. PMID:24341723

  4. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies

    PubMed Central

    Gimode, Davis; Odeny, Damaris A.; de Villiers, Etienne P.; Wanyonyi, Solomon; Dida, Mathews M.; Mneney, Emmarold E.; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M.

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. Our results also reveal an unexploited finger millet genetic resource that can be included in the regional breeding programs in order to efficiently optimize productivity. PMID:27454301
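
    The PIC values quoted in this record can be related to allele frequencies with a short calculation. The sketch below assumes the commonly used definition of polymorphism information content, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2; the abstract does not state which formula the authors applied, and the function name and example frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one marker.

    freqs: allele frequencies at the locus (hypothetical inputs; must sum to 1).
    Assumes the widely used definition:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# A biallelic SNP with equal allele frequencies is maximally informative.
balanced_snp = pic([0.5, 0.5])
# A skewed SNP carries much less information.
skewed_snp = pic([0.9, 0.1])
```

    Under this definition a biallelic marker cannot exceed a PIC of 0.375, which is consistent with the SNP means reported here (0.15 to 0.30) sitting below the multi-allelic SSR mean of 0.42.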

  5. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    PubMed

    Gimode, Davis; Odeny, Damaris A; de Villiers, Etienne P; Wanyonyi, Solomon; Dida, Mathews M; Mneney, Emmarold E; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. Our results also reveal an unexploited finger millet genetic resource that can be included in the regional breeding programs in order to efficiently optimize productivity.

  6. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale.

    PubMed

    Liu, Siyang; Huang, Shujia; Rao, Junhua; Ye, Weijian; Krogh, Anders; Wang, Jun

    2015-01-01

    Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure.

  7. Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance

    PubMed Central

    Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S

    2009-01-01

    In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455

  8. Cloning of polymorphisms (COP): enrichment of polymorphic sequences from complex genomes

    PubMed Central

    Li, Jingfeng; Wang, Fuli; Zabarovska, Veronika; Wahlestedt, Claes; Zabarovsky, Eugene R.

    2000-01-01

    Here we describe a new procedure (cloning of polymorphisms, COP) for enrichment of single nucleotide polymorphisms (SNPs) that represent restriction fragment length polymorphisms (RFLPs). COP would be applicable to the isolation of SNPs from particular regions of the genome, e.g. CpG islands, chromosomal bands, YACs or PAC contigs. A combination of digestion with restriction enzymes, treatment with uracil-DNA glycosylase and mung bean nuclease, PCR amplification and purification with streptavidin magnetic beads was used to isolate polymorphic sequences from the genomes of two human samples. After only two cycles of enrichment, 80% of the isolated clones were found to contain RFLPs. A simple method for the PCR detection of these polymorphisms was also developed. PMID:10606669

  9. Transposon Insertions, Structural Variations, and SNPs Contribute to the Evolution of the Melon Genome.

    PubMed

    Sanseverino, Walter; Hénaff, Elizabeth; Vives, Cristina; Pinosio, Sara; Burgos-Paz, William; Morgante, Michele; Ramos-Onsins, Sebastián E; Garcia-Mas, Jordi; Casacuberta, Josep Maria

    2015-10-01

    The availability of extensive databases of crop genome sequences should allow analysis of crop variability at an unprecedented scale, which should have an important impact in plant breeding. However, up to now the analysis of genetic variability at the whole-genome scale has been mainly restricted to single nucleotide polymorphisms (SNPs). This is a strong limitation as structural variation (SV) and transposon insertion polymorphisms are frequent in plant species and have had an important mutational role in crop domestication and breeding. Here, we present the first comprehensive analysis of melon genetic diversity, which includes a detailed analysis of SNPs, SV, and transposon insertion polymorphisms. The variability found among seven melon varieties representing the species diversity and including wild accessions and highly bred lines, is relatively high due in part to the marked divergence of some lineages. The diversity is distributed nonuniformly across the genome, being lower at the extremes of the chromosomes and higher in the pericentromeric regions, which is compatible with the effect of purifying selection and recombination forces over functional regions. Additionally, this variability is greatly reduced among elite varieties, probably due to selection during breeding. We have found some chromosomal regions showing a high differentiation of the elite varieties versus the rest, which could be considered as strongly selected candidate regions. Our data also suggest that transposons and SV may be at the origin of an important fraction of the variability in melon, which highlights the importance of analyzing all types of genetic variability to understand crop genome evolution.

  10. Defining Genetic Risk for GVHD and Mortality Following Allogeneic Hematopoietic Stem Cell Transplantation

    PubMed Central

    Hansen, John A; Chien, Jason W; Warren, Edus H; Zhao, Lue Ping; Martin, Paul J

    2011-01-01

    Purpose of review: To explore what is known about the genetics of hematopoietic stem cell transplantation (HCT) and how genetic polymorphism affects risk of graft-versus-host disease (GVHD) and mortality. Recent findings: Genetic variation found across the human genome can impact HCT outcome by 1) causing genetic disparity between patient and donor, and 2) modifying gene function. Single nucleotide polymorphisms (SNPs) and structural variation can result in mismatching for cellular peptides known as histocompatibility antigens (HA). At least 25 to 30 polymorphic genes are known to encode functional HA in mismatched individuals, but their individual contribution to clinical GVHD is unclear. HCT outcome may also be affected by polymorphism in donor or recipient. Association studies have implicated several genes with GVHD and mortality; however, results have been inconsistent, most likely due to limited sample size and differences in racial diversity and clinical covariates. New technologies using DNA arrays genotyping for a million or more SNPs promise genome-wide discovery of HCT-associated genes; however, adequate statistical power requires study populations of several thousand patient-donor pairs. Summary: Available data offer strong preliminary support for the impact that genetic variation has on risk of GVHD and mortality following HCT. Definitive results, however, await future genome-wide studies of large multi-center HCT cohorts. PMID:20827186

  11. Genomic Epidemiology of Salmonella enterica Serotype Enteritidis based on Population Structure of Prevalent Lineages

    PubMed Central

    Desai, Prerak T.; den Bakker, Henk C.; Mikoleit, Matthew; Tolar, Beth; Trees, Eija; Hendriksen, Rene S.; Frye, Jonathan G.; Porwollik, Steffen; Weimer, Bart C.; Wiedmann, Martin; Weinstock, George M.; Fields, Patricia I.; McClelland, Michael

    2014-01-01

    Salmonella enterica serotype Enteritidis is one of the most commonly reported causes of human salmonellosis. Its low genetic diversity, measured by fingerprinting methods, has made subtyping a challenge. We used whole-genome sequencing to characterize 125 S. enterica Enteritidis and 3 S. enterica serotype Nitra strains. Single-nucleotide polymorphisms were filtered to identify 4,887 reliable loci that distinguished all isolates from each other. Our whole-genome single-nucleotide polymorphism typing approach was robust for S. enterica Enteritidis subtyping with combined data for different strains from 2 different sequencing platforms. Five major genetic lineages were recognized, which revealed possible patterns of geographic and epidemiologic distribution. Analyses on the population dynamics and evolutionary history estimated that major lineages emerged during the 17th–18th centuries and diversified during the 1920s and 1950s. PMID:25147968

  12. Identification of new polymorphic regions and differentiation of cultivated olives (Olea europaea L.) through plastome sequence comparison

    PubMed Central

    2010-01-01

    Background: The cultivated olive (Olea europaea L.) is the most agriculturally important species of the Oleaceae family. Although many studies have been performed on plastid polymorphisms to evaluate taxonomy, phylogeny and phylogeography of Olea subspecies, only a few polymorphic regions discriminating among the agronomically and economically important olive cultivars have been identified. The objective of this study was to sequence the entire plastome of olive and analyze many potential polymorphic regions to develop new inter-cultivar genetic markers. Results: The complete plastid genome of the olive cultivar Frantoio was determined by direct sequence analysis using universal and novel PCR primers designed to amplify all overlapping regions. The chloroplast genome of the olive has an organisation and gene order that is conserved among numerous Angiosperm species and does not contain any of the inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses that have been found in the chloroplast genomes of the genera Jasminum and Menodora, from the same family as Olea. The annotated sequence was used to evaluate the content of coding genes, the extent and distribution of repeated and long dispersed sequences, and the nucleotide composition pattern. These analyses provided essential information for structural, functional and comparative genomic studies in olive plastids. Furthermore, the alignment of the olive plastome sequence to those of other varieties and species identified 30 new organellar polymorphisms within the cultivated olive. Conclusions: In addition to identifying mutations that may play a functional role in modifying the metabolism and adaptation of olive cultivars, the new chloroplast markers represent a valuable tool to assess the level of olive intercultivar plastome variation for use in population genetic analysis, phylogenesis, cultivar characterisation and DNA food tracking. PMID:20868482

  13. Cross-genera transferability of rice and finger millet genomic SSRs to barnyard millet (Echinochloa spp.).

    PubMed

    Kalyana Babu, B; Sood, Salej; Kumar, Dinesh; Joshi, Anjeli; Pattanayak, A; Kant, Lakshmi; Upadhyaya, H D

    2018-02-01

    Barnyard millet (Echinochloa spp.) is a nutritionally important crop, but genetic information on it is very scarce. In the present investigation, rice and finger millet genomic SSRs were used for assessing cross transferability, identification of polymorphic markers, syntenic regions, genetic diversity and population structure analysis of barnyard millet genotypes. We observed 100% cross transferability for finger millet SSRs, of which 91% were polymorphic, while 71% of rice markers were cross transferable, of which 48% were polymorphic. Twenty-nine and sixteen highly polymorphic finger millet and rice SSRs yielded a mean of 4.3 and 3.38 alleles per locus in barnyard millet genotypes, respectively. The PIC values varied from 0.27 to 0.73 with an average of 0.54 for finger millet SSRs, and from 0.15 to 0.67 with an average of 0.44 for rice SSRs. High synteny was observed for markers related to panicle length, yield-related traits, spikelet fertility, plant height, root traits, leaf senescence, blast and brown plant hopper resistance. Although the rice SSRs located on chromosome 10 followed by chromosome 6 and 11 were found to be more transferable to barnyard millet, the finger millet SSRs were more polymorphic and transferable to barnyard millet genotypes. These SSR data of finger millet and rice, individually as well as combined, grouped the 11 barnyard millet genotypes into 2 major clusters. The results of population structure analysis were similar to cluster analysis.

  14. The Evolution and Functional Impact of Human Deletion Variants Shared with Archaic Hominin Genomes

    PubMed Central

    Lin, Yen-Lung; Pavlidis, Pavlos; Karakoc, Emre; Ajay, Jerry; Gokcumen, Omer

    2015-01-01

    Allele sharing between modern and archaic hominin genomes has been variously interpreted to have originated from ancestral genetic structure or through non-African introgression from archaic hominins. However, evolution of polymorphic human deletions that are shared with archaic hominin genomes has yet to be studied. We identified 427 polymorphic human deletions that are shared with archaic hominin genomes, approximately 87% of which originated before the Human–Neandertal divergence (ancient) and only approximately 9% of which have been introgressed from Neandertals (introgressed). Recurrence, incomplete lineage sorting between human and chimp lineages, and hominid-specific insertions constitute the remaining approximately 4% of allele sharing between humans and archaic hominins. We observed that ancient deletions correspond to more than 13% of all common (>5% allele frequency) deletion variation among modern humans. Our analyses indicate that the genomic landscapes of both ancient and introgressed deletion variants were primarily shaped by purifying selection, eliminating large and exonic variants. We found 17 exonic deletions that are shared with archaic hominin genomes, including those leading to three fusion transcripts. The affected genes are involved in metabolism of external and internal compounds, growth and sperm formation, as well as susceptibility to psoriasis and Crohn’s disease. Our analyses suggest that these “exonic” deletion variants have evolved through different adaptive forces, including balancing and population-specific positive selection. Our findings reveal that genomic structural variants that are shared between humans and archaic hominin genomes are common among modern humans and can influence biomedically and evolutionarily important phenotypes. PMID:25556237

  15. Distribution and localization of microsatellites in the Perigord black truffle genome and identification of new molecular markers (2010) Fungal Genetics and Biology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murat, Claude; Riccioni, C; Belfiori, B

    The level of genetic diversity and genetic structure in the Perigord black truffle (Tuber melanosporum Vittad.) has been debated for several years, mainly due to the lack of appropriate genetic markers. Microsatellites or simple sequence repeats (SSRs) are important for genome organisation and phenotypic diversity and are among the most popular molecular markers. In this study, we surveyed the T. melanosporum genome (1) to characterise its SSR pattern; (2) to compare it with SSR patterns found in 48 other fungal and three oomycetes genomes and (3) to identify new polymorphic SSR markers for population genetics. The T. melanosporum genome is rich in SSRs, with 22,425 SSRs, mononucleotides being the most frequent motifs. SSRs were found in all genomic regions although they are more frequent in non-coding regions (introns and intergenic regions). Sixty out of 135 PCR-amplified mono-, di-, tri-, tetra-, penta-, and hexanucleotides were polymorphic (44%) within black truffle populations and 27 were randomly selected and analysed on 139 T. melanosporum isolates from France, Italy and Spain. The number of alleles varied from 2 to 18 and the expected heterozygosity from 0.124 to 0.815. One hundred and thirty-two different multilocus genotypes out of the 139 T. melanosporum isolates were identified and the genotypic diversity was high (0.999). Polymorphic SSRs were found in UTR regulatory regions of fruiting bodies and ectomycorrhiza regulated genes, suggesting that they may play a role in phenotypic variation. In conclusion, SSRs developed in this study were highly polymorphic and our results showed that T. melanosporum is a species with an important genetic diversity, which is in agreement with its recently uncovered heterothallic mating system.

  16. Careful with That Axe, Gene, Genome Perturbation after a PEG-Mediated Protoplast Transformation in Fusarium verticillioides.

    PubMed

    Scala, Valeria; Grottoli, Alessandro; Aiese Cigliano, Riccardo; Anzar, Irantzu; Beccaccioli, Marzia; Fanelli, Corrado; Dall'Asta, Chiara; Battilani, Paola; Reverberi, Massimo; Sanseverino, Walter

    2017-05-31

    Fusarium verticillioides causes ear rot disease in maize and its contamination with fumonisins, mycotoxins harmful for humans and livestock. Lipids, and their oxidized forms, may drive the fate of this disease. In a previous study, we have explored the role of oxylipins in this interaction by deleting by standard transformation procedures a linoleate diol synthase-coding gene, lds1, in F. verticillioides. A profound phenotypic diversity in the mutants generated has prompted us to investigate more deeply the whole genome of two lds1-deleted strains. Bioinformatics analyses pinpoint significant differences in the genome sequences emerged between the wild type and the lds1-mutants further than those trivially attributable to the deletion of the lds1 locus, such as single nucleotide polymorphisms, small deletion/insertion polymorphisms and structural variations. Results suggest that the effect of a (theoretically) punctual transformation event might have enhanced the natural mechanisms of genomic variability and that transformation practices, commonly used in the reverse genetics of fungi, may potentially be responsible for unexpected, stochastic and henceforth off-target rearrangements throughout the genome.

  17. Careful with That Axe, Gene, Genome Perturbation after a PEG-Mediated Protoplast Transformation in Fusarium verticillioides

    PubMed Central

    Scala, Valeria; Grottoli, Alessandro; Aiese Cigliano, Riccardo; Anzar, Irantzu; Beccaccioli, Marzia; Fanelli, Corrado; Dall’Asta, Chiara; Battilani, Paola; Reverberi, Massimo; Sanseverino, Walter

    2017-01-01

    Fusarium verticillioides causes ear rot disease in maize and its contamination with fumonisins, mycotoxins harmful for humans and livestock. Lipids, and their oxidized forms, may drive the fate of this disease. In a previous study, we have explored the role of oxylipins in this interaction by deleting by standard transformation procedures a linoleate diol synthase-coding gene, lds1, in F. verticillioides. A profound phenotypic diversity in the mutants generated has prompted us to investigate more deeply the whole genome of two lds1-deleted strains. Bioinformatics analyses pinpoint significant differences in the genome sequences emerged between the wild type and the lds1-mutants further than those trivially attributable to the deletion of the lds1 locus, such as single nucleotide polymorphisms, small deletion/insertion polymorphisms and structural variations. Results suggest that the effect of a (theoretically) punctual transformation event might have enhanced the natural mechanisms of genomic variability and that transformation practices, commonly used in the reverse genetics of fungi, may potentially be responsible for unexpected, stochastic and henceforth off-target rearrangements throughout the genome. PMID:28561789

  18. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    PubMed Central

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
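    The density figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that the genome-wide density was obtained by dividing the total BAC contig length by the total polymorphism count, which the abstract does not state explicitly. The variable names are hypothetical.

```python
# Figures quoted in the abstract (variable names are illustrative).
contig_length_bp = 495_833_423     # total physical length of the BAC contigs
euchromatin_polys = 30_930         # polymorphisms mapped in euchromatin
heterochromatin_polys = 140_862    # polymorphisms mapped in heterochromatin

# The two compartments should add up to the reported total of 171,792.
total_polys = euchromatin_polys + heterochromatin_polys

# Assumed definition: covered base pairs per mapped polymorphism.
bp_per_poly = contig_length_bp / total_polys

print(total_polys)         # 171792
print(round(bp_per_poly))  # 2886
```

    Under that assumption the result agrees with the reported average of 1 polymorphism per 2,886 bp.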

  19. [Comparative analysis of ISSR markers polymorphism in populations of yak (Bos mutus) and in F1 hybrids between yak and cattle in the Sayan-Altai region].

    PubMed

    Stolpovsky, Yu A; Kol, N V; Evsyukov, A N; Nesteruk, L V; Dorzhu, Ch M; Tsendsuren, Ts; Sulimova, G E

    2014-10-01

    The genetic variability in seven yak populations from the Sayan-Altai region and in F1 hybrids between yak and cattle (khainags) was investigated with the help of a technique that involves the use of inter simple sequence repeat (ISSR) markers generated with PCR primers (AG)9C and (GA)9C. Samples for the analysis were collected in Mongolia, Tuva, and Altai from 2008 through 2012. The examined yak populations differed in the presence/absence of ISSR fragments, as well as in their frequency. In total, 46 ISSR fragments were identified using two marker systems; the proportion of polymorphic loci constituted 76% and 90% for the AG-ISSR and GA-ISSR markers, respectively. For the total sample of yaks, total genetic diversity (Ht), within-population diversity (Hs), and interpopulation diversity (Gst) constituted 0.081, 0.044, and 0.459 for the AG-ISSR and 0.137, 0.057, and 0.582 for the GA-ISSR markers, respectively. Based on ISSR fingerprinting, species- and breed-specific DNA patterns were described for the three groups of animals (yaks, cattle, khainags). For the domestic yak, the species-specific profile was represented by eight ISSR fragments. Genetic relationships between the yak populations, cattle breeds, and khainags were examined with the help of four different approaches used in the analysis of population structure: estimation of phylogenetic similarity, multidimensional scaling, principal component analysis, and cluster analysis. Clear evidence of the differentiation of the populations examined at the interspecific, as well as at the intraspecific, level was obtained. Similar (related), as well as remote (isolated), yak populations were identified. Khainags occupy an intermediate position between yak and cattle. However, the data on the ISSR-PCR marker polymorphism (genome polymorphism, population structure) indicate that part of the analyzed khainag genome was more similar to the yak genome than to the cattle genome.
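    The relationship between the three diversity statistics reported above can be illustrated with a short calculation. This is a sketch under an assumption: the abstract does not give the formula, so the code uses Nei's standard definition Gst = (Ht - Hs) / Ht. The function name nei_gst is hypothetical.

```python
def nei_gst(ht: float, hs: float) -> float:
    """Nei's coefficient of differentiation, assumed here as (Ht - Hs) / Ht."""
    if ht <= 0:
        raise ValueError("total diversity Ht must be positive")
    return (ht - hs) / ht

# Ht and Hs as quoted in the abstract for each marker system.
gst_ag = nei_gst(0.081, 0.044)  # AG-ISSR markers
gst_ga = nei_gst(0.137, 0.057)  # GA-ISSR markers

print(round(gst_ag, 3))  # 0.457 (abstract reports 0.459)
print(round(gst_ga, 3))  # 0.584 (abstract reports 0.582)
```

    The small differences from the reported values are consistent with Ht and Hs having been rounded to three decimals before publication, although the authors may also have used a different estimator.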

  20. Characterization of Capsicum annuum Genetic Diversity and Population Structure Based on Parallel Polymorphism Discovery with a 30K Unigene Pepper GeneChip

    PubMed Central

    Hill, Theresa A.; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W.; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum. PMID:23409153

  1. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    PubMed

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum.

  2. Polymorphism at codon 36 of the p53 gene.

    PubMed

    Felix, C A; Brown, D L; Mitsudomi, T; Ikagaki, N; Wong, A; Wasserman, R; Womer, R B; Biegel, J A

    1994-01-01

    A polymorphism at codon 36 in exon 4 of the p53 gene was identified by single strand conformation polymorphism (SSCP) analysis and direct sequencing of genomic DNA PCR products. The polymorphic allele, present in the heterozygous state in genomic DNAs of four of 100 individuals (4%), changes the codon 36 CCG to CCA, eliminates a FinI restriction site and creates a BccI site. Including this polymorphism there are four known polymorphisms in the p53 coding sequence.
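    The carrier count in this abstract translates into an allele frequency as follows. The sketch assumes a diploid autosomal locus and that no individual in the sample was homozygous for the variant (the abstract mentions only heterozygotes). The variable names are illustrative.

```python
# Counts quoted in the abstract.
individuals = 100
heterozygous_carriers = 4
homozygous_carriers = 0   # assumption: none reported in the abstract

# Each diploid individual contributes two copies of the gene.
chromosomes = 2 * individuals
variant_alleles = heterozygous_carriers + 2 * homozygous_carriers

carrier_frequency = heterozygous_carriers / individuals
allele_frequency = variant_alleles / chromosomes

print(carrier_frequency)  # 0.04
print(allele_frequency)   # 0.02
```

    Under these assumptions, the 4% figure in the abstract is the carrier frequency, and the corresponding variant allele frequency is 2%.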

  3. Scanning the Effects of Ethyl Methanesulfonate on the Whole Genome of Lotus japonicus Using Second-Generation Sequencing Analysis

    PubMed Central

    Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M.; Biswas, Bandana; Batley, Jacqueline

    2015-01-01

    Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found once every 208 kb (AS) and 202 kb (AM) with a bias mutation of G/C-to-A/T changes at low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm. PMID:25660167

  4. Mycobacterium leprae: genes, pseudogenes and genetic diversity

    PubMed Central

    Singh, Pushpendra; Cole, Stewart T

    2011-01-01

    Leprosy, which has afflicted human populations for millennia, results from infection with Mycobacterium leprae, an unculturable pathogen with an exceptionally long generation time. Considerable insight into the biology and drug resistance of the leprosy bacillus has been obtained from genomics. M. leprae has undergone reductive evolution and pseudogenes now occupy half of its genome. Comparative genomics of four different strains revealed remarkable conservation of the genome (99.995% identity) yet uncovered 215 polymorphic sites, mainly single nucleotide polymorphisms, and a handful of new pseudogenes. Mapping these polymorphisms in a large panel of strains defined 16 single nucleotide polymorphism-subtypes that showed strong geographical associations and helped retrace the evolution of M. leprae. PMID:21162636

  5. SuperDCA for genome-wide epistasis analysis.

    PubMed

    Puranen, Santeri; Pesonen, Maiju; Pensar, Johan; Xu, Ying Ying; Lees, John A; Bentley, Stephen D; Croucher, Nicholas J; Corander, Jukka

    2018-05-29

    The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10^4-10^5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10^5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level.

  6. Assembly of cucumber (Cucumis sativus L.) somaclones

    NASA Astrophysics Data System (ADS)

    Skarzyńska, Agnieszka; Kuśmirek, Wiktor; Pawełkowicz, Magdalena; Pląder, Wojciech; Nowak, Robert M.

    2017-08-01

    The development of next generation sequencing opens the possibility of using sequencing in various plant studies, such as finding structural changes and small polymorphisms between species and within them. Most analyses rely on genomic sequences and it is crucial to use well-assembled genomes of high quality and completeness. Here we compare commonly available genome assembly programs with newly developed software, dnaasm. Assemblies were tested on cucumber (Cucumis sativus L.) lines obtained by in vitro regeneration (somaclones), showing different phenotypes. The results show that the dnaasm assembler is a good tool for short read assembly, which allows obtaining genomes of high quality and completeness.

  7. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-08

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

  8. Impact of genomic polymorphisms on the repertoire of human MHC class I-associated peptides

    PubMed Central

    Granados, Diana Paola; Sriranganadane, Dev; Daouda, Tariq; Zieger, Antoine; Laumont, Céline M.; Caron-Lizotte, Olivier; Boucher, Geneviève; Hardy, Marie-Pierre; Gendron, Patrick; Côté, Caroline; Lemieux, Sébastien; Thibault, Pierre; Perreault, Claude

    2014-01-01

    For decades, the global impact of genomic polymorphisms on the repertoire of peptides presented by major histocompatibility complex (MHC) has remained a matter of speculation. Here we present a novel approach that enables high-throughput discovery of polymorphic MHC class I-associated peptides (MIPs), which play a major role in allorecognition. On the basis of comprehensive analyses of the genomic landscape of MIPs eluted from B lymphoblasts of two MHC-identical siblings, we show that 0.5% of non-synonymous single nucleotide variations are represented in the MIP repertoire. The 34 polymorphic MIPs found in our subjects are encoded by bi-allelic loci with dominant and recessive alleles. Our analyses show that, at the population level, 12% of the MIP-coding exome is polymorphic. Our method provides fundamental insights into the relationship between the genomic self and the immune self and accelerates the discovery of polymorphic MIPs (also known as minor histocompatibility antigens). PMID:24714562

  9. Complex Patterns of Local Adaptation in Teosinte

    PubMed Central

    Pyhäjärvi, Tanja; Hufford, Matthew B.; Mezmouk, Sofiane; Ross-Ibarra, Jeffrey

    2013-01-01

    Populations of widely distributed species encounter and must adapt to local environmental conditions. However, comprehensive characterization of the genetic basis of adaptation is demanding, requiring genome-wide genotype data, multiple sampled populations, and an understanding of population structure and potential selection pressures. Here, we used single-nucleotide polymorphism genotyping and data on numerous environmental variables to describe the genetic basis of local adaptation in 21 populations of teosinte, the wild ancestor of maize. We found complex hierarchical genetic structure created by altitude, dispersal events, and admixture among subspecies, which complicated identification of locally beneficial alleles. Patterns of linkage disequilibrium revealed four large putative inversion polymorphisms showing clinal patterns of frequency. Population differentiation and environmental correlations suggest that both inversions and intergenic polymorphisms are involved in local adaptation. PMID:23902747

  10. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    PubMed

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  11. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    PubMed

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.

  12. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    PubMed Central

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096

  13. Genetic structure of soil population of fungus Fusarium oxysporum Schlechtend.: Fr.: Molecular reidentification of the species and genetic differentiation of isolates using polymerase chain reaction technique with universal primers (UP-PCR)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bulat, S.A.; Mironenko, N.V.; Zholkevich, Yu.G.

    1995-07-01

    The genetic structure of three soil populations of fungus Fusarium oxysporum was analyzed using polymerase chain reaction with universal primers (UP-PCR). Distinct UP-PCR variants revealed by means of cross-dot hybridization of amplified DNA and restriction analysis of nuclear ribosomal DNA represent subspecies or sibling species of F. oxysporum. The remaining isolates of F. oxysporum showed moderate UP-PCR polymorphism characterized by numerous types, whose relatedness was analyzed by computer treatment of the UP-PCR patterns. The genetic distance trees based on the UP-PCR patterns, which were obtained with different universal primers, demonstrated similar topology. This suggests that evolutionarily important genome rearrangements correlatively occur within the entire genome. Isolates representing different UP-PCR polymorphisms were encountered in all populations, being distributed asymmetrically in two of these. In general, soil populations of F. oxysporum were represented by numerous genetically isolated groups with a similar genome structure. The genetic heterogeneity of the isolates within these groups is likely to be caused by the parasexual process. The usefulness of the UP-PCR technique for population studies of F. oxysporum was demonstrated. 39 refs., 7 figs., 2 tabs.

  14. Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II

    PubMed Central

    Norman, Paul J.; Norberg, Steven J.; Guethlein, Lisbeth A.; Nemat-Gorgani, Neda; Royce, Thomas; Wroblewski, Emily E.; Dunn, Tamsen; Mann, Tobias; Alicata, Claudia; Hollenbach, Jill A.; Chang, Weihua; Shults Won, Melissa; Gunderson, Kevin L.; Abi-Rached, Laurent; Ronaghi, Mostafa; Parham, Peter

    2017-01-01

    The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome. PMID:28360230

  15. Development and Application of Genomic Resources in an Endangered Palaeoendemic Tree, Parrotia subaequalis (Hamamelidaceae) From Eastern China

    PubMed Central

    Zhang, Yun-Yan; Shi, En; Yang, Zhao-Ping; Geng, Qi-Fang; Qiu, Ying-Xiong; Wang, Zhong-Sheng

    2018-01-01

    Parrotia subaequalis is an endangered palaeoendemic tree from disjunct montane sites in eastern China. Due to the lack of effective genomic resources, the genetic diversity and population structure of this endangered species are not clearly understood. In this study, we conducted paired-end shotgun sequencing (2 × 125 bp) of genomic DNA for two individuals of P. subaequalis on the Illumina HiSeq platform. Based on the resulting sequences, we have successfully assembled the complete chloroplast genome of P. subaequalis, as well as identified the polymorphic chloroplast microsatellites (cpSSRs), nuclear microsatellites (nSSRs) and mutational hotspots of chloroplast. Ten polymorphic cpSSR loci and 12 polymorphic nSSR loci were used to genotype 96 individuals of P. subaequalis from six populations to estimate genetic diversity and population structure. Our results revealed that P. subaequalis exhibited abundant genetic diversity (e.g., cpSSRs: Hcp = 0.862; nSSRs: HT = 0.559) and high genetic differentiation (e.g., cpSSRs: RST = 0.652; nSSRs: RST = 0.331), and was characterized by a low pollen-to-seed migration ratio (r ≈ 1.78). These genetic patterns are attributable to its long evolutionary history and low levels of contemporary inter-population gene flow by pollen and seed. In addition, the lack of an isolation-by-distance pattern and the strong population genetic structuring in both marker systems suggest that long-term isolation and/or habitat fragmentation as well as genetic drift may have also contributed to the geographic differentiation of P. subaequalis. Therefore, long-term habitat protection is the most important method to prevent further loss of genetic variation and a decrease in effective population size. Furthermore, both cpSSRs and nSSRs revealed that P. subaequalis populations consisted of three genetic clusters, which should be considered as separate conservation units. PMID:29545814
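
    The pollen-to-seed migration ratio quoted in this abstract can be approximated from the two differentiation estimates. The sketch below assumes the estimator of Ennos (1994), maternal inheritance of the chloroplast markers, and an inbreeding coefficient F_IS of zero. The abstract does not say which formula or F_IS value the authors used, so the function name and these simplifications are illustrative assumptions only.

```python
def pollen_seed_ratio(fst_nuclear, fst_maternal, f_is=0.0):
    """Ennos-style pollen-to-seed migration ratio (assumed estimator).

    fst_nuclear:  differentiation at biparentally inherited (nuclear) markers
    fst_maternal: differentiation at maternally inherited (chloroplast) markers
    f_is:         inbreeding coefficient; 0.0 is an assumption, not a reported value
    """
    a = (1.0 / fst_nuclear - 1.0) * (1.0 + f_is)
    c = 1.0 / fst_maternal - 1.0
    return (a - 2.0 * c) / c


# RST values quoted in the abstract: nSSRs 0.331, cpSSRs 0.652
ratio = pollen_seed_ratio(0.331, 0.652)
print(round(ratio, 2))
```

    With these inputs the result falls close to the reported r ≈ 1.78; the small difference may come from rounding of the published RST values or from a non-zero F_IS.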

  16. Polymorphic microsatellite loci for the sand pocket mouse Chaetodipus arenarius, an endemic from the Baja California Peninsula

    USGS Publications Warehouse

    Munguia-Vega, A.; Rodriguez-Estrella, R.; Nachman, M.; Culver, M.

    2009-01-01

    Fifteen polymorphic microsatellite loci were isolated from an enriched genomic library of the sand pocket mouse Chaetodipus arenarius. The mean number of alleles per locus was 11.53 (range five to 19) and the average observed heterozygosity was 0.764 (range 0.121 to 1.0). The markers will be used for detecting the impact of human-induced habitat fragmentation on patterns of gene flow, genetic structure, and extinction risk. In addition, these markers will be useful across the genus because most of the loci cross-amplified and were polymorphic in three other species of Chaetodipus. © 2008 The Authors.

  17. Failure of replicating the association between hippocampal volume and 3 single-nucleotide polymorphisms identified from the European genome-wide association study in Asian populations.

    PubMed

    Li, Ming; Ohi, Kazutaka; Chen, Chunhui; He, Qinghua; Liu, Jie-Wei; Chen, Chuansheng; Luo, Xiong-Jian; Dong, Qi; Hashimoto, Ryota; Su, Bing

    2014-12-01

    The hippocampus is a key brain structure for learning ability and memory processes, and hippocampal atrophy is a recognized biological marker of Alzheimer's disease. However, the genetic bases of hippocampal volume are still unclear although it is a heritable trait. Genome-wide association studies (GWASs) on hippocampal volume have implicated several significantly associated genetic variants in Europeans. Here, to test the contributions of these GWAS-identified genetic variants to hippocampal volume in different ethnic populations, we screened the GWAS-identified candidate single-nucleotide polymorphisms in 3 independent healthy Asian brain imaging samples (a total of 990 subjects). The results showed that none of these single-nucleotide polymorphisms were associated with hippocampal volume in either individual or combined Asian samples. The replication results suggested a complexity of genetic architecture for hippocampal volume and potential genetic heterogeneity between different ethnic populations. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    NASA Astrophysics Data System (ADS)

    Shi, Jinming

    In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult leaves of rice after seed space flight were detected by the methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after a 4-day space flight on China's Shenzhou-6 spaceship. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutation, were used for AFLP and MSAP analysis. Polymorphism of both DNA sequence and cytosine methylation was detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants are 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies are 1.4%, 2.9% and 8%, respectively. In total, 27 and 22 polymorphic fragments were cloned and sequenced from the MSAP and AFLP analyses, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequence. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA and cytosine methylation analyses were different.

  19. G23D: Online tool for mapping and visualization of genomic variants on 3D protein structures.

    PubMed

    Solomon, Oz; Kunik, Vered; Simon, Amos; Kol, Nitzan; Barel, Ortal; Lev, Atar; Amariglio, Ninette; Somech, Raz; Rechavi, Gidi; Eyal, Eran

    2016-08-26

    Evaluation of the possible implications of genomic variants is an increasingly important task in the current high throughput sequencing era. Structural information however is still not routinely exploited during this evaluation process. The main reasons can be attributed to the partial structural coverage of the human proteome and the lack of tools which conveniently convert genomic positions, which are the frequent output of genomic pipelines, to proteins and structure coordinates. We present G23D, a tool for conversion of human genomic coordinates to protein coordinates and protein structures. G23D allows mapping of genomic positions/variants on evolutionary related (and not only identical) protein three dimensional (3D) structures as well as on theoretical models. By doing so it significantly extends the space of variants for which structural insight is feasible. To facilitate interpretation of the variant consequence, pathogenic variants, functional sites and polymorphism sites are displayed on protein sequence and structure diagrams alongside the input variants. G23D also provides modeling of the mutant structure, analysis of intra-protein contacts and instant access to functional predictions and predictions of thermo-stability changes. G23D is available at http://www.sheba-cancer.org.il/G23D . G23D extends the fraction of variants for which structural analysis is applicable and provides better and faster accessibility for structural data to biologists and geneticists who routinely work with genomic information.

  20. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    PubMed

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  1. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    PubMed Central

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  2. Intra-isolate genome variation in arbuscular mycorrhizal fungi persists in the transcriptome.

    PubMed

    Boon, E; Zimmerman, E; Lang, B F; Hijri, M

    2010-07-01

    Arbuscular mycorrhizal fungi (AMF) are heterokaryotes with an unusual genetic makeup. Substantial genetic variation occurs among nuclei within a single mycelium or isolate. AMF reproduce through spores that contain varying fractions of this heterogeneous population of nuclei. It is not clear whether this genetic variation on the genome level actually contributes to the AMF phenotype. To investigate the extent to which polymorphisms in nuclear genes are transcribed, we analysed the intra-isolate genomic and cDNA sequence variation of two genes, the large subunit ribosomal RNA (LSU rDNA) of Glomus sp. DAOM-197198 (previously known as G. intraradices) and the POL1-like sequence (PLS) of Glomus etunicatum. For both genes, we find high sequence variation at the genome and transcriptome level. Reconstruction of LSU rDNA secondary structure shows that all variants are functional. Patterns of PLS sequence polymorphism indicate that there is one functional gene copy, PLS2, which is preferentially transcribed, and one gene copy, PLS1, which is a pseudogene. This is the first study that investigates AMF intra-isolate variation at the transcriptome level. In conclusion, it is possible that, in AMF, multiple nuclear genomes contribute to a single phenotype.

  3. Population and allelic variation of A-to-I RNA editing in human transcriptomes.

    PubMed

    Park, Eddie; Guo, Jiguang; Shen, Shihao; Demirdjian, Levon; Wu, Ying Nian; Lin, Lan; Xing, Yi

    2017-07-28

    A-to-I RNA editing is an important step in RNA processing in which specific adenosines in some RNA molecules are post-transcriptionally modified to inosines. RNA editing has emerged as a widespread mechanism for generating transcriptome diversity. However, there remain significant knowledge gaps about the variation and function of RNA editing. In order to determine the influence of genetic variation on A-to-I RNA editing, we integrate genomic and transcriptomic data from 445 human lymphoblastoid cell lines by combining an RNA editing QTL (edQTL) analysis with an allele-specific RNA editing (ASED) analysis. We identify 1054 RNA editing events associated with cis genetic polymorphisms. Additionally, we find that a subset of these polymorphisms is linked to genome-wide association study signals of complex traits or diseases. Finally, compared to random cis polymorphisms, polymorphisms associated with RNA editing variation are located closer spatially to their respective editing sites and have a more pronounced impact on RNA secondary structure. Our study reveals widespread cis variation in RNA editing among genetically distinct individuals and sheds light on possible phenotypic consequences of such variation on complex traits and diseases.

  4. Restriction fragment length polymorphisms in dairy and beef cattle at the growth hormone and prolactin loci.

    PubMed

    Hallerman, E M; Nave, A; Kashi, Y; Holzer, Z; Soller, M; Beckmann, J S

    1987-01-01

    Two bovine populations, a Holstein-Friesian dairy stock and a synthetic (Baladi X Hereford X Simmental X Charolais) beef stock, were screened for restriction fragment length polymorphisms (RFLPs) at the growth hormone and prolactin genes. Most RFLPs at the growth hormone gene are apparently the consequence of an insertion/deletion event which was localized to a region downstream of the structural gene. The restriction map for the genomic region including the growth hormone gene was extended. Two HindIII RFLPs at the growth hormone locus, as well as several RFLPs at the prolactin gene, seemed to be the consequence of a series of point mutations. The results are discussed in terms of the possibility that minor genomic variability underlies quantitative genetic variation.

  5. Genetic and epigenetic alterations induced by different levels of rye genome integration in wheat recipient.

    PubMed

    Zheng, X L; Zhou, J P; Zang, L L; Tang, A T; Liu, D Q; Deng, K J; Zhang, Y

    2016-06-17

    The narrow genetic variation present in common wheat (Triticum aestivum) varieties has greatly restricted the improvement of crop yield in modern breeding systems. Alien addition lines have proven to be an effective means to broaden the genetic diversity of common wheat. Wheat-rye addition lines, which are the direct bridge materials for wheat improvement, have been widely used to produce new wheat cultivars carrying alien rye germplasm. In this study, we investigated the genetic and epigenetic alterations in two sets of wheat-rye disomic addition lines (1R-7R) and the corresponding triticales. We used expressed sequence tag-simple sequence repeat, amplified fragment length polymorphism, and methylation-sensitive amplification polymorphism analyses to analyze the effects of the introduction of alien chromosomes (either the entire genome or sub-genome) to wheat genetic background. We found obvious and diversiform variations in the genomic primary structure, as well as alterations in the extent and pattern of the genomic DNA methylation of the recipient. Meanwhile, these results also showed that introduction of different rye chromosomes could induce different genetic and epigenetic alterations in its recipient, and the genetic background of the parents is an important factor for genomic and epigenetic variation induced by alien chromosome addition.

  6. Investigation of inversion polymorphisms in the human genome using principal components analysis.

    PubMed

    Ma, Jianzhong; Amos, Christopher I

    2012-01-01

    Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.
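
    The idea described in this abstract, that suppressed recombination makes an inversion region behave like a two-population admixture visible on a local PCA, can be illustrated with a small simulation. The sketch below is not the authors' implementation: the simulation parameters, the function names, and the simple one-dimensional k-means used to call three genotype clusters are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def simulate_region(n_individuals=300, n_snps=200, inv_freq=0.4, seed=0):
    """Simulate unphased 0/1/2 genotypes inside a hypothetical inversion region."""
    rng = np.random.default_rng(seed)
    # The two orientations carry different allele frequencies because
    # recombination between them is assumed to be suppressed.
    freq_std = rng.uniform(0.05, 0.95, n_snps)
    freq_inv = rng.uniform(0.05, 0.95, n_snps)
    # Each individual carries two chromosomes; True marks the inverted orientation.
    orient = rng.random((n_individuals, 2)) < inv_freq
    inv_genotype = orient.sum(axis=1)  # 0, 1 or 2 inverted copies
    freqs = np.where(orient[:, :, None], freq_inv, freq_std)
    alleles = rng.random((n_individuals, 2, n_snps)) < freqs
    return alleles.sum(axis=1).astype(float), inv_genotype


def local_pca_pc1(genotypes):
    """Return PC1 scores of an (individuals x SNPs) genotype matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]


def genotype_from_pc1(pc1, n_iter=20):
    """Assign three clusters along PC1 with a basic 1-D k-means."""
    centers = np.array([pc1.min(), (pc1.min() + pc1.max()) / 2, pc1.max()])
    for _ in range(n_iter):
        labels = np.abs(pc1[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(3):
            if np.any(labels == k):
                centers[k] = pc1[labels == k].mean()
    return labels


genotypes, true_inv = simulate_region()
labels = genotype_from_pc1(local_pca_pc1(genotypes))
# The sign of PC1 is arbitrary, so the cluster order may be reversed.
accuracy = max(np.mean(labels == true_inv), np.mean((2 - labels) == true_inv))
print(f"agreement with simulated inversion genotypes: {accuracy:.2f}")
```

    In real data the clusters are rarely this clean; the abstract notes that calls were validated against published results and Mendelian consistency, which a sketch like this does not attempt.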

  7. [The human variome project and its progress].

    PubMed

    Gao, Shan; Zhang, Ning; Zhang, Lei; Duan, Guang-You; Zhang, Tao

    2010-11-01

    The main goal of post-genomics is to explain how the genome, the map of which was constructed in the Human Genome Project, affects the activities of life. This has given rise to multiple "omics": structural genomics, functional genomics, proteomics, metabonomics, and others. In June 2006 in Melbourne, Australia, the Human Genome Variation Society (HGVS) initiated the Human Variome Project (HVP) to collect all sequence variation and polymorphism data worldwide. The HVP aims to identify mutations related to human diseases through genome-scale association studies between genotype and phenotype and through other methods. Those results will be translated into clinical application. Considering the potential effects of this project on human health, this paper introduces its origin and main content in detail and discusses its significance and prospects.

  8. The amphioxus genome and the evolution of the chordate karyotype

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Putnam, Nicholas H.; Butts, Thomas; Ferrier, David E.K.

    2008-04-01

    Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage with a fossil record dating back to the Cambrian. We describe the structure and gene content of the highly polymorphic ~520 million base pair genome of the Florida lancelet Branchiostoma floridae, and analyze it in the context of chordate evolution. Whole genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets, and vertebrates), and allow reconstruction of not only the gene complement of the last common chordate ancestor, but also a partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution.

  9. DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

    PubMed

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.

  10. DNApod: DNA polymorphism annotation database from next-generation sequence read archives

    PubMed Central

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924

  11. Integration of genome and phenotypic scanning gives evidence of genetic structure in Mesoamerican common bean (Phaseolus vulgaris L.) landraces from the southwest of Europe.

    PubMed

    Santalla, M; De Ron, A M; De La Fuente, M

    2010-05-01

    Southwestern Europe has been considered as a secondary centre of genetic diversity for the common bean. The dispersal of domesticated materials from their centres of origin provides an experimental system that reveals how human selection during cultivation and adaptation to novel environments affects the genetic composition. In this paper, our goal was to elucidate how distinct events could modify the structure and level of genetic diversity in the common bean. The genome-wide genetic composition was analysed at 42 microsatellite loci in individuals of 22 landraces of domesticated common bean from the Mesoamerican gene pool. The accessions were also characterised for phaseolin seed protein and for nine allozyme polymorphisms and phenotypic traits. One of this study's important findings was the complementary information obtained from all the polymorphisms examined. Most of the markers found to be potentially under the influence of selection were located in the proximity of previously mapped genes and quantitative trait loci (QTLs) related to important agronomic traits, which indicates that population genomics approaches are very efficient in detecting QTLs. As it was revealed by outlier simple sequence repeats, loci analysis with STRUCTURE software and multivariate analysis of phenotypic data, the landraces were grouped into three clusters according to seed size and shape, vegetative growth habit and genetic resistance. A total of 151 alleles were detected with an average of 4 alleles per locus and an average polymorphism information content of 0.31. Using a model-based approach, on the basis of neutral markers implemented in the software STRUCTURE, three clusters were inferred, which were in good agreement with multivariate analysis. Geographic and genetic distances were congruent with the exception of a few putative hybrids identified in this study, suggesting a predominant effect of isolation by distance. 
    Genomic scans using both markers linked to genes affected by selection (outlier) and neutral markers showed advantages relative to other approaches, since they help to create a more complete picture of how adaptation to environmental conditions has sculpted the common bean genomes in southern Europe. The use of outlier loci also gives a clue about what selective forces gave rise to the actual phenotypes of the analysed landraces.

  12. Single nucleotide polymorphisms in common bean: their discovery and genotyping using a multiplex detection system

    USDA-ARS's Scientific Manuscript database

    Single-nucleotide Polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean comparing sequences from coding and non-coding regions obtained from Genbank and genomic DNA and to compare sequencing resu...

  13. e23D: database and visualization of A-to-I RNA editing sites mapped to 3D protein structures.

    PubMed

    Solomon, Oz; Eyal, Eran; Amariglio, Ninette; Unger, Ron; Rechavi, Gidi

    2016-07-15

    e23D, a database of A-to-I RNA editing sites from human, mouse and fly mapped to evolutionarily related protein 3D structures, is presented. Genomic coordinates of A-to-I RNA editing sites are converted to protein coordinates and mapped onto 3D structures from PDB or theoretical models from ModBase. e23D allows visualization of the protein structure, modeling of recoding events and orientation of the editing with respect to nearby genomic functional sites from databases of disease causing mutations and genomic polymorphism. The database is available at http://www.sheba-cancer.org.il/e23D (contact: oz.solomon@live.biu.ac.il or Eran.Eyal@sheba.health.gov.il). © The Author 2016. Published by Oxford University Press.

  14. SINEs, evolution and genome structure in the opossum.

    PubMed

    Gu, Wanjun; Ray, David A; Walker, Jerilyn A; Barnes, Erin W; Gentles, Andrew J; Samollow, Paul B; Jurka, Jerzy; Batzer, Mark A; Pollock, David D

    2007-07-01

    Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.

  15. Species-specific markers for the differential diagnosis of Trypanosoma cruzi and Trypanosoma rangeli and polymorphisms detection in Trypanosoma rangeli.

    PubMed

    Ferreira, Keila Adriana Magalhães; Fajardo, Emanuella Francisco; Baptista, Rodrigo P; Macedo, Andrea Mara; Lages-Silva, Eliane; Ramírez, Luis Eduardo; Pedrosa, André Luiz

    2014-06-01

    Trypanosoma cruzi and Trypanosoma rangeli are kinetoplastid parasites which are able to infect humans in Central and South America. Misdiagnosis between these trypanosomes can be avoided by targeting barcoding sequences or genes of each organism. This work aims to analyze the feasibility of using species-specific markers for identification of intraspecific polymorphisms and as targets for diagnostic methods by PCR. Accordingly, primers which are able to specifically detect T. cruzi or T. rangeli genomic DNA were characterized. The use of intergenic regions, generally divergent in the trypanosomatids, and the serine carboxypeptidase gene were successful. Using T. rangeli genomic sequences for the identification of group-specific polymorphisms and a polymorphic AT(n) dinucleotide repeat permitted the classification of the strains into two groups, which are entirely coincident with T. rangeli main lineages, KP1 (+) and KP1 (-), previously determined by kinetoplast DNA (kDNA) characterization. The sequences analyzed total 622 bp (382 bp represent a hypothetical protein sequence, and 240 bp represent an anonymous sequence), and of these, 581 bp (93.3%) are conserved sites and 41 bp (6.7%) are polymorphic, with 9 transitions (21.9%), 2 transversions (4.9%), and 30 (73.2%) insertion/deletion events. Taken together, the species-specific markers analyzed may be useful for the development of new strategies for the accurate diagnosis of infections. Furthermore, the identification of T. rangeli polymorphisms has a direct impact on the understanding of the population structure of this parasite.

  16. Genome sequencing of disease and carriage isolates of nontypeable Haemophilus influenzae identifies discrete population structure

    PubMed Central

    De Chiara, Matteo; Hood, Derek; Muzzi, Alessandro; Pickard, Derek J.; Perkins, Tim; Pizza, Mariagrazia; Dougan, Gordon; Rappuoli, Rino; Moxon, E. Richard; Soriani, Marco; Donati, Claudio

    2014-01-01

    One of the main hurdles for the development of an effective and broadly protective vaccine against nonencapsulated isolates of Haemophilus influenzae (NTHi) lies in the genetic diversity of the species, which makes the identification of cross-protective candidate antigens extremely difficult. To assess whether a population structure of NTHi could be defined, we performed genome sequencing of a collection of diverse clinical isolates representative of both carriage and disease and of the diversity of the natural population. Analysis of the distribution of polymorphic sites in the core genome and of the composition of the accessory genome defined distinct evolutionary clades and supported a predominantly clonal evolution of NTHi, with the majority of genetic information transmitted vertically within lineages. A correlation between the population structure and the presence of selected surface-associated proteins and lipooligosaccharide structure, known to contribute to virulence, was found. This high-resolution, genome-based population structure of NTHi provides the foundation for a better understanding of NTHi adaptation to the host, as well as its commensal and virulence behavior, that could facilitate intervention strategies against disease caused by this important human pathogen. PMID:24706866

  17. Genome sequencing of disease and carriage isolates of nontypeable Haemophilus influenzae identifies discrete population structure.

    PubMed

    De Chiara, Matteo; Hood, Derek; Muzzi, Alessandro; Pickard, Derek J; Perkins, Tim; Pizza, Mariagrazia; Dougan, Gordon; Rappuoli, Rino; Moxon, E Richard; Soriani, Marco; Donati, Claudio

    2014-04-08

    One of the main hurdles for the development of an effective and broadly protective vaccine against nonencapsulated isolates of Haemophilus influenzae (NTHi) lies in the genetic diversity of the species, which makes the identification of cross-protective candidate antigens extremely difficult. To assess whether a population structure of NTHi could be defined, we performed genome sequencing of a collection of diverse clinical isolates representative of both carriage and disease and of the diversity of the natural population. Analysis of the distribution of polymorphic sites in the core genome and of the composition of the accessory genome defined distinct evolutionary clades and supported a predominantly clonal evolution of NTHi, with the majority of genetic information transmitted vertically within lineages. A correlation between the population structure and the presence of selected surface-associated proteins and lipooligosaccharide structure, known to contribute to virulence, was found. This high-resolution, genome-based population structure of NTHi provides the foundation for a better understanding of NTHi adaptation to the host, as well as its commensal and virulence behavior, that could facilitate intervention strategies against disease caused by this important human pathogen.

  18. Structural and functional impacts of copy number variations on the cattle genome

    USDA-ARS's Scientific Manuscript database

    Although there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs), similar realizations for larger, more complex forms of genetic variation have just emerged. Several recent publications reveal that copy number variations (CNVs) are common an...

  19. Development of Reproducible EST-derived SSR Markers and Assessment of Genetic Diversity in Panax ginseng Cultivars and Related Species

    PubMed Central

    Choi, Hong-Il; Kim, Nam Hoon; Kim, Jun Ha; Choi, Beom Soon; Ahn, In-Ok; Lee, Joon-Soo; Yang, Tae-Jin

    2011-01-01

    Little is known about the genetics or genomics of Panax ginseng. In this study, we developed 70 expressed sequence tag-derived polymorphic simple sequence repeat markers by trials of 140 primer pairs. All of the 70 markers showed reproducible polymorphism among four Panax species, and 19 of them were polymorphic in six P. ginseng cultivars. These markers segregated in a 1:2:1 Mendelian manner in an F2 population of a cross between two P. ginseng cultivars, ‘Yunpoong’ and ‘Chunpoong’, indicating that these are reproducible, heritable and mappable markers. A phylogenetic analysis using the genotype data showed three distinctive groups: a P. ginseng-P. japonicus clade, P. notoginseng and P. quinquefolius, with similarity coefficients of 0.70. P. japonicus was intermingled with P. ginseng cultivars, indicating that both species have similar genetic backgrounds. P. ginseng cultivars were subdivided into three minor groups: an independent cultivar ‘Chunpoong’; a subgroup with three accessions including two cultivars, ‘Gumpoong’ and ‘Yunpoong’, and one landrace ‘Hwangsook’; and another subgroup with two accessions including one cultivar, ‘Gopoong’, and one landrace ‘Jakyung’. Each primer pair produced 1 to 4 bands, indicating that the ginseng genome has a highly replicated paleopolyploid genome structure. PMID:23717085

  20. Human Xq28 inversion polymorphism: From sex linkage to Genomics--A genetic mother lode.

    PubMed

    Kirby, Cait S; Kolber, Natalie; Salih Almohaidi, Asmaa M; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to alterations in genome structure (inversions) through nonallelic homologous recombination. The near identity of the inverted repeats is an example of concerted evolution through gene conversion. While the laboratory in its entirety is designed for college level genetics courses, portions of the laboratory are appropriate for courses at other levels. Because the polymorphism is on the X-chromosome, the laboratory can be used in introductory biology courses to enhance understanding of sex-linkage and to test for Hardy-Weinberg equilibrium in females. More advanced topics, such as chromosome interference, the molecular model for recombination, and inversion heterozygosity suppression of recombination can be explored in upper-level genetics and evolution courses. DNA isolation, restriction digests, ligation, long PCR, and iPCR provide experience with techniques in molecular biology. This investigative laboratory weaves together topics stretching from molecular genetics to cytogenetics and sex-linkage, population genetics and evolutionary genetics. © 2016 The International Union of Biochemistry and Molecular Biology.
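    The Hardy-Weinberg test in females mentioned above can be illustrated with a minimal sketch. The abstract does not say which test the laboratory uses; a standard chi-square goodness-of-fit test is assumed here, the function name and the genotype counts are hypothetical, and allele frequencies are estimated from the female genotypes alone (males are hemizygous for X-linked loci and are left out).

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for Hardy-Weinberg
    proportions, given female genotype counts at a biallelic X-linked
    locus, e.g. standard/standard, standard/inverted, inverted/inverted."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequencies estimated from the observed genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Sum (O - E)^2 / E over genotype classes with nonzero expectation.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts for illustration only; not data from the study.
statistic = hwe_chi_square(30, 50, 20)
print(f"chi-square = {statistic:.3f}")
```

    With one degree of freedom, a statistic above 3.841 would reject Hardy-Weinberg proportions at the 5% level.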

  1. Identification and Evaluation of Single-Nucleotide Polymorphisms in Allotetraploid Peanut (Arachis hypogaea L.) Based on Amplicon Sequencing Combined with High Resolution Melting (HRM) Analysis.

    PubMed

    Hong, Yanbin; Pandey, Manish K; Liu, Ying; Chen, Xiaoping; Liu, Hong; Varshney, Rajeev K; Liang, Xuanqiang; Huang, Shangzhi

    2015-01-01

    The cultivated peanut (Arachis hypogaea L.) is an allotetraploid (AABB) species derived from the A-genome (Arachis duranensis) and B-genome (Arachis ipaensis) progenitors. The presence of two versions of a DNA sequence based on the two progenitor genomes poses a serious technical and analytical problem during single nucleotide polymorphism (SNP) marker identification and analysis. In this context, we have analyzed 200 amplicons derived from expressed sequence tags (ESTs) and genome survey sequences (GSS) to identify SNPs in a panel of genotypes consisting of 12 cultivated peanut varieties and two diploid progenitors representing the ancestral genomes. A total of 18 EST-SNPs and 44 genomic-SNPs were identified in 12 peanut varieties by aligning the sequence of A. hypogaea with diploid progenitors. The average frequency of sequence polymorphism was higher for genomic-SNPs than for EST-SNPs, with one genomic-SNP every 1011 bp as compared to one EST-SNP every 2557 bp. In order to estimate the potential and further applicability of these identified SNPs, 96 peanut varieties were genotyped using the high resolution melting (HRM) method. Polymorphism information content (PIC) values for EST-SNPs ranged between 0.021 and 0.413 with a mean of 0.172 in the set of peanut varieties, while genomic-SNPs ranged between 0.080 and 0.478 with a mean of 0.249. A total of 33 SNPs were used for polymorphism detection among the parents and 10 selected lines from mapping population Y13Zh (Zhenzhuhei × Yueyou13). Of these 33 SNPs, nine showed polymorphism in the mapping population Y13Zh, and seven were successfully mapped into five linkage groups. Our results showed that SNPs can be identified in allotetraploid peanut with high accuracy through amplicon sequencing and HRM assay. The identified SNPs were very informative and can be used for different genetic and breeding applications in peanut.

  2. An unusual haplotype structure on human chromosome 8p23 derived from the inversion polymorphism.

    PubMed

    Deng, Libin; Zhang, Yuezheng; Kang, Jian; Liu, Tao; Zhao, Hongbin; Gao, Yang; Li, Chaohua; Pan, Hao; Tang, Xiaoli; Wang, Dunmei; Niu, Tianhua; Yang, Huanming; Zeng, Changqing

    2008-10-01

    Chromosomal inversion is an important type of genomic variation involved in both evolution and disease pathogenesis. Here, we describe the refined genetic structure of a 3.8-Mb inversion polymorphism at chromosome 8p23. Using HapMap data of 1,073 SNPs generated from 209 unrelated samples from CEPH-Utah residents with ancestry from northern and western Europe (CEU); Yoruba in Ibadan, Nigeria (YRI); and Asian (ASN) samples, which comprised Han Chinese from Beijing, China (CHB) and Japanese from Tokyo, Japan (JPT), we successfully deduced the inversion orientations of all their 418 haplotypes. In particular, distinct haplotype subgroups were identified based on principal component analysis (PCA). Such genetic substructures were consistent with clustering patterns based on neighbor-joining tree reconstruction, which revealed a total of four haplotype clades across all samples. Metaphase fluorescence in situ hybridization (FISH) in a subset of 10 HapMap samples verified their inversion orientations predicted by PCA or phylogenetic tree reconstruction. Positioning of the outgroup haplotype within one of the YRI clades suggested that Human NCBI Build 36-inverted order is most likely the ancestral orientation. Furthermore, the population differentiation test and the relative extended haplotype homozygosity (REHH) analysis in this region discovered multiple selection signals, also in a population-specific manner. A positive selection signal was detected at XKR6 in the ASN population. These results revealed the correlation of inversion polymorphisms to population-specific genetic structures, and various selection patterns as possible mechanisms for the maintenance of a large chromosomal rearrangement at the 8p23 region during evolution. In addition, our study also showed that haplotype-based clustering methods, such as PCA, can be applied in scanning for cryptic inversion polymorphisms at a genome-wide scale.

  3. Genetic diversity revealed by single nucleotide polymorphism markers in a worldwide germplasm collection of durum wheat.

    PubMed

    Ren, Jing; Sun, Daokun; Chen, Liang; You, Frank M; Wang, Jirui; Peng, Yunliang; Nevo, Eviatar; Sun, Dongfa; Luo, Ming-Cheng; Peng, Junhua

    2013-03-28

    Evaluation of genetic diversity and genetic structure in crops has important implications for plant breeding programs and the conservation of genetic resources. Newly developed single nucleotide polymorphism (SNP) markers are effective in detecting genetic diversity. In the present study, a worldwide durum wheat collection consisting of 150 accessions was used. Genetic diversity and genetic structure were investigated using 946 polymorphic SNP markers covering the whole genome of tetraploid wheat. Genetic structure was greatly impacted by multiple factors, such as environmental conditions, breeding methods reflected by release periods of varieties, and gene flows via human activities. A loss of genetic diversity was observed from landraces and old cultivars to the modern cultivars released during periods of the Early Green Revolution, but an increase in cultivars released during the Post Green Revolution. Furthermore, a comparative analysis of genetic diversity among the 10 mega ecogeographical regions indicated that South America, North America, and Europe possessed the richest genetic variability, while the Middle East showed moderate levels of genetic diversity.

  4. Divergent population structure and climate associations of a chromosomal inversion polymorphism across the Mimulus guttatus species complex

    PubMed Central

    Oneal, Elen; Lowry, David B.; Wright, Kevin M.; Zhu, Zhirui; Willis, John H.

    2014-01-01

    Chromosomal rearrangement polymorphisms are common and increasingly found to be associated with adaptive ecological divergence and speciation. Rearrangements, such as inversions, reduce recombination in heterozygous individuals and thus can protect favourable allelic combinations at linked loci, facilitating their spread in the presence of gene flow. Recently, we identified a chromosomal inversion polymorphism that contributes to ecological adaptation and reproductive isolation between annual and perennial ecotypes of the yellow monkeyflower, Mimulus guttatus. Here we evaluate the population genetic structure of this inverted region in comparison with the collinear regions of the genome across the M. guttatus species complex. We tested whether annual and perennial M. guttatus exhibit different patterns of divergence for loci in the inverted and noninverted regions of the genome. We then evaluated whether there are contrasting climate associations with these genomic regions through redundancy analysis. We found that the inversion exhibits broadly different patterns of divergence among annual and perennial M. guttatus and is associated with environmental variation across population accessions. This study is the first widespread population genetic survey of the diversity of the M. guttatus species complex. Our findings contribute to a greater understanding of morphological, ecological, and genetic evolutionary divergence across this highly diverse group of closely related ecotypes and species. Finally, understanding species relationships among M. guttatus sp. has hitherto been stymied by accumulated evidence of substantial gene flow among populations as well as designated species. Nevertheless, our results shed light on these relationships and provide insight into adaptation in life history traits within the complex. PMID:24796267

  5. Gene amplification of the Hps locus in Glycine max

    PubMed Central

    Gijzen, Mark; Kuflu, Kuflom; Moy, Pat

    2006-01-01

    Background: Hydrophobic protein from soybean (HPS) is an 8 kD cysteine-rich polypeptide that causes asthma in persons allergic to soybean dust. HPS is synthesized in the pod endocarp and deposited on the seed surface during development. Past evidence suggests that the protein may mediate the adherence or dehiscence of endocarp tissues during maturation and affect the lustre, or glossiness of the seed surface. Results: A comparison of soybean germplasm by genomic DNA blot hybridization shows that the copy number and structure of the Hps locus is polymorphic among soybean cultivars and related species. Changes in Hps gene copy number were also detected by comparative genomic DNA hybridization using cDNA microarrays. The Hps copy number polymorphisms co-segregated with seed lustre phenotype and HPS surface protein in a cross between dull- and shiny-seeded soybeans. In soybean cultivar Harosoy 63, a minimum of 27 ± 5 copies of the Hps gene were estimated to be present in each haploid genome. The isolation and analysis of genomic clones indicates that the core Hps locus is comprised of a tandem array of reiterated units, with each 8.6 kb unit containing a single HPS open reading frame. Conclusion: This study shows that polymorphisms at the Hps locus arise from changes in the gene copy number via gene amplification. We present a model whereby Hps copy number modulates protein expression levels and seed lustre, and we suggest that gene amplification may result from selection pressures imposed on crop plants. PMID:16536872

  6. Initial sequencing and comparative analysis of the mouse genome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  7. Deep whole-genome sequencing of 90 Han Chinese genomes.

    PubMed

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. 
    Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects. © The Authors 2017. Published by Oxford University Press.

  8. Evidence for adaptation from standing genetic variation on an antimicrobial peptide gene in the mussel Mytilus edulis.

    PubMed

    Gosset, Célia C; Do Nascimento, Joana; Augé, Marie-Thérèse; Bierne, Nicolas

    2014-06-01

    Genome scans of population differentiation identify candidate loci for adaptation but provide little information on how selection has influenced the genetic structure of these loci. Following a genome scan, we investigated the nature of the selection responsible for the outlying differentiation observed between populations of the marine mussel Mytilus edulis at a leucine/arginine polymorphism (L31R) in the antimicrobial peptide MGD2. We analysed DNA sequence polymorphisms, allele frequencies and population differentiation of polymorphisms closely linked to L31R, and pairwise and third-order linkage disequilibria. An outlying level of population differentiation was observed at L31R only, while no departure from panmixia was observed at linked loci surrounding L31R, as in most of the genome. Selection therefore seems to affect L31R directly. Three hypotheses can explain the lack of differentiation in the chromosomal region close to L31R: (i) hitchhiking has occurred but migration and recombination subsequently erased the signal, (ii) selection was weak enough and recombination strong enough to limit the hitchhiking effect to a very small chromosomal region or (iii) selection acted on a pre-existing polymorphism (i.e. standing variation) at linkage equilibrium with its background. Linkage equilibrium was observed between L31R and linked polymorphisms in every population analysed, as expected under the three hypotheses. However, linkage disequilibrium was observed in some populations between pairs of loci located upstream and downstream to L31R, generating a complex pattern of third-order linkage disequilibria which is best explained by the hypothesis of selection on a pre-existing polymorphism. We hypothesise that selection could be either balanced, maintaining alleles at different frequencies depending on the pathogen community encountered locally by mussels, or intermittent, resulting in sporadic fluctuations in allele frequency. © 2014 John Wiley & Sons Ltd.

  9. Genomic variation and DNA repair associated with soybean transgenesis: a comparison to cultivars and mutagenized plants.

    PubMed

    Anderson, Justin E; Michno, Jean-Michel; Kono, Thomas J Y; Stec, Adrian O; Campbell, Benjamin W; Curtin, Shaun J; Stupar, Robert M

    2016-05-12

    The safety of mutagenized and genetically transformed plants remains a subject of scrutiny. Data gathered and communicated on the phenotypic and molecular variation induced by gene transfer technologies will provide a scientific-based means to rationally address such concerns. In this study, genomic structural variation (e.g. large deletions and duplications) and single nucleotide polymorphism rates were assessed among a sample of soybean cultivars, fast neutron-derived mutants, and five genetically transformed plants developed through Agrobacterium based transformation methods. On average, the number of genes affected by structural variations in transgenic plants was one order of magnitude less than that of fast neutron mutants and two orders of magnitude less than the rates observed between cultivars. Structural variants in transgenic plants, while rare, occurred adjacent to the transgenes, and at unlinked loci on different chromosomes. DNA repair junctions at both transgenic and unlinked sites were consistent with sequence microhomology across breakpoints. The single nucleotide substitution rates were modest in both fast neutron and transformed plants, exhibiting fewer than 100 substitutions genome-wide, while inter-cultivar comparisons identified over one-million single nucleotide polymorphisms. Overall, these patterns provide a fresh perspective on the genomic variation associated with high-energy induced mutagenesis and genetically transformed plants. The genetic transformation process infrequently results in novel genetic variation and these rare events are analogous to genetic variants occurring spontaneously, already present in the existing germplasm, or induced through other types of mutagenesis. It remains unclear how broadly these results can be applied to other crops or transformation methods.

  10. Exploring a Nonmodel Teleost Genome Through RAD Sequencing-Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis.

    PubMed

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C; Tsigenopoulos, Costas S

    2015-12-29

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. Copyright © 2016 Manousaki et al.

  11. Levels of taurine introgression in the current Brazilian Nelore and Gir indicine cattle populations

    USDA-ARS's Scientific Manuscript database

    A high-density panel of more than 777,000 genome-wide single nucleotide polymorphisms (SNPs) was used to investigate the population structure of Nelore and Gir, compared to seven other populations worldwide. Principal Component Analysis and model-based ancestry estimation clearly separate the indici...

  12. Effects of As2O3 on DNA methylation, genomic instability, and LTR retrotransposon polymorphism in Zea mays.

    PubMed

    Erturk, Filiz Aygun; Aydin, Murat; Sigmaz, Burcu; Taspinar, M Sinan; Arslan, Esra; Agar, Guleray; Yagci, Semra

    2015-12-01

    Arsenic is a well-known toxic substance for living organisms. However, limited efforts have been made to study its effects on DNA methylation, genomic instability, and long terminal repeat (LTR) retrotransposon polymorphism in different crops. In the present study, the effects of As2O3 (arsenic trioxide) on LTR retrotransposon polymorphism and DNA methylation, as well as DNA damage, in Zea mays seedlings were investigated. The results showed that all arsenic doses decreased genomic template stability (GTS) and increased changes in Random Amplified Polymorphic DNA (RAPD) profiles (DNA damage). In addition, increases in DNA methylation and LTR retrotransposon polymorphism were found, suggesting a model to explain the epigenetic changes in gene expression. The results of this experiment clearly show that arsenic has an epigenetic effect as well as a genotoxic effect. In particular, the increased polymorphism of some LTR retrotransposons under arsenic stress may be part of the defense system against the stress.

  13. New Mycobacterium tuberculosis Complex Sublineage, Brazzaville, Congo

    PubMed Central

    Malm, Sven; Linguissi, Laure S. Ghoma; Tekwu, Emmanuel M.; Vouvoungui, Jeannhey C.; Kohl, Thomas A.; Beckert, Patrick; Sidibe, Anissa; Rüsch-Gerdes, Sabine; Madzou-Laboum, Igor K.; Kwedi, Sylvie; Penlap Beng, Véronique; Frank, Matthias; Ntoumi, Francine

    2017-01-01

    Tuberculosis is a leading cause of illness and death in Congo. No data are available about the population structure and transmission dynamics of the Mycobacterium tuberculosis complex strains prevalent in this central Africa country. On the basis of single-nucleotide polymorphisms detected by whole-genome sequencing, we phylogenetically characterized 74 MTBC isolates from Brazzaville, the capital of Congo. The diversity of the study population was high; most strains belonged to the Euro-American lineage, which split into Latin American Mediterranean, Uganda I, Uganda II, Haarlem, X type, and a new dominant sublineage named Congo type (n = 26). Thirty strains were grouped in 5 clusters (each within 12 single-nucleotide polymorphisms), from which 23 belonged to the Congo type. High cluster rates and low genomic diversity indicate recent emergence and transmission of the Congo type, a new Euro-American sublineage of MTBC. PMID:28221129
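
    The clustering step described above (isolates grouped when they lie within 12 single-nucleotide polymorphisms) can be sketched as threshold-based single-linkage clustering. The abstract does not say whether linkage was single or complete, so the linkage rule, the function name cluster_by_snp_distance, and the toy distances below are all assumptions for illustration, not the authors' pipeline.

```python
def cluster_by_snp_distance(distances, n, threshold=12):
    """Single-linkage clusters: isolates i and j are linked when distance <= threshold.

    distances: dict mapping (i, j) index pairs to pairwise SNP distances.
    Pairs absent from the dict are treated as unlinked.
    Returns sorted clusters (lists of indices) that have at least two members.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), d in distances.items():
        if d <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical pairwise SNP distances among five isolates (not data from the study).
toy_distances = {(0, 1): 3, (1, 2): 10, (0, 2): 14, (2, 3): 40, (3, 4): 12}
clusters = cluster_by_snp_distance(toy_distances, n=5)
```

    Under single linkage, isolates 0 and 2 end up in the same cluster even though they differ by 14 SNPs, because both are within 12 SNPs of isolate 1; a stricter rule requiring every pair to be within the threshold would split them.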

  14. New Mycobacterium tuberculosis Complex Sublineage, Brazzaville, Congo.

    PubMed

    Malm, Sven; Linguissi, Laure S Ghoma; Tekwu, Emmanuel M; Vouvoungui, Jeannhey C; Kohl, Thomas A; Beckert, Patrick; Sidibe, Anissa; Rüsch-Gerdes, Sabine; Madzou-Laboum, Igor K; Kwedi, Sylvie; Penlap Beng, Véronique; Frank, Matthias; Ntoumi, Francine; Niemann, Stefan

    2017-03-01

    Tuberculosis is a leading cause of illness and death in Congo. No data are available about the population structure and transmission dynamics of the Mycobacterium tuberculosis complex strains prevalent in this central Africa country. On the basis of single-nucleotide polymorphisms detected by whole-genome sequencing, we phylogenetically characterized 74 MTBC isolates from Brazzaville, the capital of Congo. The diversity of the study population was high; most strains belonged to the Euro-American lineage, which split into Latin American Mediterranean, Uganda I, Uganda II, Haarlem, X type, and a new dominant sublineage named Congo type (n = 26). Thirty strains were grouped in 5 clusters (each within 12 single-nucleotide polymorphisms), from which 23 belonged to the Congo type. High cluster rates and low genomic diversity indicate recent emergence and transmission of the Congo type, a new Euro-American sublineage of MTBC.

  15. Detection and correction of false segmental duplications caused by genome mis-assembly

    PubMed Central

    2010-01-01

    Diploid genomes with divergent chromosomes present special problems for assembly software as two copies of especially polymorphic regions may be mistakenly constructed, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes. PMID:20219098

  16. Screening of Israeli Holstein-Friesian cattle for restriction fragment length polymorphisms using homologous and heterologous deoxyribonucleic acid probes.

    PubMed

    Hallerman, E M; Nave, A; Soller, M; Beckmann, J S

    1988-12-01

    Genomic DNA of Israeli Holstein-Friesian dairy cattle was screened with a battery of 17 cloned or subcloned DNA probes in an attempt to document restriction fragment length polymorphisms at a number of genetic loci. Restriction fragment length polymorphisms were observed at the chymosin, oxytocin-neurophysin I, lutropin beta, keratin III, keratin VI, keratin VII, prolactin, and dihydrofolate reductase loci. Use of certain genomic DNA fragments as probes produced hybridization patterns indicative of satellite DNA at the respective loci. Means for distinguishing hybridizations to coding sequences for unique genes from those to satellite DNA were developed. Results of this study are discussed in terms of strategy for the systematic development of large numbers of bovine genomic polymorphisms.

  17. Effects of Yangtze River source water on genomic polymorphisms of male mice detected by RAPD.

    PubMed

    Zhang, Xiaolin; Zhang, Zongyao; Zhang, Xuxiang; Wu, Bing; Zhang, Yan; Yang, Liuyan; Cheng, Shupei

    2010-02-01

    In order to evaluate the environmental health risk of drinking water from the Yangtze River source, randomly amplified polymorphic DNA (RAPD) markers were used to detect the effects of the source water on genomic polymorphisms of hepatic cells of male mice (Mus musculus, ICR). After the mice were fed with source water for 90 days, RAPD-polymerase chain reactions (PCRs) were performed on hepatic genomic DNA using 20 arbitrary primers. In total, 189 loci were generated, including 151 polymorphic loci. On average, one PCR primer produced 5.3, 4.9 and 4.8 bands for each mouse in the control group and the groups fed with source water and BaP solution, respectively. Compared with the control, feeding mice with Yangtze River source water caused 33 new loci to appear and 19 to disappear. Statistical analysis of RAPD fingerprints revealed that Yangtze River source water exerted a significant influence on the hepatic genomic polymorphisms of male mice. This study suggests that RAPD is a reliable and sensitive method for assessing the environmental health risk of Yangtze River source water.

  18. Global Genomic Diversity of Oryza sativa Varieties Revealed by Comparative Physical Mapping

    PubMed Central

    Wang, Xiaoming; Kudrna, David A.; Pan, Yonglong; Wang, Hao; Liu, Lin; Lin, Haiyan; Zhang, Jianwei; Song, Xiang; Goicoechea, Jose Luis; Wing, Rod A.; Zhang, Qifa; Luo, Meizhong

    2014-01-01

    Bacterial artificial chromosome (BAC) physical maps embedding a large number of BAC end sequences (BESs) were generated for Oryza sativa ssp. indica varieties Minghui 63 (MH63) and Zhenshan 97 (ZS97) and were compared with the genome sequences of O. sativa ssp. japonica cv. Nipponbare and O. sativa ssp. indica cv. 93-11. The comparisons exhibited substantial diversities in terms of large structural variations and small substitutions and indels. Genome-wide BAC-sized and contig-sized structural variations were detected, and the shared variations were analyzed. In the expansion regions of the Nipponbare reference sequence, in comparison to the MH63 and ZS97 physical maps, as well as to the previously constructed 93-11 physical map, the amounts and types of the repeat contents, and the outputs of gene ontology analysis, were significantly different from those of the whole genome. Using the physical maps of four wild Oryza species from OMAP (http://www.omap.org) as a control, we detected many conserved and divergent regions related to the evolution process of O. sativa. Between the BESs of MH63 and ZS97 and the two reference sequences, a total of 1532 polymorphic simple sequence repeats (SSRs), 71,383 SNPs, 1767 multiple nucleotide polymorphisms, 6340 insertions, and 9137 deletions were identified. This study provides independent whole-genome resources for intra- and intersubspecies comparisons and functional genomics studies in O. sativa. Both the comparative physical maps and the GBrowse, which integrated the QTL and molecular markers from GRAMENE (http://www.gramene.org) with our physical maps and analysis results, are open to the public through our Web site (http://gresource.hzau.edu.cn/resource/resource.html). PMID:24424778

  19. The Aspergillus Genome Database: multispecies curation and incorporation of RNA-Seq data to improve structural gene annotations.

    PubMed

    Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R

    2014-01-01

    The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.

  20. Single nucleotide polymorphism (SNP) discovery in rainbow trout using restriction site associated DNA (RAD) sequencing of doubled haploids and assessment of polymorphism in a population survey

    USDA-ARS's Scientific Manuscript database

    Background: Our goal is to produce a high-throughput SNP genotyping platform for genomic analyses in rainbow trout that will enable fine mapping of QTL, whole genome association studies, genomic selection for improved aquaculture production traits, and genetic analyses of wild populations that aid ...

  1. Whole genome evaluation of tandem repeat polymorphisms between two pathogenically similar strains of Xylella fastidiosa isolated from almond and grape in California

    USDA-ARS's Scientific Manuscript database

    Whole genome tandem repeat polymorphisms were evaluated between two closely related Xylella fastidiosa strains, M23 and Temecula1, both of which cause almond leaf scorch disease (ALSD) and grape Pierce’s disease (PD) in California. Strain M23 was isolated from almond and the genome was sequenced in this stu...

  2. Epigenetic differentiation and relationship to adaptive genetic divergence in discrete populations of the violet Viola cazorlensis.

    PubMed

    Herrera, Carlos M; Bazaga, Pilar

    2010-08-01

    In plants, epigenetic variations based on DNA methylation are often heritable and could influence the course of evolution. Before this hypothesis can be assessed, fundamental questions about epigenetic variation remain to be addressed in a real-world context, including its magnitude, structuring within and among natural populations, and autonomy in relation to the genetic context. Extent and patterns of cytosine methylation, and the relationship to adaptive genetic divergence between populations, were investigated for wild populations of the southern Spanish violet Viola cazorlensis (Violaceae) using the methylation-sensitive amplified polymorphism (MSAP) technique, a modification of the amplified fragment length polymorphism method (AFLP) based on the differential sensitivity of isoschizomeric restriction enzymes to site-specific cytosine methylation. The genome of V. cazorlensis plants exhibited extensive levels of methylation, and methylation-based epigenetic variation was structured into distinct between- and within-population components. Epigenetic differentiation of populations was correlated with adaptive genetic divergence revealed by a Bayesian population-genomic analysis of AFLP data. Significant associations existed at the individual genome level between adaptive AFLP loci and the methylation state of methylation-susceptible MSAP loci. Population-specific, divergent patterns of correlated selection on epigenetic and genetic individual variation could account for the coordinated epigenetic-genetic adaptive population differentiation revealed by this study.

  3. Indels, structural variation, and recombination drive genomic diversity in Plasmodium falciparum

    PubMed Central

    Miles, Alistair; Iqbal, Zamin; Vauterin, Paul; Pearson, Richard; Campino, Susana; Theron, Michel; Gould, Kelda; Mead, Daniel; Drury, Eleanor; O'Brien, John; Ruano Rubio, Valentin; MacInnis, Bronwyn; Mwangi, Jonathan; Samarakoon, Upeka; Ranford-Cartwright, Lisa; Ferdig, Michael; Hayton, Karen; Su, Xin-zhuan; Wellems, Thomas; Rayner, Julian; McVean, Gil; Kwiatkowski, Dominic

    2016-01-01

    The malaria parasite Plasmodium falciparum has a great capacity for evolutionary adaptation to evade host immunity and develop drug resistance. Current understanding of parasite evolution is impeded by the fact that a large fraction of the genome is either highly repetitive or highly variable and thus difficult to analyze using short-read sequencing technologies. Here, we describe a resource of deep sequencing data on parents and progeny from genetic crosses, which has enabled us to perform the first genome-wide, integrated analysis of SNP, indel and complex polymorphisms, using Mendelian error rates as an indicator of genotypic accuracy. These data reveal that indels are exceptionally abundant, being more common than SNPs and thus the dominant mode of polymorphism within the core genome. We use the high density of SNP and indel markers to analyze patterns of meiotic recombination, confirming a high rate of crossover events and providing the first estimates for the rate of non-crossover events and the length of conversion tracts. We observe several instances of meiotic recombination within copy number variants associated with drug resistance, demonstrating a mechanism whereby fitness costs associated with resistance mutations could be compensated and greater phenotypic plasticity could be acquired. PMID:27531718
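
    The use of Mendelian error rates as an indicator of genotypic accuracy can be illustrated with a minimal sketch. It assumes haploid genotype calls (as in blood-stage P. falciparum), so each progeny allele should match one of the two parental alleles at that site; the function name mendelian_error_rate, the data layout, and the toy calls are hypothetical, and the abstract does not specify how the authors computed the rate.

```python
def mendelian_error_rate(parent1, parent2, progeny):
    """Fraction of called progeny genotypes that match neither parent.

    parent1, parent2: lists of allele calls, one per variant site (None = missing).
    progeny: list of such lists, one per progeny clone.
    Assumes haploid genotypes, so a progeny allele must equal a parental allele.
    """
    errors = 0
    called = 0
    for clone in progeny:
        for p1, p2, child in zip(parent1, parent2, clone):
            if None in (p1, p2, child):
                continue  # skip sites with a missing call in parent or progeny
            called += 1
            if child != p1 and child != p2:
                errors += 1
    return errors / called if called else 0.0


# Hypothetical calls at four sites for two parents and three progeny clones.
toy_parent1 = ["A", "C", "G", "T"]
toy_parent2 = ["A", "T", "G", "C"]
toy_progeny = [
    ["A", "C", "G", "C"],   # consistent with the parents at every site
    ["A", "T", "A", "T"],   # third site matches neither parent
    ["A", None, "G", "T"],  # one missing call, otherwise consistent
]
rate = mendelian_error_rate(toy_parent1, toy_parent2, toy_progeny)
```

    A low rate across a call set suggests accurate genotyping, while sites or variant classes with elevated rates are candidates for filtering.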

  4. DNA polymorphism in recombining and non-recombining mating-type-specific loci of the smut fungus Microbotryum

    PubMed Central

    Votintseva, A A; Filatov, D A

    2011-01-01

    The population-genetic processes leading to the genetic degeneration of non-recombining regions have mainly been studied in animal and plant sex chromosomes. Here, we report population genetic analysis of the processes in the non-recombining mating-type-specific regions of the smut fungus Microbotryum violaceum. M. violaceum has A1 and A2 mating types, determined by mating-type-specific ‘sex chromosomes’ that contain 1–2 Mb long non-recombining regions. If genetic degeneration were occurring, then one would expect reduced DNA polymorphism in the non-recombining regions of this fungus. The analysis of DNA diversity among 19 M. violaceum strains, collected across Europe from Silene latifolia flowers, revealed that (i) DNA polymorphism is relatively low in all 20 studied loci (π∼0.15%), (ii) it is not significantly different between the two mating-type-specific chromosomes nor between the non-recombining and recombining regions, (iii) there is substantial population structure in M. violaceum populations, which resembles that of its host species, S. latifolia, and (iv) there is significant linkage disequilibrium, suggesting that widespread selfing in this species results in a reduction of the effective recombination rate across the genome. We hypothesise that selfing-related reduction of recombination across the M. violaceum genome negates the difference in the level of DNA polymorphism between the recombining and non-recombining regions, and may possibly lead to similar levels of genetic degeneration in the mating-type-specific regions of the non-recombining ‘sex chromosomes’ and elsewhere in the genome. PMID:21081967
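
    The diversity statistic π quoted above is conventionally the average number of pairwise differences per site among sampled sequences. A minimal sketch follows, assuming pre-aligned, gap-free sequences of equal length; the function name nucleotide_diversity and the toy alignment are hypothetical and are not data from the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical 10-bp alignment of four sequences.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
```

    On this toy alignment π is 0.1 (10%), far above the roughly 0.15% reported in the abstract; real analyses would also need to handle gaps and missing data, which this sketch ignores.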

  5. Templated sequence insertion polymorphisms in the human genome

    NASA Astrophysics Data System (ADS)

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  6. Space environment induced mutations prefer to occur at polymorphic sites of rice genomes

    NASA Astrophysics Data System (ADS)

    Li, Y.; Liu, M.; Cheng, Z.; Sun, Y.

    To explore the genomic characteristics of rice mutants induced by the space environment, the space-induced mutants 971-5, 972-4, and R955, which acquired new traits after space flight such as increased yield, reduced resistance to rice blast, and semi-dwarfism compared with their on-ground controls, 971ck, 972ck, and Bing95-503, respectively, together with 8 other japonica and 3 indica rice varieties, 17 in total, were analyzed by the amplified fragment length polymorphism (AFLP) method. We chose 16 AFLP primer pairs, which generated a total of 1251 sites, of which 745 (59.6%) were polymorphic over all the genotypes. With the 16 primer combinations, 54 space-induced mutation sites were observed in 971-5, 86 in 972-4, and 5 in R955 compared to their controls, and the mutation rates were 4.3%, 6.9% and 0.4%, respectively. Interestingly, 75.9%, 84.9% and 100% of the mutation sites identified in 971-5, 972-4, and R955 occurred in polymorphic sites. This result suggests that the space environment preferentially induced mutations at polymorphic sites in rice genomes and might share a common mechanism with other types of mutagens. It also implies that polymorphic sites in genomes are potential "hotspots" for mutations induced by the space environment.
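
    The percentages in this abstract follow directly from the reported counts, taking each mutation rate as mutation sites divided by the 1251 total AFLP sites. The short check below uses only numbers stated in the abstract; the variable and function names are illustrative.

```python
# Counts taken from the abstract above.
TOTAL_SITES = 1251
POLYMORPHIC_SITES = 745
MUTATION_SITES = {"971-5": 54, "972-4": 86, "R955": 5}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


polymorphic_pct = percent(POLYMORPHIC_SITES, TOTAL_SITES)
mutation_rates = {name: percent(n, TOTAL_SITES) for name, n in MUTATION_SITES.items()}
```

    The computed values reproduce the reported 59.6% polymorphic sites and mutation rates of 4.3%, 6.9% and 0.4%.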

  7. Detecting associated single-nucleotide polymorphisms on the X chromosome in case control genome-wide association studies.

    PubMed

    Chen, Zhongxue; Ng, Hon Keung Tony; Li, Jing; Liu, Qingzhong; Huang, Hanwen

    2017-04-01

    In the past decade, hundreds of genome-wide association studies have been conducted to detect the significant single-nucleotide polymorphisms that are associated with certain diseases. However, most of the data from the X chromosome were not analyzed and only a few significant associated single-nucleotide polymorphisms from the X chromosome have been identified from genome-wide association studies. This is mainly due to the lack of powerful statistical tests. In this paper, we propose a novel statistical approach that combines the information of single-nucleotide polymorphisms on the X chromosome from both males and females in an efficient way. The proposed approach avoids the need to make strong assumptions about the underlying genetic models. Our proposed statistical test is a robust method that only makes the assumption that the risk allele is the same for both females and males if the single-nucleotide polymorphism is associated with the disease for both genders. Through a simulation study and a real data application, we show that the proposed procedure is robust and has excellent performance compared to existing methods. We expect that many more associated single-nucleotide polymorphisms on the X chromosome will be identified if the proposed approach is applied to currently available genome-wide association study data.

  8. Characterization of 10 new nuclear microsatellite markers in Acca sellowiana (Myrtaceae).

    PubMed

    Klabunde, Gustavo H F; Olkoski, Denise; Vilperte, Vinicius; Zucchi, Maria I; Nodari, Rubens O

    2014-06-01

    Microsatellite primers were identified and characterized in Acca sellowiana in order to expand the limited number of pre-existing polymorphic markers for use in population genetic studies for conservation, phylogeography, breeding, and domestication. A total of 10 polymorphic microsatellite primers were designed from clones obtained from a simple sequence repeat (SSR)-enriched genomic library. The primers amplified di- and trinucleotide repeats with four to 27 alleles per locus. In all tested populations, the observed heterozygosity ranged from 0.269 to 1.0. These new polymorphic SSR markers will allow future genetic studies to be denser, either for genetic structure characterization of natural populations or for studies involving genetic breeding and domestication process in A. sellowiana.

  9. Discovery and mapping of single feature polymorphisms in wheat using Affymetrix arrays

    PubMed Central

    Bernardo, Amy N; Bradbury, Peter J; Ma, Hongxiang; Hu, Shengwa; Bowden, Robert L; Buckler, Edward S; Bai, Guihua

    2009-01-01

    Background Wheat (Triticum aestivum L.) is a staple food crop worldwide. The wheat genome has not yet been sequenced due to its huge genome size (~17,000 Mb) and high levels of repetitive sequences; the whole genome sequence may not be expected in the near future. Available linkage maps have low marker density due to limitation in available markers; therefore new technologies that detect genome-wide polymorphisms are still needed to discover a large number of new markers for construction of high-resolution maps. A high-resolution map is a critical tool for gene isolation, molecular breeding and genomic research. Single feature polymorphism (SFP) is a new microarray-based type of marker that is detected by hybridization of DNA or cRNA to oligonucleotide probes. This study was conducted to explore the feasibility of using the Affymetrix GeneChip to discover and map SFPs in the large hexaploid wheat genome. Results Six wheat varieties of diverse origins (Ning 7840, Clark, Jagger, Encruzilhada, Chinese Spring, and Opata 85) were analyzed for significant probe by variety interactions and 396 probe sets with SFPs were identified. A subset of 164 unigenes was sequenced and 54% showed polymorphism within probes. Microarray analysis of 71 recombinant inbred lines from the cross Ning 7840/Clark identified 955 SFPs and 877 of them were mapped together with 269 simple sequence repeat markers. The SFPs were randomly distributed within a chromosome but were unevenly distributed among different genomes. The B genome had the most SFPs, and the D genome had the least. Map positions of a selected set of SFPs were validated by mapping single nucleotide polymorphism using SNaPshot and comparing with expressed sequence tags mapping data. Conclusion The Affymetrix array is a cost-effective platform for SFP discovery and SFP mapping in wheat. The new high-density map constructed in this study will be a useful tool for genetic and genomic research in wheat. PMID:19480702

  10. Assessment of FAE1 polymorphisms in three Brassica species using EcoTILLING and their association with differences in seed erucic acid contents

    PubMed Central

    2010-01-01

    Background FAE1 (fatty acid elongase1) is the key gene in the control of erucic acid synthesis in seeds of Brassica species. Because oil with low erucic acid (LEA) content is essential for human health and LEA resources are scarce, new LEA genetic resources are being sought for Brassica breeding. EcoTILLING, a powerful genotyping method, can readily be used to identify polymorphisms in Brassica. Results Seven B. rapa, nine B. oleracea and 101 B. napus accessions were collected for identification of FAE1 polymorphisms. Three polymorphisms were detected in the two FAE1 paralogues of B. napus using EcoTILLING and were found to be strongly associated with differences in the erucic acid contents of seeds. In genomic FAE1 sequences obtained from seven B. rapa accessions, one SNP in the coding region was deduced to cause loss of gene function. Molecular evolution analysis of FAE1 homologues showed that the relationship between the Brassica A and C genomes is closer than that between the A/C genomes and Arabidopsis genome. Alignment of the coding sequences of these FAE1 homologues indicated that 18 SNPs differed between the A and C genomes and could be used as genome-specific markers in Brassica. Conclusion This study showed the applicability of EcoTILLING for detecting gene polymorphisms in Brassica. The association between B. napus FAE1 polymorphisms and the erucic acid contents of seeds may provide useful guidance for LEA breeding. The discovery of the LEA resource in B. rapa can be exploited in Brassica cultivation. PMID:20594317

  11. Assessment of FAE1 polymorphisms in three Brassica species using EcoTILLING and their association with differences in seed erucic acid contents.

    PubMed

    Wang, Nian; Shi, Lei; Tian, Fang; Ning, Huicai; Wu, Xiaoming; Long, Yan; Meng, Jinling

    2010-07-01

    FAE1 (fatty acid elongase1) is the key gene in the control of erucic acid synthesis in seeds of Brassica species. Because oil with low erucic acid (LEA) content is essential for human health and LEA resources are scarce, new LEA genetic resources are being sought for Brassica breeding. EcoTILLING, a powerful genotyping method, can readily be used to identify polymorphisms in Brassica. Seven B. rapa, nine B. oleracea and 101 B. napus accessions were collected for identification of FAE1 polymorphisms. Three polymorphisms were detected in the two FAE1 paralogues of B. napus using EcoTILLING and were found to be strongly associated with differences in the erucic acid contents of seeds. In genomic FAE1 sequences obtained from seven B. rapa accessions, one SNP in the coding region was deduced to cause loss of gene function. Molecular evolution analysis of FAE1 homologues showed that the relationship between the Brassica A and C genomes is closer than that between the A/C genomes and Arabidopsis genome. Alignment of the coding sequences of these FAE1 homologues indicated that 18 SNPs differed between the A and C genomes and could be used as genome-specific markers in Brassica. This study showed the applicability of EcoTILLING for detecting gene polymorphisms in Brassica. The association between B. napus FAE1 polymorphisms and the erucic acid contents of seeds may provide useful guidance for LEA breeding. The discovery of the LEA resource in B. rapa can be exploited in Brassica cultivation.

  12. Development of highly polymorphic simple sequence repeat markers using genome-wide microsatellite variant analysis in Foxtail millet [Setaria italica (L.) P. Beauv.]

    PubMed Central

    2014-01-01

    Background Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. Result A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype 'Yugu1' by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei’s genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. Conclusions A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species. PMID:24472631

  13. Development of highly polymorphic simple sequence repeat markers using genome-wide microsatellite variant analysis in Foxtail millet [Setaria italica (L.) P. Beauv].

    PubMed

    Zhang, Shuo; Tang, Chanjuan; Zhao, Qiang; Li, Jing; Yang, Lifang; Qie, Lufeng; Fan, Xingke; Li, Lin; Zhang, Ning; Zhao, Meicheng; Liu, Xiaotong; Chai, Yang; Zhang, Xue; Wang, Hailong; Li, Yingtao; Li, Wen; Zhi, Hui; Jia, Guanqing; Diao, Xianmin

    2014-01-28

    Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype 'Yugu1' by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei's genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species.
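
    The polymorphism information content (PIC) reported above is commonly computed per marker from allele frequencies with the Botstein et al. (1980) formula, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not state which formula the authors used, so the sketch below is an assumption; the function name and the example frequencies are hypothetical.

```python
def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2 (Botstein et al. 1980)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in freqs]
    homozygosity = sum(squares)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross = sum(
        2 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - homozygosity - cross


# Hypothetical markers: two and four equally frequent alleles.
pic_two_alleles = polymorphism_information_content([0.5, 0.5])
pic_four_alleles = polymorphism_information_content([0.25, 0.25, 0.25, 0.25])
```

    With two equally frequent alleles the PIC is 0.375, and with four it is about 0.70, so an average of 0.67 across markers with 2 to 16 alleles indicates that many loci carry several alleles at appreciable frequency.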

  14. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing

    PubMed Central

    Sadsad, Rosemarie; Martinez, Elena; Jelfs, Peter; Hill-Cawthorne, Grant A.; Gilbert, Gwendolyn L.; Marais, Ben J.; Sintchenko, Vitali

    2016-01-01

    Background Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. Methods We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Results Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Conclusion Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster. PMID:26938641

  15. Genetic diversity and structure of elite cotton germplasm (Gossypium hirsutum L.) using genome-wide SNP data.

    PubMed

    Ai, XianTao; Liang, YaJun; Wang, JunDuo; Zheng, JuYun; Gong, ZhaoLong; Guo, JiangPing; Li, XueYuan; Qu, YanYing

    2017-10-01

    Cotton (Gossypium spp.) is the most important natural textile fiber crop, and Gossypium hirsutum L. is responsible for 90% of the annual cotton crop in the world. Information on cotton genetic diversity and population structure is essential for new breeding lines. In this study, we analyzed population structure and genetic diversity of 288 elite Gossypium hirsutum cultivar accessions collected from around the world, and especially from China, using genome-wide single nucleotide polymorphism (SNP) markers. The average polymorphism information content (PIC) was 0.25, indicating a relatively low degree of genetic diversity. Population structure analysis revealed extensive admixture and identified three subgroups. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The results from both population structure and phylogenetic analysis were, for the most part, in agreement with pedigree information. Analysis of molecular variance revealed a larger amount of variation was due to diversity within the groups. Establishment of genetic diversity and population structure from this study could be useful for genetic and genomic analysis and systematic utilization of the standing genetic variation in upland cotton.
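
    The abstract above reports an average PIC of 0.25 without stating how it was computed. A minimal sketch, assuming the widely used Botstein-style definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, is given here for context; the function name and the allele frequencies are hypothetical and are not taken from the study.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein-style definition; `freqs` are the allele
    frequencies at the marker and must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Hypothetical biallelic SNPs (illustrative frequencies only).
print(round(pic([0.5, 0.5]), 4))  # upper bound for a biallelic marker
print(round(pic([0.8, 0.2]), 4))
```

    Under this definition a biallelic SNP cannot exceed a PIC of 0.375 (reached at equal allele frequencies), which is one reason SNP-based PIC averages tend to be lower than those of multi-allelic SSR markers.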

  16. [Dynamics of genome changes in Rauwolfia serpentina callus tissue upon the switch to conditions of submerged cultivation].

    PubMed

    Spiridonova, E V; Adnof, D M; Andreev, I O; Kunakh, V A

    2008-01-01

    The genome of Rauwolfia serpentina callus cells was found not to undergo noticeable changes during the first several passages after the switch from surface to submerged cultivation in a liquid medium of special composition. After a subsequent 4-6 passages in submerged culture, polymorphism of RAPD spectra was revealed, which may reflect changes in DNA sequence as well as in the structure of the cell population that forms the strain. Introducing an intermediate passage on an agar-solidified medium of simpler composition prior to transfer into liquid medium did not appear to substantially affect the level or pattern of genome changes.

  17. Glutathione S-transferase gene polymorphisms in celiac disease and their correlation with genomic instability phenotype.

    PubMed

    Fundia, Ariela F; Weich, Natalia; Crivelli, Adriana; La Motta, Graciela; Larripa, Irene B; Slavutsky, Irma

    2014-06-01

    Genomic instability and reduced glutathione S-transferase (GST) activity have been identified as potential risk factors for malignant complications in celiac disease (CD). In this study, we assessed the possible influence of GST polymorphisms on genome instability phenotypes in a genetically characterised group of celiac patients from previous studies. The deletion polymorphisms in GSTM1 and GSTT1 genes and the single-nucleotide polymorphism GSTP1 c.313A>G were genotyped using PCR in a set of 20 untreated adult patients with a known genomic instability phenotype and 69 age- and sex-matched healthy individuals. The frequencies of variant genotypes in patients were GSTM1-null (30%), GSTT1-null (5%), GSTP1-AG (60%) and GSTP1-GG (15%), and they showed no differences from controls. No significant differences were found in the genotype distribution based on telomere length. Cases with GSTM1-null genotype (83%) and microsatellite stability were more frequent than those with genomic instability. Moreover, carriers of GSTP1-variant genotype (73%) and stable phenotype were significantly increased compared to unstable patients (27%) (P=0.031). No differences were found according to the clinical-pathological characteristics of celiac cases. No association between GST polymorphic variants and celiac-associated genomic instability was proven in our cohort. Future studies should explore the usefulness of other biomarkers to distinguish celiac patients who are susceptible to cancer development. Copyright © 2014 Elsevier Masson SAS. All rights reserved.

  18. Genome-Wide Patterns of Polymorphism in an Inbred Line of the African Malaria Mosquito Anopheles gambiae

    PubMed Central

    Turissini, David A.; Gamez, Stephanie; White, Bradley J.

    2014-01-01

    Anopheles gambiae is a major mosquito vector of malaria in Africa. Although increased use of insecticide-based vector control tools has decreased malaria transmission, elimination is likely to require novel genetic control strategies. It can be argued that the absence of an A. gambiae inbred line has slowed progress toward genetic vector control. In order to empower genetic studies and enable precise and reproducible experimentation, we set out to create an inbred line of this species. We found that amenability to inbreeding varied between populations of A. gambiae. After full-sib inbreeding for ten generations, we genotyped 112 individuals—56 saved prior to inbreeding and 56 collected after inbreeding—at a genome-wide panel of single nucleotide polymorphisms (SNPs). Although inbreeding dramatically reduced diversity across much of the genome, we discovered numerous, discrete genomic blocks that maintained high heterozygosity. For one large genomic region, we were able to definitively show that high diversity is due to the persistent polymorphism of a chromosomal inversion. Inbred lines in other eukaryotes often exhibit a qualitatively similar retention of polymorphism when typed at a small number of markers. Our whole-genome SNP data provide the first strong, empirical evidence supporting associative overdominance as the mechanism maintaining higher than expected diversity in inbred lines. Although creation of A. gambiae lines devoid of nearly all polymorphism may not be feasible, our results provide critical insights into how more fully isogenic lines can be created. PMID:25377942

  19. Construction of a large collection of small genome variations in French dairy and beef breeds using whole-genome sequences.

    PubMed

    Boussaha, Mekki; Michot, Pauline; Letaief, Rabia; Hozé, Chris; Fritz, Sébastien; Grohs, Cécile; Esquerré, Diane; Duchesne, Amandine; Philippe, Romain; Blanquet, Véronique; Phocas, Florence; Floriot, Sandrine; Rocha, Dominique; Klopp, Christophe; Capitan, Aurélien; Boichard, Didier

    2016-11-15

    In recent years, several bovine genome sequencing projects were carried out with the aim of developing genomic tools to improve dairy and beef production efficiency and sustainability. In this study, we describe the first French cattle genome variation dataset obtained by sequencing 274 whole genomes representing several major dairy and beef breeds. This dataset contains over 28 million single nucleotide polymorphisms (SNPs) and small insertions and deletions. Comparisons between sequencing results and SNP array genotypes revealed a very high genotype concordance rate, which indicates the good quality of our data. To our knowledge, this is the first large-scale catalog of small genomic variations in French dairy and beef cattle. This resource will contribute to the study of gene functions and population structure and also help to improve traits through genotype-guided selection.

  20. Whole Genome Sequencing of Greater Amberjack (Seriola dumerili) for SNP Identification on Aligned Scaffolds and Genome Structural Variation Analysis Using Parallel Resequencing

    PubMed Central

    Aokic, Jun-ya; Kawase, Junya; Hamada, Kazuhisa; Fujimoto, Hiroshi; Yamamoto, Ikki; Usuki, Hironori

    2018-01-01

    Greater amberjack (Seriola dumerili) is distributed in tropical and temperate waters worldwide and is an important aquaculture fish. We carried out de novo sequencing of the greater amberjack genome to construct a reference genome sequence to identify single nucleotide polymorphisms (SNPs) for breeding amberjack by marker-assisted or gene-assisted selection as well as to identify functional genes for biological traits. We obtained 200 times coverage and constructed a high-quality genome assembly using next generation sequencing technology. The assembled sequences were aligned onto a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map by sequence homology. A total of 215 of the longest amberjack sequences, with a total length of 622.8 Mbp (92% of the total length of the genome scaffolds), were lined up on the yellowtail RH map. We resequenced the whole genomes of 20 greater amberjacks and mapped the resulting sequences onto the reference genome sequence. About 186,000 nonredundant SNPs were successfully ordered on the reference genome. Further, we found differences in the genome structural variations between two greater amberjack populations using BreakDancer. We also analyzed the greater amberjack transcriptome and mapped the annotated sequences onto the reference genome sequence. PMID:29785397

  1. Indel Group in Genomes (IGG) Molecular Genetic Markers

    PubMed Central

    Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger

    2016-01-01

    Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831
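
    The abstract above describes a k-mer-based search for polymorphic regions but does not give the algorithm. The sketch below is not the IGG pipeline; it is a toy illustration of one general idea such a search could rest on, namely using k-mers that occur exactly once in each genome as shared anchors and flagging adjacent anchors whose spacing differs between genomes. All function names, parameters and sequences are hypothetical, and primer design is left out.

```python
from collections import Counter

def unique_kmer_positions(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {km: i for i, km in enumerate(kmers) if counts[km] == 1}

def candidate_indels(genome_a, genome_b, k=8, min_diff=3):
    """Return (a1, a2, b1, b2, diff) for adjacent anchors with unequal spacing.

    diff > 0 means the interval is longer in genome_b (insertion relative
    to genome_a); diff < 0 means it is shorter.
    """
    pos_a = unique_kmer_positions(genome_a, k)
    pos_b = unique_kmer_positions(genome_b, k)
    # Anchors: k-mers unique in both genomes, ordered by position in genome_a.
    anchors = sorted((pos_a[km], pos_b[km]) for km in pos_a.keys() & pos_b.keys())
    hits = []
    for (a1, b1), (a2, b2) in zip(anchors, anchors[1:]):
        if b2 <= b1:
            continue  # skip anchor pairs that are not collinear
        diff = (b2 - b1) - (a2 - a1)
        if abs(diff) >= min_diff:
            hits.append((a1, a2, b1, b2, diff))
    return hits

# Toy sequences: genome_b carries a 4-bp insertion ("CCCC") relative to genome_a.
print(candidate_indels("ACGTCAGGATTC", "ACGTCACCCCGGATTC", k=3, min_diff=3))
```

    In a real setting k would be much larger than in this toy call, and the flanking anchors would be the starting point for choosing primer sites that give the desired amplicon sizes and size differences.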

  2. A High-Density Genetic Map with Array-Based Markers Facilitates Structural and Quantitative Trait Locus Analyses of the Common Wheat Genome

    PubMed Central

    Iehisa, Julio Cesar Masaru; Ohno, Ryoko; Kimura, Tatsuro; Enoki, Hiroyuki; Nishimura, Satoru; Okamoto, Yuki; Nasuda, Shuhei; Takumi, Shigeo

    2014-01-01

    The large genome and allohexaploidy of common wheat have complicated construction of a high-density genetic map. Although improvements in the throughput of next-generation sequencing (NGS) technologies have made it possible to obtain a large amount of genotyping data for an entire mapping population by direct sequencing, including hexaploid wheat, a significant number of missing data points are often apparent due to the low coverage of sequencing. In the present study, a microarray-based polymorphism detection system was developed using NGS data obtained from complexity-reduced genomic DNA of two common wheat cultivars, Chinese Spring (CS) and Mironovskaya 808. After design and selection of polymorphic probes, 13,056 new markers were added to the linkage map of a recombinant inbred mapping population between CS and Mironovskaya 808. On average, 2.49 missing data points per marker were observed in the 201 recombinant inbred lines, with a maximum of 42. Around 40% of the new markers were derived from genic regions and 11% from repetitive regions. The low number of retroelements indicated that the new polymorphic markers were mainly derived from the less repetitive region of the wheat genome. Around 25% of the mapped sequences were useful for alignment with the physical map of barley. Quantitative trait locus (QTL) analyses of 14 agronomically important traits related to flowering, spikes, and seeds demonstrated that the new high-density map showed improved QTL detection, resolution, and accuracy over the original simple sequence repeat map. PMID:24972598

  3. A high-density genetic map with array-based markers facilitates structural and quantitative trait locus analyses of the common wheat genome.

    PubMed

    Iehisa, Julio Cesar Masaru; Ohno, Ryoko; Kimura, Tatsuro; Enoki, Hiroyuki; Nishimura, Satoru; Okamoto, Yuki; Nasuda, Shuhei; Takumi, Shigeo

    2014-10-01

    The large genome and allohexaploidy of common wheat have complicated construction of a high-density genetic map. Although improvements in the throughput of next-generation sequencing (NGS) technologies have made it possible to obtain a large amount of genotyping data for an entire mapping population by direct sequencing, including hexaploid wheat, a significant number of missing data points are often apparent due to the low coverage of sequencing. In the present study, a microarray-based polymorphism detection system was developed using NGS data obtained from complexity-reduced genomic DNA of two common wheat cultivars, Chinese Spring (CS) and Mironovskaya 808. After design and selection of polymorphic probes, 13,056 new markers were added to the linkage map of a recombinant inbred mapping population between CS and Mironovskaya 808. On average, 2.49 missing data points per marker were observed in the 201 recombinant inbred lines, with a maximum of 42. Around 40% of the new markers were derived from genic regions and 11% from repetitive regions. The low number of retroelements indicated that the new polymorphic markers were mainly derived from the less repetitive region of the wheat genome. Around 25% of the mapped sequences were useful for alignment with the physical map of barley. Quantitative trait locus (QTL) analyses of 14 agronomically important traits related to flowering, spikes, and seeds demonstrated that the new high-density map showed improved QTL detection, resolution, and accuracy over the original simple sequence repeat map. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  4. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    PubMed

    Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

    2012-02-17

    The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  5. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    PubMed

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other species of prokaryotes have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
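
    The definition of SSRs given above (tandem repeats of 1-6 bp motifs) can be sketched as a simple scan for perfect repeats. This is an illustrative sketch only and is not how PSSRdb was built; the minimum repeat counts per motif length and the function names are assumptions chosen for the example.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-bp motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # report 'A' x 12 once, not also as 'AA' x 6
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Two made-up strain sequences that differ only in the length of an AT repeat.
strain_1 = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T"
strain_2 = "GGC" + "AT" * 9 + "CCG" + "A" * 12 + "T"
print(find_ssrs(strain_1))
print(find_ssrs(strain_2))
```

    Comparing the repeat counts reported for the same locus across strains (7 versus 9 copies of AT in the toy sequences) is the kind of length variation that would mark an SSR as polymorphic.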

  6. SiNoPsis: Single Nucleotide Polymorphisms selection and promoter profiling.

    PubMed

    Boloc, Daniel; Rodríguez, Natalia; Gassó, Patricia; Abril, Josep F; Bernardo, Miquel; Lafuente, Amalia; Mas, Sergi

    2017-09-14

    The selection of a Single Nucleotide Polymorphism (SNP) using bibliographic methods can be a very time-consuming task. Moreover, a SNP selected in this way may not be easily visualized in its genomic context by a standard user hoping to correlate it with other valuable information. Here we propose a web form built on top of Circos that can assist SNP-centred screening, based on their location in the genome and the regulatory modules they can disrupt. Its use may allow researchers to prioritize SNPs in genotyping and disease studies. SiNoPsis is bundled as a web portal. It focuses on the different structures involved in the genomic expression of a gene, especially those found in the core promoter upstream region. These structures include transcription factor binding sites (for promoter and enhancer signals), histones, and promoter flanking regions. Additionally, the tool provides eQTL and linkage disequilibrium (LD) properties for a given SNP query, yielding further clues about other indirectly associated SNPs. Possible disruptions of the aforementioned structures affecting gene transcription are reported using multiple resource databases. SiNoPsis has a simple user-friendly interface, which allows single queries by gene symbol, genomic coordinates, Ensembl gene identifiers, RefSeq transcript identifiers and SNPs. It is the only portal providing useful SNP selection based on regulatory modules and LD with functional variants in both textual and graphic modes (by properly defining the arguments and parameters needed to run Circos). SiNoPsis is freely available at https://compgen.bio.ub.edu/SiNoPsis/. © The Author (2017). Published by Oxford University Press. All rights reserved.

  7. [Genomic disorders in the mononuclear blood cells of those who worked in the cleanup of the accident at the Chernobyl Atomic Electric Power Station].

    PubMed

    Butenko, Z A; Smirnova, I A; Zak, K P; Mikhaĭlovskaia, E V; Ianok, E A; Kishinskaia, E G

    1998-01-01

    The results of molecular investigations of blood mononuclear cells from 120 clean-up workers, examined 7-9 years after the Chernobyl accident and with total radiation exposure doses ranging from 5 to 76 cGy, are presented. Structural polymorphism of the leukemia-associated bcr gene and the ribosomal RNA (rRNA) genes was studied using Southern blot hybridization. Allelic polymorphism of the bcr gene, with an allele distribution characteristic of leukemia, was detected in 16.6% of those examined. Rearrangements of rRNA genes were observed in 13% of the Chernobyl accident clean-up workers.

  8. A new single-nucleotide polymorphism database for rainbow trout generated through whole genome re-sequencing

    USDA-ARS's Scientific Manuscript database

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...

  9. Characterization of polyploid wheat genomic diversity using a high-density 90 000 single nucleotide polymorphism array

    USDA-ARS's Scientific Manuscript database

    High-density single nucleotide polymorphism (SNP) genotyping chips are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships among individuals in populations and studying marker-trait associations in mapping experiments. We developed a genotyping array includ...

  10. New insights into the phylogenetics and population structure of the prairie falcon (Falco mexicanus)

    USGS Publications Warehouse

    Doyle, Jacqueline M.; Bell, Douglas A.; Bloom, Peter H.; Emmons, Gavin; Fesnock, Amy; Katzner, Todd; LePre, Larry; Leonard, Kolbe; SanMiguel, Phillip; Westerman, Rick; DeWoody, J. Andrew

    2018-01-01

    Background Management requires a robust understanding of between- and within-species genetic variability, however such data are still lacking in many species. For example, although multiple population genetics studies of the peregrine falcon (Falco peregrinus) have been conducted, no similar studies have been done of the closely-related prairie falcon (F. mexicanus) and it is unclear how much genetic variation and population structure exists across the species’ range. Furthermore, the phylogenetic relationship of F. mexicanus relative to other falcon species is contested. We utilized a genomics approach (i.e., genome sequencing and assembly followed by single nucleotide polymorphism genotyping) to rapidly address these gaps in knowledge. Results We sequenced the genome of a single female prairie falcon and generated a 1.17 Gb (gigabases) draft genome assembly. We generated maximum likelihood phylogenetic trees using complete mitochondrial genomes as well as nuclear protein-coding genes. This process provided evidence that F. mexicanus is an outgroup to the clade that includes the peregrine falcon and members of the subgenus Hierofalco. We annotated > 16,000 genes and almost 600,000 high-quality single nucleotide polymorphisms (SNPs) in the nuclear genome, providing the raw material for a SNP assay design featuring > 140 gene-associated markers and a molecular-sexing marker. We subsequently genotyped ~ 100 individuals from California (including the San Francisco East Bay Area, Pinnacles National Park and the Mojave Desert) and Idaho (Snake River Birds of Prey National Conservation Area). We tested for population structure and found evidence that individuals sampled in California and Idaho represent a single panmictic population. Conclusions Our study illustrates how genomic resources can rapidly shed light on genetic variability in understudied species and resolve phylogenetic relationships. Furthermore, we found evidence of a single, randomly mating population of prairie falcons across our sampling locations. Prairie falcons are highly mobile and relatively rare long-distance dispersal events may promote gene flow throughout the range. As such, California’s prairie falcons might be managed as a single population, indicating that management actions undertaken to benefit the species at the local level have the potential to influence the species as a whole.

  11. Genome survey of pistachio (Pistacia vera L.) by next generation sequencing: Development of novel SSR markers and genetic diversity in Pistacia species.

    PubMed

    Ziya Motalebipour, Elmira; Kafkas, Salih; Khodaeiaminjan, Mortaza; Çoban, Nergiz; Gözel, Hatice

    2016-12-07

    Pistachio (Pistacia vera L.) is one of the most important nut crops in the world. There are about 11 wild species in the genus Pistacia, and they have importance as rootstock seed sources for cultivated P. vera and forest trees. Published information on the pistachio genome is limited. Therefore, a genome survey is necessary to obtain knowledge on the genome structure of pistachio by next generation sequencing. Simple sequence repeat (SSR) markers are useful tools for germplasm characterization, genetic diversity analysis, and genetic linkage mapping, and may help to elucidate genetic relationships among pistachio cultivars and species. To explore the genome structure of pistachio, a genome survey was performed using the Illumina platform at approximately 40× coverage depth in the P. vera cv. Siirt. The K-mer analysis indicated that pistachio has a genome that is about 600 Mb in size and is highly heterozygous. The assembly of 26.77 Gb Illumina data produced 27,069 scaffolds at N50 = 3.4 kb with a total of 513.5 Mb. A total of 59,280 SSR motifs were detected, at a frequency of one SSR per 8.67 kb. A total of 206 SSRs were used to characterize 24 P. vera cultivars and 20 wild Pistacia genotypes (four genotypes from each of five wild Pistacia species: P. atlantica, P. integerrima, P. chinensis, P. terebinthus, and P. lentiscus). Overall, 135 SSR loci amplified in all 44 cultivars and genotypes, and 41 were polymorphic in six Pistacia species. The novel SSR loci developed from cultivated pistachio were highly transferable to wild Pistacia species. The results from a genome survey of pistachio suggest that the genome size of pistachio is about 600 Mb with a high heterozygosity rate. This information will help to design whole genome sequencing strategies for pistachio. The novel polymorphic SSRs developed in this study may help germplasm characterization, genetic diversity, and genetic linkage mapping studies in the genus Pistacia.

  12. Investigation of Inversion Polymorphisms in the Human Genome Using Principal Components Analysis

    PubMed Central

    Ma, Jianzhong; Amos, Christopher I.

    2012-01-01

    Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct “populations” of inversion homozygotes of different orientations and their 1∶1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility to diseases. PMID:22808122
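
    The substructure described above (two homozygote groups plus heterozygotes lying midway between them) can be illustrated with a small local-PCA sketch. This is not the authors' implementation: the genotype matrix is made up, the first principal component is obtained by plain power iteration, and the rule that assigns each individual to the nearest of three centres (minimum, midpoint, maximum of the PC1 scores) is an assumption chosen to mirror the 1∶1 admixture argument.

```python
def first_pc_scores(genotypes, iters=100):
    """Project individuals onto the first principal component of centred genotypes."""
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    v = [1.0 / (j + 1) for j in range(m)]  # arbitrary deterministic start vector
    for _ in range(iters):
        scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]          # X v
        w = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(m)]  # X^T X v
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return [sum(xi[j] * v[j] for j in range(m)) for xi in x]

def call_inversion_genotypes(genotypes):
    """Label each individual 0, 1 or 2 by its nearest centre along PC1.

    Which homozygote receives 0 and which receives 2 is arbitrary, because
    the sign of a principal component is arbitrary.
    """
    scores = first_pc_scores(genotypes)
    lo, hi = min(scores), max(scores)
    centres = [lo, (lo + hi) / 2.0, hi]
    return [min(range(3), key=lambda c: abs(s - centres[c])) for s in scores]

# Made-up unphased genotypes (rows: individuals, columns: SNPs coded 0/1/2)
# at three SNPs inside a hypothetical inversion; the middle SNP is coded
# with the opposite allele as reference.
toy = [[0, 2, 0], [1, 1, 1], [2, 0, 2], [1, 1, 1],
       [0, 2, 0], [1, 1, 1], [2, 0, 2], [1, 1, 1]]
print(call_inversion_genotypes(toy))
```

    Real data would contain noise and many more SNPs, so a proper clustering step and a check that exactly three groups are present would be needed before treating the labels as inversion genotypes.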

  13. A Polymorphic p53 Response Element in KIT Ligand Influences Cancer Risk and Has Undergone Natural Selection

    PubMed Central

    Zeron-Medina, Jorge; Wang, Xuting; Repapi, Emmanouela; Campbell, Michelle R.; Su, Dan; Castro-Giner, Francesc; Davies, Benjamin; Peterse, Elisabeth F.P.; Sacilotto, Natalia; Walker, Graeme J.; Terzian, Tamara; Tomlinson, Ian P.; Box, Neil F.; Meinshausen, Nicolai; De Val, Sarah; Bell, Douglas A.; Bond, Gareth L.

    2014-01-01

    SUMMARY The ability of p53 to regulate transcription is crucial for tumor suppression and implies that inherited polymorphisms in functional p53-binding sites could influence cancer. Here, we identify a polymorphic p53 responsive element and demonstrate its influence on cancer risk using genome-wide data sets of cancer susceptibility loci, genetic variation, p53 occupancy, and p53-binding sites. We uncover a single-nucleotide polymorphism (SNP) in a functional p53-binding site and establish its influence on the ability of p53 to bind to and regulate transcription of the KITLG gene. The SNP resides in KITLG and associates with one of the largest risks identified among cancer genome-wide association studies. We establish that the SNP has undergone positive selection throughout evolution, signifying a selective benefit, but go on to show that similar SNPs are rare in the genome due to negative selection, indicating that polymorphisms in p53-binding sites are primarily detrimental to humans. PMID:24120139

  14. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    PubMed

    Krishnan S, Gopala; Waters, Daniel L E; Henry, Robert J

    2014-01-01

    Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  15. SNiPlay: a web-based tool for detection, management and analysis of SNPs. Application to grapevine diversity projects.

    PubMed

    Dereeper, Alexis; Nicolas, Stéphane; Le Cunff, Loïc; Bacilieri, Roberto; Doligez, Agnès; Peros, Jean-Pierre; Ruiz, Manuel; This, Patrice

    2011-05-05

    High-throughput re-sequencing, new genotyping technologies and the availability of reference genomes allow the extensive characterization of Single Nucleotide Polymorphisms (SNPs) and insertion/deletion events (indels) in many plant species. The rapidly increasing amount of re-sequencing and genotyping data generated by large-scale genetic diversity projects requires the development of integrated bioinformatics tools able to efficiently manage, analyze, and combine these genetic data with genome structure and external data. In this context, we developed SNiPlay, a flexible, user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates: 1) a pipeline, freely accessible through the internet, combining existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. From standard sequence alignments, genotyping data or Sanger sequencing traces given as input, SNiPlay detects SNPs and indel events and outputs submission files for the design of Illumina's SNP chips. Subsequently, it sends sequences and genotyping data into a series of modules in charge of various processes: physical mapping to a reference genome, annotation (genomic position, intron/exon location, synonymous/non-synonymous substitutions), SNP frequency determination in user-defined groups, haplotype reconstruction and network, linkage disequilibrium evaluation, and diversity analysis (Pi, Watterson's Theta, Tajima's D). Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices. 2) a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects. It allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency), to compare SNP patterns between populations, and to export genotyping data or sequences in various formats. Our experiments on grapevine genetic projects showed that SNiPlay allows geneticists to rapidly obtain advanced results in several key research areas of plant genetic diversity. Both the management and treatment of large amounts of SNP data are rendered considerably easier for end-users through automation and integration. Current developments are taking into account new advances in high-throughput technologies. SNiPlay is available at: http://sniplay.cirad.fr/.
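
    The three diversity indices named in the abstract above (Pi, Watterson's Theta, Tajima's D) can be computed from a small alignment with the standard textbook formulas. The sketch below is an illustration under stated assumptions, not SNiPlay's code: the function name `diversity_stats` is hypothetical, sequences are assumed to be equal-length strings without gaps or missing data, and Pi is reported per alignment rather than per site.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return (pi, watterson_theta, tajimas_d) for equal-length aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Watterson's estimator: S divided by the harmonic number a1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1
    # Variance terms from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    tajimas_d = (pi - theta_w) / sqrt(variance) if variance > 0 else float("nan")
    return pi, theta_w, tajimas_d


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTA"]  # made-up data
    print(diversity_stats(toy_alignment))
```

    For the toy alignment there are two segregating sites and seven differences over six pairs, so Pi is 7/6 and Watterson's Theta is 12/11. Pi exceeds Theta, so Tajima's D is positive.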

  16. Comparative genomics of Bacillus anthracis from the wool industry highlights polymorphisms of lineage A.Br.Vollum.

    PubMed

    Derzelle, Sylviane; Aguilar-Bultet, Lisandra; Frey, Joachim

    2016-12-01

    With the advent of affordable next-generation sequencing (NGS) technologies, major progress has been made in the understanding of the population structure and evolution of the B. anthracis species. Here we report the use of whole genome sequencing and computer-based comparative analyses to characterize six strains belonging to the A.Br.Vollum lineage. These strains were isolated in Switzerland, in 1981, during iterative cases of anthrax involving workers in a textile plant processing cashmere wool from the Indian subcontinent. We took advantage of the hundreds of currently available B. anthracis genomes in public databases, to investigate the genetic diversity existing within the A.Br.Vollum lineage and to position the six Swiss isolates into the worldwide B. anthracis phylogeny. Thirty additional genomes related to the A.Br.Vollum group were identified by whole-genome single nucleotide polymorphism (SNP) analysis, including two strains forming a new evolutionary branch at the base of the A.Br.Vollum lineage. This new phylogenetic lineage (termed A.Br.H9401) splits off the branch leading to the A.Br.Vollum group soon after its divergence from the other lineages of the major A clade (i.e. 6 SNPs). The available dataset of A.Br.Vollum genomes was resolved into 2 distinct groups. Isolates from the Swiss wool processing facility clustered together with two strains from Pakistan and one strain of unknown origin isolated from yarn. They were clearly differentiated (69 SNPs) from the twenty-five other A.Br.Vollum strains located on the branch leading to the terminal reference strain A0488 of the lineage. Novel analytic assays specific to these new subgroups were developed for the purpose of rapid molecular epidemiology. Whole genome SNP surveys greatly expand upon our knowledge on the sub-structure of the A.Br.Vollum lineage. Possible origin and route of spread of this lineage worldwide are discussed. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  17. Wide Distribution of Mitochondrial Genome Rearrangements in Wild Strains of the Cultivated Basidiomycete Agrocybe aegerita

    PubMed Central

    Barroso, G.; Blesa, S.; Labarere, J.

    1995-01-01

    We used restriction fragment length polymorphisms to examine mitochondrial genome rearrangements in 36 wild strains of the cultivated basidiomycete Agrocybe aegerita, collected from widely distributed locations in Europe. We identified two polymorphic regions within the mitochondrial DNA which varied independently: one carrying the Cox II coding sequence and the other carrying the Cox I, ATP6, and ATP8 coding sequences. Two types of mutations were responsible for the restriction fragment length polymorphisms that we observed and, accordingly, were involved in the A. aegerita mitochondrial genome evolution: (i) point mutations, which resulted in strain-specific mitochondrial markers, and (ii) length mutations due to genome rearrangements, such as deletions, insertions, or duplications. Within each polymorphic region, the length differences defined only two mitochondrial types, suggesting that these length mutations were not randomly generated but resulted from a precise rearrangement mechanism. For each of the two polymorphic regions, the two molecular types were distributed among the 36 strains without obvious correlation with their geographic origin. On the basis of these two polymorphisms, it is possible to define four mitochondrial haplotypes. The four mitochondrial haplotypes could be the result of intermolecular recombination between allelic forms present in the population long enough to reach linkage equilibrium. All of the 36 dikaryotic strains contained only a single mitochondrial type, confirming the previously described mitochondrial sorting out after cytoplasmic mixing in basidiomycetes. PMID:16534984

  18. Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.

    PubMed

    Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y

    2000-01-01

    Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
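
    A computer-based SSR screen of the kind described above can be approximated with a regular expression that looks for a motif of 1 to 6 nucleotides repeated in tandem. This is a minimal sketch, not the authors' software: the function name `find_ssrs` and the `min_repeats` threshold are assumptions, since the abstract does not state the minimum tract length that was used.

```python
import re


def find_ssrs(sequence, max_motif=6, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first, so a
    run such as AAAAAA is reported with motif "A" rather than "AA" or "AAA".
    """
    pattern = re.compile(
        rf"([ACGT]{{1,{max_motif}}}?)\1{{{min_repeats - 1},}}"
    )
    tracts = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        tracts.append((match.start(), motif, repeats))
    return tracts


if __name__ == "__main__":
    # Made-up sequence with one dinucleotide and one mononucleotide tract.
    print(find_ssrs("GGCATATATATCCAAAAAAG"))
```

    Positions are 0-based. Only perfect, non-overlapping repeats are reported; imperfect tracts and the position of each tract relative to downstream ORFs, which the study also examined, are outside this sketch.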

  19. Genome-wide association study of fertility traits in dairy cattle using high-density single nucleotide polymorphism marker panels

    USDA-ARS's Scientific Manuscript database

    Unfavorable genetic correlations between production and fertility traits are well documented. Genetic selection for fertility traits is slow, however, due to low heritabilities. Identification of single nucleotide polymorphisms (SNP) involved in reproduction could improve reliability of genomic esti...

  20. Discovery, Validation and Characterization of 1039 Cattle Single Nucleotide Polymorphisms

    USDA-ARS's Scientific Manuscript database

    We identified approximately 13000 putative single nucleotide polymorphisms (SNPs) by comparison of repeat-masked BAC-end sequences from the cattle RPCI-42 BAC library with whole-genome shotgun contigs of cattle genome assembly Btau 1.0. Genotyping of a subset of these SNPs was performed on a panel ...

  1. A new single-nucleotide polymorphisms database for rainbow trout generated through whole genome resequencing of selected samples

    USDA-ARS's Scientific Manuscript database

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...

  2. GPU Accelerated Browser for Neuroimaging Genomics.

    PubMed

    Zigon, Bob; Li, Huang; Yao, Xiaohui; Fang, Shiaofen; Hasan, Mohammad Al; Yan, Jingwen; Moore, Jason H; Saykin, Andrew J; Shen, Li

    2018-04-25

    Neuroimaging genomics is an emerging field that provides exciting opportunities to understand the genetic basis of brain structure and function. The unprecedented scale and complexity of the imaging and genomics data, however, have presented critical computational bottlenecks. In this work we present our initial efforts towards building an interactive visual exploratory system for mining big data in neuroimaging genomics. A GPU accelerated browsing tool for neuroimaging genomics is created that implements the ANOVA algorithm for single nucleotide polymorphism (SNP) based analysis and the VEGAS algorithm for gene-based analysis, and executes them at interactive rates. The ANOVA algorithm is 110 times faster than the 4-core OpenMP version, while the VEGAS algorithm is 375 times faster than its 4-core OpenMP counterpart. This approach lays a solid foundation for researchers to address the challenges of mining large-scale imaging genomics datasets via interactive visual exploration.
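
    The per-SNP ANOVA mentioned in the abstract above amounts to a one-way F-test of a quantitative trait across genotype groups. The following is a plain CPU sketch of that statistic for a single SNP and a single trait, offered only to make the calculation concrete: it is not the GPU implementation described in the paper, and the function name `snp_anova_f` and the toy data are assumptions.

```python
import numpy as np


def snp_anova_f(genotypes, phenotype):
    """One-way ANOVA F statistic of a trait across genotype classes (e.g. 0/1/2)."""
    genotypes = np.asarray(genotypes)
    phenotype = np.asarray(phenotype, dtype=float)
    groups = [phenotype[genotypes == g] for g in np.unique(genotypes)]
    k, n = len(groups), phenotype.size
    if k < 2 or n <= k:
        return float("nan")  # not enough groups or residual degrees of freedom
    grand_mean = phenotype.mean()
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


if __name__ == "__main__":
    # Made-up data: six individuals, two per genotype class.
    print(snp_anova_f([0, 0, 1, 1, 2, 2], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```

    In the toy data the between-group sum of squares is 16 on 2 degrees of freedom and the within-group sum of squares is 1.5 on 3, giving F = 16. A browser of the kind described would repeat this test for every SNP and every imaging trait, which is the workload that motivates parallel hardware.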

  3. Segmental Duplications and Copy-Number Variation in the Human Genome

    PubMed Central

    Sharp, Andrew J.; Locke, Devin P.; McGrath, Sean D.; Cheng, Ze; Bailey, Jeffrey A.; Vallente, Rhea U.; Pertz, Lisa M.; Clark, Royden A.; Schwartz, Stuart; Segraves, Rick; Oseroff, Vanessa V.; Albertson, Donna G.; Pinkel, Daniel; Eichler, Evan E.

    2005-01-01

    The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders. PMID:15918152

  4. Characterization of 10 new nuclear microsatellite markers in Acca sellowiana (Myrtaceae)

    PubMed Central

    Klabunde, Gustavo H. F.; Olkoski, Denise; Vilperte, Vinicius; Zucchi, Maria I.; Nodari, Rubens O.

    2014-01-01

    • Premise of the study: Microsatellite primers were identified and characterized in Acca sellowiana in order to expand the limited number of pre-existing polymorphic markers for use in population genetic studies for conservation, phylogeography, breeding, and domestication. • Methods and Results: A total of 10 polymorphic microsatellite primers were designed from clones obtained from a simple sequence repeat (SSR)–enriched genomic library. The primers amplified di- and trinucleotide repeats with four to 27 alleles per locus. In all tested populations, the observed heterozygosity ranged from 0.269 to 1.0. • Conclusions: These new polymorphic SSR markers will allow future genetic studies to be denser, either for genetic structure characterization of natural populations or for studies involving genetic breeding and domestication process in A. sellowiana. PMID:25202632

  5. Draft genome sequence of Cicer reticulatum L., the wild progenitor of chickpea provides a resource for agronomic trait improvement.

    PubMed

    Gupta, Sonal; Nawaz, Kashif; Parween, Sabiha; Roy, Riti; Sahu, Kamlesh; Kumar Pole, Anil; Khandal, Hitaishi; Srivastava, Rishi; Kumar Parida, Swarup; Chattopadhyay, Debasis

    2017-02-01

    Cicer reticulatum L. is the wild progenitor of the fourth most important legume crop, chickpea (C. arietinum L.). We assembled short-read sequences into a 416 Mb draft genome of C. reticulatum and anchored 78% (327 Mb) of this assembly to eight linkage groups. Genome annotation predicted 25,680 protein-coding genes covering more than 90% of predicted gene space. The genome assembly shared a substantial synteny and conservation of gene orders with the genome of the model legume Medicago truncatula. Resistance gene homologs of wild and domesticated chickpeas showed high sequence homology and conserved synteny. Comparison of gene sequences and nucleotide diversity using 66 wild and domesticated chickpea accessions suggested that the desi type chickpea was genetically closer to the wild species than the kabuli type. Comparative analyses predicted gene flow between the wild and the cultivated species during domestication. Molecular diversity and population genetic structure determination using 15,096 genome-wide single nucleotide polymorphisms revealed an admixed domestication pattern among cultivated (desi and kabuli) and wild chickpea accessions belonging to three population groups reflecting significant influence of parentage or geographical origin for their cultivar-specific population classification. The assembly and the polymorphic sequence resources presented here would facilitate the study of chickpea domestication and targeted use of wild Cicer germplasms for agronomic trait improvement in chickpea. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  6. NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data.

    PubMed

    Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

    2016-01-01

    The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data.

  7. Development and validation of polymorphic microsatellite loci for the NA2 lineage of Phytophthora ramorum from whole genome sequence data

    USDA-ARS's Scientific Manuscript database

    Phytophthora ramorum is the causal agent of sudden oak death and sudden larch death, and is also responsible for causing ramorum blight on woody ornamental plants. Many microsatellite markers are available to characterize the genetic diversity and population structure of P. ramorum. However, only tw...

  8. Sequencing and annotation of the chloroplast DNAs and identification of polymorphisms distinguishing normal male-fertile and male-sterile cytoplasms of onion.

    PubMed

    von Kohn, Christopher; Kiełkowska, Agnieszka; Havey, Michael J

    2013-12-01

    Male-sterile (S) cytoplasm of onion is an alien cytoplasm introgressed into onion in antiquity and is widely used for hybrid seed production. Owing to the biennial generation time of onion, classical crossing takes at least 4 years to classify cytoplasms as S or normal (N) male-fertile. Molecular markers in the organellar DNAs that distinguish N and S cytoplasms are useful to reduce the time required to classify onion cytoplasms. In this research, we completed next-generation sequencing of the chloroplast DNAs of N- and S-cytoplasmic onions; we assembled and annotated the genomes in addition to identifying polymorphisms that distinguish these cytoplasms. The sizes (153 538 and 153 355 base pairs) and GC contents (36.8%) were very similar for the chloroplast DNAs of N and S cytoplasms, respectively, as expected given their close phylogenetic relationship. The size difference was primarily due to small indels in intergenic regions and a deletion in the accD gene of N-cytoplasmic onion. The structures of the onion chloroplast DNAs were similar to those of most land plants with large and small single copy regions separated by inverted repeats. Twenty-eight single nucleotide polymorphisms, two polymorphic restriction-enzyme sites, and one indel distributed across 20 chloroplast genes in the large and small single copy regions were selected and validated using diverse onion populations previously classified as N or S cytoplasmic using restriction fragment length polymorphisms. Although cytoplasmic male sterility is likely associated with the mitochondrial DNA, maternal transmission of the mitochondrial and chloroplast DNAs allows for polymorphisms in either genome to be useful for classifying onion cytoplasms to aid the development of hybrid onion cultivars.

  9. [Alzheimer's disease and methylenetetrahydrofolate reductase gene polymorphisms: a potential nutrigenomic approach for Mexico].

    PubMed

    Castillo-Quan, Jorge I; Pérez-Osorio, Julia M

    2009-01-01

    The establishment of medical genomics in Mexico offers the possibility to study in a more comprehensive manner the etiological factors of different diseases, providing a global view of the interaction between the genome and the environment. Nutrition is recognized as a significant determinant in several diseases, yet its interaction with polymorphisms, and in general with the genome, has not been properly addressed. Mexico has a high prevalence of polymorphisms of the methylenetetrahydrofolate reductase gene, and in both clinical and basic studies this has been associated with an increased susceptibility of developing Alzheimer's disease. We propose a potential nutrigenomic approach for the study of Alzheimer's disease in Mexico.

  10. Genome-wide single nucleotide polymorphism scan suggests adaptation to urbanization in an important pollinator, the red-tailed bumblebee (Bombus lapidarius L.).

    PubMed

    Theodorou, Panagiotis; Radzevičiūtė, Rita; Kahnt, Belinda; Soro, Antonella; Grosse, Ivo; Paxton, Robert J

    2018-04-25

    Urbanization is considered a global threat to biodiversity; the growth of cities results in an increase in impervious surfaces, soil and air pollution, fragmentation of natural vegetation and invasion of non-native species, along with numerous environmental changes, including the heat island phenomenon. The combination of these effects constitutes a challenge for both the survival and persistence of many native species, while also imposing altered selective regimes. Here, using 110 314 single nucleotide polymorphisms generated by restriction-site-associated DNA sequencing, we investigated the genome-wide effects of urbanization on putative neutral and adaptive genomic diversity in a major insect pollinator, Bombus lapidarius, collected from nine German cities and nine paired rural sites. Overall, genetic differentiation among sites was low and there was no obvious genome-wide genetic structuring, suggesting the absence of strong effects of urbanization on gene flow. We nevertheless identified several loci under directional selection, a subset of which was associated with urban land use, including the percentage of impervious surface surrounding each sampling site. Overall, our results provide evidence of local adaptation to urbanization in the face of gene flow in a highly mobile insect pollinator. © 2018 The Author(s).

  11. The diploid genome sequence of an Asian individual

    PubMed Central

    Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

    2009-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735

  12. Genic rather than genome-wide differences between sexually deceptive Ophrys orchids with different pollinators.

    PubMed

    Sedeek, Khalid E M; Scopece, Giovanni; Staedler, Yannick M; Schönenberger, Jürg; Cozzolino, Salvatore; Schiestl, Florian P; Schlüter, Philipp M

    2014-12-01

    High pollinator specificity and the potential for simple genetic changes to affect pollinator attraction make sexually deceptive orchids an ideal system for the study of ecological speciation, in which change of flower odour is likely important. This study surveys reproductive barriers and differences in floral phenotypes in a group of four closely related, coflowering sympatric Ophrys species and uses a genotyping-by-sequencing (GBS) approach to obtain information on the proportion of the genome that is differentiated between species. Ophrys species were found to effectively lack postpollination barriers, but are strongly isolated by their different pollinators (floral isolation) and, to a smaller extent, by shifts in flowering time (temporal isolation). Although flower morphology and perhaps labellum coloration may contribute to floral isolation, reproductive barriers may largely be due to differences in flower odour chemistry. GBS revealed shared polymorphism throughout the Ophrys genome, with very little population structure between species. Genome scans for FST outliers identified few markers that are highly differentiated between species and repeatable in several populations. These genome scans also revealed highly differentiated polymorphisms in genes with putative involvement in floral odour production, including a previously identified candidate gene thought to be involved in the biosynthesis of pseudo-pheromones by the orchid flowers. Taken together, these data suggest that ecological speciation associated with different pollinators in sexually deceptive orchids has a genic rather than a genomic basis, placing these species at an early phase of genomic divergence within the 'speciation continuum'. © 2014 The Authors. Molecular Ecology published by John Wiley & Sons Ltd.

  13. Splicing-Related Features of Introns Serve to Propel Evolution

    PubMed Central

    Luo, Yuping; Li, Chun; Gong, Xi; Wang, Yanlu; Zhang, Kunshan; Cui, Yaru; Sun, Yi Eve; Li, Siguang

    2013-01-01

    The role that spliceosomal intronic structures have played in evolution has only begun to be elucidated. Comparative genomic analyses of fungal snoRNA sequences, which are often contained within introns and/or exons, revealed that about one-third of snoRNA-associated introns in three major snoRNA gene clusters manifested polymorphisms, likely resulting from intron loss and gain events during fungal evolution. Genomic deletions can clearly be observed as one mechanism underlying intron and exon loss, as well as generation of complex introns where several introns lie in juxtaposition without intercalating exons. Strikingly, by tracking conserved snoRNAs in introns, we found that some introns had moved from one position to another by excision from donor sites and insertion into target sites elsewhere in the genome without needing transposon structures. This study revealed the origin of many newly gained introns. Moreover, our analyses suggested that intron-containing sequences were more prone to sustainable structural changes than DNA sequences without introns due to introns' ability to jump within the genome via unknown mechanisms. We propose that splicing-related structural features of introns serve as an additional motor to propel evolution. PMID:23516505

  14. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

    USDA-ARS's Scientific Manuscript database

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  15. A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids

    USDA-ARS's Scientific Manuscript database

    Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and ...

  16. Single nucleotide polymorphisms generated by genotyping by sequencing to characterize genome-wide diversity, linkage disequilibrium, and selective sweeps in cultivated watermelon

    USDA-ARS's Scientific Manuscript database

    Large datasets containing single nucleotide polymorphisms (SNPs) are used to analyze genome-wide diversity in a robust collection of cultivars from representative accessions, across the world. The extent of linkage disequilibrium (LD) within a population determines the number of markers required fo...

  17. Genomic diversity of the human intestinal parasite Entamoeba histolytica

    PubMed Central

    2012-01-01

    Background Entamoeba histolytica is a significant cause of disease worldwide. However, little is known about the genetic diversity of the parasite. We re-sequenced the genomes of ten laboratory cultured lines of the eukaryotic pathogen Entamoeba histolytica in order to develop a picture of genetic diversity across the genome. Results The extreme nucleotide composition bias and repetitiveness of the E. histolytica genome provide a challenge for short-read mapping, yet we were able to define putative single nucleotide polymorphisms in a large portion of the genome. The results suggest a rather low level of single nucleotide diversity, although genes and gene families with putative roles in virulence are among the more polymorphic genes. We did observe large differences in coverage depth among genes, indicating differences in gene copy number between genomes. We found evidence indicating that recombination has occurred in the history of the sequenced genomes, suggesting that E. histolytica may reproduce sexually. Conclusions E. histolytica displays a relatively low level of nucleotide diversity across its genome. However, large differences in gene family content and gene copy number are seen among the sequenced genomes. The pattern of polymorphism indicates that E. histolytica reproduces sexually, or has done so in the past, which has previously been suggested but not proven. PMID:22630046

  18. Adaptive divergence in the monkey flower Mimulus guttatus is maintained by a chromosomal inversion

    PubMed Central

    Twyford, Alex D.; Friedman, Jannice

    2015-01-01

    Organisms exhibit an incredible diversity of life history strategies as adaptive responses to environmental variation. The establishment of novel life history strategies involves multilocus polymorphisms, which will be challenging to establish in the face of gene flow and recombination. Theory predicts that adaptive allelic combinations may be maintained and spread if they occur in genomic regions of reduced recombination, such as chromosomal inversion polymorphisms, yet empirical support for this prediction is lacking. Here, we use genomic data to investigate the evolution of divergent adaptive ecotypes of the yellow monkey flower Mimulus guttatus. We show that a large chromosomal inversion polymorphism is the major region of divergence between geographically widespread annual and perennial ecotypes. In contrast, ∼40,000 single nucleotide polymorphisms in collinear regions of the genome show no signal of life history, revealing genomic patterns of diversity have been shaped by localized homogenizing gene flow and large‐scale Pleistocene range expansion. Our results provide evidence for an inversion capturing and protecting loci involved in local adaptation, while also explaining how adaptive divergence can occur with gene flow. PMID:25879251

  19. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.

  20. A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

    PubMed Central

    Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

    2006-01-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  1. Construction of a High-Density American Cranberry (Vaccinium macrocarpon Ait.) Composite Map Using Genotyping-by-Sequencing for Multi-pedigree Linkage Mapping

    PubMed Central

    Schlautman, Brandon; Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Iorizzo, Massimo; Polashock, James; Grygleski, Edward; Vorsa, Nicholi; Zalapa, Juan

    2017-01-01

    The American cranberry (Vaccinium macrocarpon Ait.) is a recently domesticated, economically important, fruit crop with limited molecular resources. New genetic resources could accelerate genetic gain in cranberry through characterization of its genomic structure and by enabling molecular-assisted breeding strategies. To increase the availability of cranberry genomic resources, genotyping-by-sequencing (GBS) was used to discover and genotype thousands of single nucleotide polymorphisms (SNPs) within three interrelated cranberry full-sib populations. Additional simple sequence repeat (SSR) loci were added to the SNP datasets and used to construct bin maps for the parents of the populations, which were then merged to create the first high-density cranberry composite map containing 6073 markers (5437 SNPs and 636 SSRs) on 12 linkage groups (LGs) spanning 1124 cM. Interestingly, higher rates of recombination were observed in maternal than paternal gametes. The large number of markers in common (mean of 57.3) and the high degree of observed collinearity (mean Pair-wise Spearman rank correlations >0.99) between the LGs of the parental maps demonstrates the utility of GBS in cranberry for identifying polymorphic SNP loci that are transferable between pedigrees and populations in future trait-association studies. Furthermore, the high-density of markers anchored within the component maps allowed identification of segregation distortion regions, placement of centromeres on each of the 12 LGs, and anchoring of genomic scaffolds. Collectively, the results represent an important contribution to the current understanding of cranberry genomic structure and to the availability of molecular tools for future genetic research and breeding efforts in cranberry. PMID:28250016

  2. Mosaic genome structure of the barley powdery mildew pathogen and conservation of transcriptional programs in divergent hosts

    PubMed Central

    Hacquard, Stéphane; Kracher, Barbara; Maekawa, Takaki; Vernaldi, Saskia; Schulze-Lefert, Paul; Ver Loren van Themaat, Emiel

    2013-01-01

    Barley powdery mildew, Blumeria graminis f. sp. hordei (Bgh), is an obligate biotrophic ascomycete fungal pathogen that can grow and reproduce only on living cells of wild or domesticated barley (Hordeum sp.). Domestication and deployment of resistant barley cultivars by humans selected for amplification of Bgh isolates with different virulence combinations. We sequenced the genomes of two European Bgh isolates, A6 and K1, for comparative analysis with the reference genome of isolate DH14. This revealed a mosaic genome structure consisting of large isolate-specific DNA blocks with either high or low SNP densities. Some of the highly polymorphic blocks likely accumulated SNPs for over 10,000 years, well before the domestication of barley. These isolate-specific blocks of alternating monomorphic and polymorphic regions imply an exceptionally large standing genetic variation in the Bgh population and might be generated and maintained by rare outbreeding and frequent clonal reproduction. RNA-sequencing experiments with isolates A6 and K1 during four early stages of compatible and incompatible interactions on leaves of partially immunocompromised Arabidopsis mutants revealed a conserved Bgh transcriptional program during pathogenesis compared with the natural host barley despite ∼200 million years of reproductive isolation of these hosts. Transcripts encoding candidate-secreted effector proteins are massively induced in successive waves. A specific decrease in candidate-secreted effector protein transcript abundance in the incompatible interaction follows extensive transcriptional reprogramming of the host transcriptome and coincides with the onset of localized host cell death, suggesting a host-inducible defense mechanism that targets fungal effector secretion or production. PMID:23696672

  3. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium.

    PubMed

    Pajuelo, Mónica J; Eguiluz, María; Dahlstrom, Eric; Requena, David; Guzmán, Frank; Ramirez, Manuel; Sheen, Patricia; Frace, Michael; Sammons, Scott; Cama, Vitaliano; Anzick, Sarah; Bruno, Dan; Mahanty, Siddhartha; Wilkins, Patricia; Nash, Theodore; Gonzalez, Armando; García, Héctor H; Gilman, Robert H; Porcella, Steve; Zimic, Mirko

    2015-12-01

    Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and transmission pathways in endemic areas thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen. For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS) and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples. The predicted size of the hybrid (proglottid genome combined with cyst genome) T. solium genome was 111 MB with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt) were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites. The availability of draft genomes for T. solium represents a significant step towards understanding the biology of the parasite. We report here a set of T. solium polymorphic microsatellite markers that appear promising for genetic epidemiology studies.

  4. The Mouse Genomes Project: a repository of inbred laboratory mouse strain genomes.

    PubMed

    Adams, David J; Doran, Anthony G; Lilue, Jingtao; Keane, Thomas M

    2015-10-01

    The Mouse Genomes Project was initiated in 2009 with the goal of using next-generation sequencing technologies to catalogue molecular variation in the common laboratory mouse strains, and a selected set of wild-derived inbred strains. The initial sequencing and survey of sequence variation in 17 inbred strains was completed in 2011 and included a comprehensive catalogue of single nucleotide polymorphisms, short insertion/deletions, larger structural variants including their fine scale architecture and landscape of transposable element variation, and genomic sites subject to post-transcriptional alteration of RNA. From this beginning, the resource has expanded significantly to include 36 fully sequenced inbred laboratory mouse strains, a refined and updated data processing pipeline, and new variation querying and data visualisation tools which are available on the project's website (http://www.sanger.ac.uk/resources/mouse/genomes/). The focus of the project is now the completion of de novo assembled chromosome sequences and strain-specific gene structures for the core strains. We discuss how the assembled chromosomes will power comparative analysis, data access tools and future directions of mouse genetics.

  5. Inverse Correlation of Population Similarity and Introduction Date for Invasive Ascidians

    PubMed Central

    Silva, Nathan; Smith, William C.

    2008-01-01

    The genomes of many marine invertebrates, including the purple sea urchin and the solitary ascidians Ciona intestinalis and Ciona savignyi, show exceptionally high levels of heterozygosity, implying that these populations are highly polymorphic. Analysis of the C. savignyi genome found little evidence to support an elevated mutation rate, but rather points to a large population size contributing to the polymorphism level. In the present study, the relative genetic polymorphism levels in sampled populations of ten different ascidian species were determined using a similarity index generated by AFLP analysis. The goal was to determine the range of polymorphism within the populations of different species, and to uncover factors that may contribute to the high level of polymorphism. We observe that, surprisingly, the levels of polymorphism within these species show a negative correlation with the reported age of invasive populations, and that closely related species show substantially different levels of genetic polymorphism. These findings show exceptions to the assumptions that invasive species start with a low level of genetic polymorphism that increases over time and that closely related species have similar levels of genetic polymorphism. PMID:18575620

  6. Isolation and characterization of polymorphic microsatellite loci in Spondias radlkoferi (Anacardiaceae)

    PubMed Central

    Aguilar-Barajas, Esther; Sork, Victoria L.; González-Zamora, Arturo; Rocha-Ramírez, Víctor; Arroyo-Rodríguez, Víctor; Oyama, Ken

    2014-01-01

    • Premise of the study: Microsatellite markers were developed for Spondias radlkoferi to assess the impact of primate seed dispersal on the genetic diversity and structure of this important tree species of Anacardiaceae. • Methods and Results: Fourteen polymorphic loci were isolated from S. radlkoferi through 454 GS-FLX Titanium pyrosequencing of genomic DNA. The number of alleles ranged from three to 12. The observed and expected heterozygosities ranged from 0.382 to 1.00 and from 0.353 to 0.733, respectively. The amplification was also successful in S. mombin and two genera of Anacardiaceae: Rhus aromatica and Toxicodendron radicans. • Conclusions: These microsatellite loci will be useful to assess the genetic diversity and population structure of S. radlkoferi and related species, and will allow us to investigate the effects of seed dispersal by spider monkeys (Ateles geoffroyi) on the genetic structure and diversity of S. radlkoferi populations in a fragmented rainforest. PMID:25383270

  7. Drivers of genetic diversity in secondary metabolic gene clusters within a fungal species

    PubMed Central

    Lind, Abigail L.; Wisecaver, Jennifer H.; Lameiras, Catarina; Wiemann, Philipp; Palmer, Jonathan M.; Keller, Nancy P.; Rodrigues, Fernando; Goldman, Gustavo H.

    2017-01-01

    Filamentous fungi produce a diverse array of secondary metabolites (SMs) critical for defense, virulence, and communication. The metabolic pathways that produce SMs are found in contiguous gene clusters in fungal genomes, an atypical arrangement for metabolic pathways in other eukaryotes. Comparative studies of filamentous fungal species have shown that SM gene clusters are often either highly divergent or uniquely present in one or a handful of species, hampering efforts to determine the genetic basis and evolutionary drivers of SM gene cluster divergence. Here, we examined SM variation in 66 cosmopolitan strains of a single species, the opportunistic human pathogen Aspergillus fumigatus. Investigation of genome-wide within-species variation revealed 5 general types of variation in SM gene clusters: nonfunctional gene polymorphisms; gene gain and loss polymorphisms; whole cluster gain and loss polymorphisms; allelic polymorphisms, in which different alleles corresponded to distinct, nonhomologous clusters; and location polymorphisms, in which a cluster was found to differ in its genomic location across strains. These polymorphisms affect the function of representative A. fumigatus SM gene clusters, such as those involved in the production of gliotoxin, fumigaclavine, and helvolic acid as well as the function of clusters with undefined products. In addition to enabling the identification of polymorphisms, the detection of which requires extensive genome-wide synteny conservation (e.g., mobile gene clusters and nonhomologous cluster alleles), our approach also implicated multiple underlying genetic drivers, including point mutations, recombination, and genomic deletion and insertion events as well as horizontal gene transfer from distant fungi. Finally, most of the variants that we uncover within A. fumigatus have been previously hypothesized to contribute to SM gene cluster diversity across entire fungal classes and phyla. We suggest that the drivers of genetic diversity operating within a fungal species shown here are sufficient to explain SM cluster macroevolutionary patterns. PMID:29149178

  8. Evolutionary genomics of miniature inverted-repeat transposable elements (MITEs) in Brassica.

    PubMed

    Nouroz, Faisal; Noreen, Shumaila; Heslop-Harrison, J S

    2015-12-01

    Miniature inverted-repeat transposable elements (MITEs) are truncated derivatives of autonomous DNA transposons, and are dispersed abundantly in most eukaryotic genomes. We aimed to characterize various MITE families in Brassica in terms of their presence, sequence characteristics and evolutionary activity. Dot plot analyses involving comparison of homoeologous bacterial artificial chromosome (BAC) sequences allowed identification of 15 novel families of mobile MITEs. Of these, 5 were Stowaway-like with TA Target Site Duplications (TSDs), 4 Tourist-like with TAA/TTA TSDs, 5 Mutator-like with 9-10 bp TSDs and 1 novel MITE (BoXMITE1) flanked by 3 bp TSDs. Our data suggested that there are about 30,000 MITE-related sequences in Brassica rapa and B. oleracea genomes. In situ hybridization showed one abundant family was dispersed in the A-genome, while another was located near 45S rDNA sites. PCR analysis using primers flanking sequences of MITE elements detected MITE insertion polymorphisms between and within the three Brassica (AA, BB, CC) genomes, with many insertions being specific to single genomes and others showing evidence of more recent evolutionary insertions. Our BAC sequence comparison strategy enables identification of evolutionarily active MITEs with no prior knowledge of MITE sequences. The details of MITE families reported in Brassica enable their identification, characterization and annotation. Insertion polymorphisms of MITEs and their transposition activity indicate an important mechanism of genome evolution and diversification. MITE families derived from known Mariner, Harbinger and Mutator DNA transposons were discovered, as well as some novel structures. The identification of Brassica MITEs will have broad applications in Brassica genomics, breeding, hybridization and phylogeny through their use as DNA markers.

  9. Whole-genome single-nucleotide polymorphism (SNP) marker discovery and association analysis with the eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content in Larimichthys crocea

    PubMed Central

    Xiao, Shijun; Wang, Panpan; Dong, Linsong; Zhang, Yaguang; Han, Zhaofang; Wang, Qiurong

    2016-01-01

    Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-by-sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers cost-efficiently in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish. PMID:28028455

  10. A genomic approach for isolating chloroplast microsatellite markers for Pachyptera kerere (Bignoniaceae)

    PubMed Central

    Francisco, Jessica N. C.; Nazareno, Alison G.; Lohmann, Lúcia G.

    2016-01-01

    Premise of the study: In this study, we developed chloroplast microsatellite markers (cpSSRs) for Pachyptera kerere (Bignoniaceae) to investigate the population structure and genetic diversity of this species. Methods and Results: We used Illumina HiSeq data to reconstruct the chloroplast genome of P. kerere by a combination of de novo and reference-guided assembly. We then used the chloroplast genome to develop a set of cpSSRs from intergenic regions. Overall, 24 primer pairs were designed, 21 of which amplified successfully and were polymorphic, presenting three to nine alleles per locus. The unbiased haploid diversity per locus varied from 0.207 (Pac28) to 0.817 (Pac04). All but one locus amplified for all other taxa of Pachyptera. Conclusions: The markers reported here will serve as a basis for studies to assess the genetic structure and phylogeographic history of Pachyptera. PMID:27672522

  11. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species

    PubMed Central

    Wang, Jing; Street, Nathaniel R.; Scofield, Douglas G.; Ingvarsson, Pär K.

    2016-01-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. PMID:26721855

  12. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    PubMed

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  13. Genetic Diversity and Population Structure of F3:6 Nebraska Winter Wheat Genotypes Using Genotyping-By-Sequencing.

    PubMed

    Eltaher, Shamseldeen; Sallam, Ahmed; Belamkar, Vikas; Emara, Hamdy A; Nower, Ahmed A; Salem, Khaled F M; Poland, Jesse; Baenziger, Peter S

    2018-01-01

    The availability of information on the genetic diversity and population structure in wheat (Triticum aestivum L.) breeding lines will help wheat breeders to better use their genetic resources and manage genetic variation in their breeding program. The recent advances in sequencing technology provide the opportunity to identify tens or hundreds of thousands of single nucleotide polymorphism (SNPs) in large genome species (e.g., wheat). These SNPs can be utilized for understanding genetic diversity and performing genome wide association studies (GWAS) for complex traits. In this study, the genetic diversity and population structure were investigated in a set of 230 genotypes (F3:6) derived from various crosses as a prerequisite for GWAS and genomic selection. Genotyping-by-sequencing provided 25,566 high-quality SNPs. The polymorphism information content (PIC) across chromosomes ranged from 0.09 to 0.37 with an average of 0.23. The distribution of SNPs markers on the 21 chromosomes ranged from 319 on chromosome 3D to 2,370 on chromosome 3B. The analysis of population structure revealed three subpopulations (G1, G2, and G3). Analysis of molecular variance identified 8% variance among and 92% within subpopulations. Of the three subpopulations, G2 had the highest level of genetic diversity based on three genetic diversity indices: Shannon's information index (I) = 0.494, diversity index (h) = 0.328 and unbiased diversity index (uh) = 0.331, while G3 had lowest level of genetic diversity (I = 0.348, h = 0.226 and uh = 0.236). This high genetic diversity identified among the subpopulations can be used to develop new wheat cultivars.

  14. Genetic Diversity and Population Structure of F3:6 Nebraska Winter Wheat Genotypes Using Genotyping-By-Sequencing

    PubMed Central

    Eltaher, Shamseldeen; Sallam, Ahmed; Belamkar, Vikas; Emara, Hamdy A.; Nower, Ahmed A.; Salem, Khaled F. M.; Poland, Jesse; Baenziger, Peter S.

    2018-01-01

    The availability of information on the genetic diversity and population structure in wheat (Triticum aestivum L.) breeding lines will help wheat breeders to better use their genetic resources and manage genetic variation in their breeding program. The recent advances in sequencing technology provide the opportunity to identify tens or hundreds of thousands of single nucleotide polymorphism (SNPs) in large genome species (e.g., wheat). These SNPs can be utilized for understanding genetic diversity and performing genome wide association studies (GWAS) for complex traits. In this study, the genetic diversity and population structure were investigated in a set of 230 genotypes (F3:6) derived from various crosses as a prerequisite for GWAS and genomic selection. Genotyping-by-sequencing provided 25,566 high-quality SNPs. The polymorphism information content (PIC) across chromosomes ranged from 0.09 to 0.37 with an average of 0.23. The distribution of SNPs markers on the 21 chromosomes ranged from 319 on chromosome 3D to 2,370 on chromosome 3B. The analysis of population structure revealed three subpopulations (G1, G2, and G3). Analysis of molecular variance identified 8% variance among and 92% within subpopulations. Of the three subpopulations, G2 had the highest level of genetic diversity based on three genetic diversity indices: Shannon’s information index (I) = 0.494, diversity index (h) = 0.328 and unbiased diversity index (uh) = 0.331, while G3 had lowest level of genetic diversity (I = 0.348, h = 0.226 and uh = 0.236). This high genetic diversity identified among the subpopulations can be used to develop new wheat cultivars. PMID:29593779
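
    The three diversity indices named in this abstract (I, h and uh) can be illustrated with a small worked example. The abstract does not give its formulas, so the sketch below assumes the per-locus definitions commonly used for such indices: h = 1 - sum(p_i^2), uh = (n / (n - 1)) * h, and I = -sum(p_i * ln p_i), where p_i are allele frequencies and n is the number of samples. The function name, the allele frequencies and the sample size are hypothetical and are not taken from the study.

```python
import math


def diversity_indices(allele_freqs, n):
    """Per-locus diversity indices under the assumed definitions.

    allele_freqs: allele frequencies at one locus (should sum to 1).
    n: number of samples scored at the locus (hypothetical value).
    Returns (I, h, uh).
    """
    # Diversity index: probability that two randomly drawn alleles differ.
    h = 1.0 - sum(p * p for p in allele_freqs)
    # Unbiased diversity: small-sample correction of h.
    uh = (n / (n - 1)) * h
    # Shannon's information index, using the natural logarithm.
    shannon = -sum(p * math.log(p) for p in allele_freqs if p > 0)
    return shannon, h, uh


# Hypothetical biallelic SNP with allele frequencies 0.8 / 0.2 in 50 samples.
I, h, uh = diversity_indices([0.8, 0.2], n=50)
print(f"I = {I:.3f}, h = {h:.3f}, uh = {uh:.3f}")
```

    Values reported for a subpopulation, such as those in the abstract, would typically be averages of these per-locus quantities over all SNPs; that averaging step is likewise an assumption here.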

  15. Transcriptionally active LTR retrotransposons in Eucalyptus genus are differentially expressed and insertionally polymorphic.

    PubMed

    Marcon, Helena Sanches; Domingues, Douglas Silva; Silva, Juliana Costa; Borges, Rafael Junqueira; Matioli, Fábio Filippi; Fontes, Marcos Roberto de Mattos; Marino, Celso Luis

    2015-08-14

    In the Eucalyptus genus, studies on genome composition and transposable elements (TEs) are particularly scarce. Nearly half of the recently released Eucalyptus grandis genome is composed of retrotransposons, and these data provide an important opportunity to understand TE dynamics in the Eucalyptus genome and transcriptome. We characterized nine families of transcriptionally active LTR retrotransposons from the Copia and Gypsy superfamilies in the Eucalyptus grandis genome and we depicted genomic distribution and copy number in two Eucalyptus species. We also evaluated genomic polymorphism and transcriptional profile in three organs of five Eucalyptus species. We observed contrasting genomic and transcriptional behavior in the same family among different species. RLC_egMax_1 was the most prevalent family and RLC_egAngela_1 was the family with the lowest copy number. Most families of both superfamilies have insertions that occurred <3 million years ago, except one Copia family, RLC_egBianca_1. Protein theoretical models suggest different properties between Copia and Gypsy domains. IRAP and REMAP markers suggested genomic polymorphisms among Eucalyptus species. Using EST analysis and qRT-PCRs, we observed transcriptional activity in several tissues and in all evaluated species. In some families, osmotic stress increases transcript values. Our strategy was successful in isolating transcriptionally active retrotransposons in Eucalyptus, and each family has a particular genomic and transcriptional pattern. Overall, our results show that retrotransposon activity has differentially affected genome and transcriptome among Eucalyptus species.

  16. Design of a 9K Illumina BeadChip for polar bears (Ursus maritimus) from RAD and transcriptome sequencing.

    PubMed

    Malenfant, René M; Coltman, David W; Davis, Corey S

    2015-05-01

    Single-nucleotide polymorphisms (SNPs) offer numerous advantages over anonymous markers such as microsatellites, including improved estimation of population parameters, finer-scale resolution of population structure and more precise genomic dissection of quantitative traits. However, many SNPs are needed to equal the resolution of a single microsatellite, and reliable large-scale genotyping of SNPs remains a challenge in nonmodel species. Here, we document the creation of a 9K Illumina Infinium BeadChip for polar bears (Ursus maritimus), which will be used to investigate: (i) the fine-scale population structure among Canadian polar bears and (ii) the genomic architecture of phenotypic traits in the Western Hudson Bay subpopulation. To this end, we used restriction-site associated DNA (RAD) sequencing from 38 bears across their circumpolar range, as well as blood/fat transcriptome sequencing of 10 individuals from Western Hudson Bay. Six-thousand RAD SNPs and 3000 transcriptomic SNPs were selected for the chip, based primarily on genomic spacing and gene function respectively. Of the 9000 SNPs ordered from Illumina, 8042 were successfully printed, and - after genotyping 1450 polar bears - 5441 of these SNPs were found to be well clustered and polymorphic. Using this array, we show rapid linkage disequilibrium decay among polar bears, we demonstrate that in a subsample of 78 individuals, our SNPs detect known genetic structure more clearly than 24 microsatellites genotyped for the same individuals and that these results are not driven by the SNP ascertainment scheme. Here, we present one of the first large-scale genotyping resources designed for a threatened species. © 2014 John Wiley & Sons Ltd.

  17. Complex Population Structure and Virulence Differences among Serotype 2 Streptococcus suis Strains Belonging to Sequence Type 28

    PubMed Central

    Athey, Taryn B. T.; Auger, Jean-Philippe; Teatero, Sarah; Dumesnil, Audrey; Takamatsu, Daisuke; Wasserscheid, Jessica; Dewar, Ken; Gottschalk, Marcelo; Fittipaldi, Nahuel

    2015-01-01

    Streptococcus suis is a major swine pathogen and a zoonotic agent. Serotype 2 strains are the most frequently associated with disease. However, not all serotype 2 lineages are considered virulent. Indeed, sequence type (ST) 28 serotype 2 S. suis strains have been described as a homogeneous group of low virulence. However, ST28 strains are often isolated from diseased swine in some countries, and at least four human ST28 cases have been reported. Here, we used whole-genome sequencing and animal infection models to test the hypothesis that the ST28 lineage comprises strains of different genetic backgrounds and different virulence. We used 50 S. suis ST28 strains isolated in Canada, the United States and Japan from diseased pigs, and one ST28 strain from a human case isolated in Thailand. We report a complex population structure among the 51 ST28 strains. Diversity resulted from variable gene content, recombination events and numerous genome-wide polymorphisms not attributable to recombination. Phylogenetic analysis using core genome single-nucleotide polymorphisms revealed four discrete clades with strong geographic structure, and a fifth clade formed by US, Thai and Japanese strains. When tested in experimental animal models, strains from this latter clade were significantly more virulent than a Canadian ST28 reference strain, and a closely related Canadian strain. Our results highlight the limitations of MLST for both phylogenetic analysis and virulence prediction and raise concerns about the possible emergence of ST28 strains in human clinical cases. PMID:26375680

  18. Molecular Identification of Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) Markers.

    PubMed

    Al-Khalifah, Nasser S; Shanavaskhan, A E

    2017-01-01

    Ambiguity in the total number of date palm cultivars across the world is pointing toward the necessity for an enumerative study using standard morphological and molecular markers. Among molecular markers, DNA markers are more suitable and ubiquitous to most applications. They are highly polymorphic in nature, frequently occurring in genomes, easy to access, and highly reproducible. Various molecular markers such as restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), simple sequence repeats (SSR), inter-simple sequence repeats (ISSR), and random amplified polymorphic DNA (RAPD) markers have been successfully used as efficient tools for analysis of genetic variation in date palm. This chapter explains a stepwise protocol for extracting total genomic DNA from date palm leaves. A user-friendly protocol for RAPD analysis and a table showing the primers used in different molecular techniques that produce polymorphisms in date palm are also provided.

  19. [Polymorphic loci and polymorphism analysis of short tandem repeats within XNP gene].

    PubMed

    Liu, Qi-Ji; Gong, Yao-Qin; Guo, Chen-Hong; Chen, Bing-Xi; Li, Jiang-Xia; Guo, Yi-Shou

    2002-01-01

    To select polymorphic short tandem repeat markers within the X-linked nuclear protein (XNP) gene, genomic clones containing the XNP gene were identified by homology analysis with XNP cDNA. By comparing the cDNA with genomic DNA, non-exonic sequences were identified, and short tandem repeats were selected from non-exonic sequences by using BCM Search Launcher. Polymorphisms of the short tandem repeats in a Chinese population were evaluated by PCR amplification and PAGE. Five short tandem repeats were identified in the XNP gene, two of which were polymorphic. Four and 11 alleles were observed in the Chinese population for XNPSTR1 and XNPSTR4, respectively. Heterozygosities were 47% for XNPSTR1 and 70% for XNPSTR4. XNPSTR1 and XNPSTR4 are located within the 3' end and intron 10, respectively. Two polymorphic short tandem repeats have been identified within the XNP gene and will be useful for linkage analysis and gene diagnosis of the XNP gene.

  20. Adaptive potential of genomic structural variation in human and mammalian evolution.

    PubMed

    Radke, David W; Lee, Charles

    2015-09-01

    Because phenotypic innovations must be genetically heritable for biological evolution to proceed, it is natural to consider new mutation events as well as standing genetic variation as sources for their birth. Previous research has identified a number of single-nucleotide polymorphisms that underlie a subset of adaptive traits in organisms. However, another well-known class of variation, genomic structural variation, could have even greater potential to produce adaptive phenotypes, due to the variety of possible types of alterations (deletions, insertions, duplications, among others) at different genomic positions and with variable lengths. It is from these dramatic genomic alterations, and selection on their phenotypic consequences, that adaptations leading to biological diversification could be derived. In this review, using studies in humans and other mammals, we highlight examples of how phenotypic variation from structural variants might become adaptive in populations and potentially enable biological diversification. Phenotypic change arising from structural variants will be described according to their immediate effect on organismal metabolic processes, immunological response and physical features. Study of population dynamics of segregating structural variation can therefore provide a window into understanding current and historical biological diversification. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  1. Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha)

    PubMed Central

    Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E

    2014-01-01

    Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338

  2. Genetic structure of Eurasian and North American Leymus (Triticeae) wildryes assessed by chloroplast DNA sequences and AFLP profiles

    Treesearch

    C. Mae Culumber; Steve R. Larson; Kevin B. Jensen; Thomas A. Jones

    2011-01-01

    Leymus is a genomically defined allopolyploid of genus Triticeae with two distinct subgenomes. Chloroplast DNA sequences of Eurasian and North American species are distinct and polyphyletic. However, phylogenies derived from chloroplast and nuclear DNA sequences are confounded by polyploidy and lack of polymorphism among many taxa. The AFLP technique can resolve...

  3. Re-annotation, improved large-scale assembly and establishment of a catalogue of noncoding loci for the genome of the model brown alga Ectocarpus.

    PubMed

    Cormier, Alexandre; Avia, Komlan; Sterck, Lieven; Derrien, Thomas; Wucher, Valentin; Andres, Gwendoline; Monsoor, Misharl; Godfroy, Olivier; Lipinska, Agnieszka; Perrineau, Marie-Mathilde; Van De Peer, Yves; Hitte, Christophe; Corre, Erwan; Coelho, Susana M; Cock, J Mark

    2017-04-01

    The genome of the filamentous brown alga Ectocarpus was the first to be completely sequenced from within the brown algal group and has served as a key reference genome both for this lineage and for the stramenopiles. We present a complete structural and functional reannotation of the Ectocarpus genome. The large-scale assembly of the Ectocarpus genome was significantly improved and genome-wide gene re-annotation using extensive RNA-seq data improved the structure of 11 108 existing protein-coding genes and added 2030 new loci. A genome-wide analysis of splicing isoforms identified an average of 1.6 transcripts per locus. A large number of previously undescribed noncoding genes were identified and annotated, including 717 loci that produce long noncoding RNAs. Conservation of lncRNAs between Ectocarpus and another brown alga, the kelp Saccharina japonica, suggests that at least a proportion of these loci serve a function. Finally, a large collection of single nucleotide polymorphism-based markers was developed for genetic analyses. These resources are available through an updated and improved genome database. This study significantly improves the utility of the Ectocarpus genome as a high-quality reference for the study of many important aspects of brown algal biology and as a reference for genomic analyses across the stramenopiles. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  4. Parasitism and the retrotransposon life cycle in plants: a hitchhiker's guide to the genome.

    PubMed

    Sabot, F; Schulman, A H

    2006-12-01

    LTR (long terminal repeat) retrotransposons are the main components of higher plant genomic DNA. They have shaped their host genomes through insertional mutagenesis and by effects on genome size, gene expression and recombination. These Class I transposable elements are closely related to retroviruses such as HIV in their structure and presumptive life cycle. However, the retrotransposon life cycle has been closely investigated in few systems. For retroviruses and retrotransposons, individual defective copies can parasitize the activity of functional ones. However, some LTR retrotransposon groups as a whole, such as large retrotransposon derivatives and terminal repeats in miniature, are non-autonomous even though their genomic insertion patterns remain polymorphic between organismal accessions. Here, we examine what is known of the retrotransposon life cycle in plants, and in that context discuss the role of parasitism and complementation between and within retrotransposon groups.

  5. NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data

    PubMed Central

    Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

    2016-01-01

    The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data. PMID:26848255

  6. Translating Mendelian and complex inheritance of Alzheimer's disease genes for predicting unique personal genome variants

    PubMed Central

    Regan, Kelly; Wang, Kanix; Doughty, Emily; Li, Haiquan; Li, Jianrong; Lee, Younghee; Kann, Maricel G

    2012-01-01

    Objective Although trait-associated genes identified as complex versus single-gene inheritance differ substantially in odds ratio, the authors nonetheless posit that their mechanistic concordance can reveal fundamental properties of the genetic architecture, allowing the automated interpretation of unique polymorphisms within a personal genome. Materials and methods An analytical method, SPADE-gen, spanning three biological scales was developed to demonstrate the mechanistic concordance between Mendelian and complex inheritance of Alzheimer's disease (AD) genes: biological functions (BP), protein interaction modeling, and protein domain implicated in the disease-associated polymorphism. Results Among Gene Ontology (GO) biological processes (BP) enriched at a false detection rate <5% in 15 AD genes of Mendelian inheritance (Online Mendelian Inheritance in Man) and independently in those of complex inheritance (25 host genes of intragenic AD single-nucleotide polymorphisms confirmed in genome-wide association studies), 16 overlapped (empirical p=0.007) and 45 were similar (empirical p<0.009; information theory). SPAN network modeling extended the canonical pathway of AD (KEGG) with 26 new protein interactions (empirical p<0.0001). Discussion The study prioritized new AD-associated biological mechanisms and focused the analysis on previously unreported interactions associated with the biological processes of polymorphisms that affect specific protein domains within characterized AD genes and their direct interactors using (1) concordant GO-BP and (2) domain interactions within STRING protein–protein interactions corresponding to the genomic location of the AD polymorphism (eg, EPHA1, APOE, and CD2AP). Conclusion These results are in line with unique-event polymorphism theory, indicating how disease-associated polymorphisms of Mendelian or complex inheritance relate genetically to those observed as ‘unique personal variants’. They also provide insight for identifying novel targets, for repositioning drugs, and for personal therapeutics. PMID:22319180

  7. Meta-analysis of genome-wide association studies identifies common susceptibility polymorphisms for colorectal and endometrial cancer near SH2B3 and TSHZ1.

    PubMed

    Cheng, Timothy H T; Thompson, Deborah; Painter, Jodie; O'Mara, Tracy; Gorman, Maggie; Martin, Lynn; Palles, Claire; Jones, Angela; Buchanan, Daniel D; Win, Aung Ko; Hopper, John; Jenkins, Mark; Lindor, Noralane M; Newcomb, Polly A; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Giles, Graham G; Pharoah, Paul; Peto, Julian; Cox, Angela; Swerdlow, Anthony; Couch, Fergus; Cunningham, Julie M; Goode, Ellen L; Winham, Stacey J; Lambrechts, Diether; Fasching, Peter; Burwinkel, Barbara; Brenner, Hermann; Brauch, Hiltrud; Chang-Claude, Jenny; Salvesen, Helga B; Kristensen, Vessela; Darabi, Hatef; Li, Jingmei; Liu, Tao; Lindblom, Annika; Hall, Per; de Polanco, Magdalena Echeverry; Sans, Monica; Carracedo, Angel; Castellvi-Bel, Sergi; Rojas-Martinez, Augusto; Aguiar Jnr, Samuel; Teixeira, Manuel R; Dunning, Alison M; Dennis, Joe; Otton, Geoffrey; Proietto, Tony; Holliday, Elizabeth; Attia, John; Ashton, Katie; Scott, Rodney J; McEvoy, Mark; Dowdy, Sean C; Fridley, Brooke L; Werner, Henrica M J; Trovik, Jone; Njolstad, Tormund S; Tham, Emma; Mints, Miriam; Runnebaum, Ingo; Hillemanns, Peter; Dörk, Thilo; Amant, Frederic; Schrauwen, Stefanie; Hein, Alexander; Beckmann, Matthias W; Ekici, Arif; Czene, Kamila; Meindl, Alfons; Bolla, Manjeet K; Michailidou, Kyriaki; Tyrer, Jonathan P; Wang, Qin; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Annibali, Daniela; Depreeuw, Jeroen; Al-Tassan, Nada A; Harris, Rebecca; Meyer, Brian F; Whiffin, Nicola; Hosking, Fay J; Kinnersley, Ben; Farrington, Susan M; Timofeeva, Maria; Tenesa, Albert; Campbell, Harry; Haile, Robert W; Hodgson, Shirley; Carvajal-Carmona, Luis; Cheadle, Jeremy P; Easton, Douglas; Dunlop, Malcolm; Houlston, Richard; Spurdle, Amanda; Tomlinson, Ian

    2015-12-01

    High-risk mutations in several genes predispose to both colorectal cancer (CRC) and endometrial cancer (EC). We therefore hypothesised that some lower-risk genetic variants might also predispose to both CRC and EC. Using CRC and EC genome-wide association series, totalling 13,265 cancer cases and 40,245 controls, we found that the protective allele [G] at one previously-identified CRC polymorphism, rs2736100 near TERT, was associated with EC risk (odds ratio (OR) = 1.08, P = 0.000167); this polymorphism influences the risk of several other cancers. A further CRC polymorphism near TERC also showed evidence of association with EC (OR = 0.92; P = 0.03). Overall, however, there was no good evidence that the set of CRC polymorphisms was associated with EC risk, and neither of two previously-reported EC polymorphisms was associated with CRC risk. A combined analysis revealed one genome-wide significant polymorphism, rs3184504, on chromosome 12q24 (OR = 1.10, P = 7.23 × 10^-9) with shared effects on CRC and EC risk. This polymorphism, a missense variant in the gene SH2B3, is also associated with haematological and autoimmune disorders, suggesting that it influences cancer risk through the immune response. Another polymorphism, rs12970291 near gene TSHZ1, was associated with both CRC and EC (OR = 1.26, P = 4.82 × 10^-8), with the alleles showing opposite effects on the risks of the two cancers.

  8. ReadXplorer—visualization and analysis of mapped sequences

    PubMed Central

    Hilker, Rolf; Stadermann, Kai Bernd; Doppmeier, Daniel; Kalinowski, Jörn; Stoye, Jens; Straube, Jasmin; Winnebald, Jörn; Goesmann, Alexander

    2014-01-01

    Motivation: Fast algorithms and well-arranged visualizations are required for the comprehensive analysis of the ever-growing size of genomic and transcriptomic next-generation sequencing data. Results: ReadXplorer is a software offering straightforward visualization and extensive analysis functions for genomic and transcriptomic DNA sequences mapped on a reference. A unique specialty of ReadXplorer is the quality classification of the read mappings. It is incorporated in all analysis functions and displayed in ReadXplorer's various synchronized data viewers for (i) the reference sequence, its base coverage as (ii) normalizable plot and (iii) histogram, (iv) read alignments and (v) read pairs. ReadXplorer's analysis capability covers RNA secondary structure prediction, single nucleotide polymorphism and deletion–insertion polymorphism detection, genomic feature and general coverage analysis. Especially for RNA-Seq data, it offers differential gene expression analysis, transcription start site and operon detection as well as RPKM value and read count calculations. Furthermore, ReadXplorer can combine or superimpose coverage of different datasets. Availability and implementation: ReadXplorer is available as open-source software at http://www.readxplorer.org along with a detailed manual. Contact: rhilker@mikrobio.med.uni-giessen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24790157
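
    The RPKM values mentioned in this record follow a simple normalization that can be sketched in a few lines. The block below uses the standard definition (reads per kilobase of feature per million mapped reads); it is not taken from ReadXplorer's source code, and the function name and example numbers are hypothetical.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads Per Kilobase of feature per Million mapped reads.

    Standard definition: count / (length_in_kb * total_reads_in_millions),
    which simplifies to count * 1e9 / (length_bp * total_reads).
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical gene: 500 reads on a 2,000 bp feature in a library
    # of 10 million mapped reads.
    print(rpkm(500, 2_000, 10_000_000))
```

    Dividing by feature length and library size makes values comparable across genes of different lengths and across samples sequenced to different depths, which is why the raw read count alone is usually not reported.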

  9. Contrasting Patterns of rDNA Homogenization within the Zygosaccharomyces rouxii Species Complex

    PubMed Central

    Chand Dakal, Tikam; Giudici, Paolo; Solieri, Lisa

    2016-01-01

    Arrays of repetitive ribosomal DNA (rDNA) sequences are generally expected to evolve as a coherent family, where repeats within such a family are more similar to each other than to orthologs in related species. The continuous homogenization of repeats within individual genomes is a recombination process termed concerted evolution. Here, we investigated the extent and the direction of concerted evolution in 43 yeast strains of the Zygosaccharomyces rouxii species complex (Z. rouxii, Z. sapae, Z. mellis), by analyzing two portions of the 35S rDNA cistron, namely the D1/D2 domains at the 5’ end of the 26S rRNA gene and the segment including the internal transcribed spacers (ITS) 1 and 2 (ITS regions). We demonstrate that intra-genomic rDNA sequence variation is unusually frequent in this clade and that rDNA arrays in single genomes consist of an intermixing of Z. rouxii, Z. sapae and Z. mellis-like sequences, putatively evolved by reticulate evolutionary events that involved repeated hybridization between lineages. The levels and distribution of sequence polymorphisms vary across rDNA repeats in different individuals, reflecting four patterns of rDNA evolution: I) rDNA repeats that are homogeneous within a genome but are chimeras derived from two parental lineages via recombination: Z. rouxii in the ITS region and Z. sapae in the D1/D2 region; II) intra-genomic rDNA repeats that retain polymorphisms only in ITS regions; III) rDNA repeats that vary only in their D1/D2 domains; IV) heterogeneous rDNA arrays that have both polymorphic ITS and D1/D2 regions. We argue that an ongoing process of homogenization following allodiplodization or incomplete lineage sorting gave rise to divergent evolutionary trajectories in different strains, depending upon temporal, structural and functional constraints. We discuss the consequences of these findings for Zygosaccharomyces species delineation and, more in general, for yeast barcoding. PMID:27501051

  10. Common variants at 1p36 are associated with superior frontal gyrus volume.

    PubMed

    Hashimoto, R; Ikeda, M; Yamashita, F; Ohi, K; Yamamori, H; Yasuda, Y; Fujimoto, M; Fukunaga, M; Nemoto, K; Takahashi, T; Tochigi, M; Onitsuka, T; Yamasue, H; Matsuo, K; Iidaka, T; Iwata, N; Suzuki, M; Takeda, M; Kasai, K; Ozaki, N

    2014-10-21

    The superior frontal gyrus (SFG), an area of the brain frequently found to have reduced gray matter in patients with schizophrenia, is involved in self-awareness and emotion, which are impaired in schizophrenia. However, no genome-wide association studies of SFG volume have been conducted in patients with schizophrenia. To identify single-nucleotide polymorphisms (SNPs) associated with SFG volumes, we performed a genome-wide association study (GWAS) of gray matter volumes in the right or left SFG of 158 patients with schizophrenia and 378 healthy subjects. We attempted to bioinformatically ascertain the potential effects of the top hit polymorphism on the expression levels of genes at the genome-wide region. We found associations between five variants on 1p36.12 and the right SFG volume at a widely used benchmark for genome-wide significance (P<5.0 × 10^-8). The strongest association was observed at rs4654899, an intronic SNP in the eukaryotic translation initiation factor 4 gamma, 3 (EIF4G3) gene on 1p36.12 (P=7.5 × 10^-9). No SNP with genome-wide significance was found in the volume of the left SFG (P>5.0 × 10^-8); however, the rs4654899 polymorphism was identified as the locus with the second strongest association with the volume of the left SFG (P=1.5 × 10^-6). In silico analyses revealed that a proxy SNP of rs4654899 had an effect on the expression of two genes, HP1BP3 lying 3' to EIF4G3 (P=7.8 × 10^-6) and CAPN14 at 2p (P=6.3 × 10^-6), which are expressed at moderate-to-high levels throughout the adult human SFG. These results contribute to understanding the genetic architecture of a brain structure possibly linked to the pathophysiology of schizophrenia.
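
    The genome-wide significance benchmark cited in this record is conventionally explained as a Bonferroni correction of a 0.05 error rate for roughly one million independent common-variant tests. The sketch below reproduces that arithmetic and applies the threshold to the two rs4654899 P values quoted in the abstract. The test count of one million is the usual convention and an assumption here, not something the abstract states, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha=0.05, n_tests=1_000_000):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_genome_wide_significant(p_value, threshold=5.0e-8):
    """True if the P value is below the genome-wide benchmark."""
    return p_value < threshold


if __name__ == "__main__":
    # 0.05 / 1,000,000 gives the familiar 5e-8 benchmark.
    print(math.isclose(bonferroni_threshold(), 5.0e-8))
    # P values quoted in the abstract for rs4654899.
    print("right SFG:", is_genome_wide_significant(7.5e-9))
    print("left SFG:", is_genome_wide_significant(1.5e-6))
```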

  11. Studying Genome Heterogeneity within the Arbuscular Mycorrhizal Fungal Cytoplasm

    PubMed Central

    Halary, Sébastien; Bapteste, Eric; Hijri, Mohamed

    2015-01-01

    Although heterokaryons have been reported in nature, multicellular organisms are generally assumed to be genetically homogeneous. Here, we investigate the case of arbuscular mycorrhizal fungi (AMF) that form symbiosis with plant roots. The growth advantages they confer to their hosts are of great potential benefit to sustainable agricultural practices. However, measuring genetic diversity for these coenocytes is a major challenge: Within the same cytoplasm, AMF contain thousands of nuclei and show extremely high levels of genetic variation for some loci. The extent and physical location of polymorphism within and between AMF genomes are unclear. We used two complementary strategies to estimate genetic diversity in AMF, investigating polymorphism both on a genome scale and in putative single copy loci. First, we used data from whole-genome pyrosequencing of four AMF isolates to describe genetic diversity, based on a conservative network-based clustering approach. AMF isolates showed marked differences in genome-wide diversity patterns in comparison to a panel of control fungal genomes. This clustering approach further allowed us to provide conservative estimates of Rhizophagus spp. genome sizes. Second, we designed new putative single copy genomic markers, which we investigated by massive parallel amplicon sequencing for two Rhizophagus irregularis and one Rhizophagus sp. isolates. Most loci showed high polymorphism, with up to 103 alleles per marker. This polymorphism could be distributed within or between nuclei. However, we argue that the Rhizophagus isolates under study might be heterokaryotic, at least for the putative single copy markers we studied. Considering that genetic information is the main resource for identification of AMF, we suggest that special attention is warranted for the study of these ecologically important organisms. PMID:25573960

  12. Studying genome heterogeneity within the arbuscular mycorrhizal fungal cytoplasm.

    PubMed

    Boon, Eva; Halary, Sébastien; Bapteste, Eric; Hijri, Mohamed

    2015-01-07

    Although heterokaryons have been reported in nature, multicellular organisms are generally assumed to be genetically homogeneous. Here, we investigate the case of arbuscular mycorrhizal fungi (AMF) that form symbiosis with plant roots. The growth advantages they confer to their hosts are of great potential benefit to sustainable agricultural practices. However, measuring genetic diversity for these coenocytes is a major challenge: Within the same cytoplasm, AMF contain thousands of nuclei and show extremely high levels of genetic variation for some loci. The extent and physical location of polymorphism within and between AMF genomes are unclear. We used two complementary strategies to estimate genetic diversity in AMF, investigating polymorphism both on a genome scale and in putative single copy loci. First, we used data from whole-genome pyrosequencing of four AMF isolates to describe genetic diversity, based on a conservative network-based clustering approach. AMF isolates showed marked differences in genome-wide diversity patterns in comparison to a panel of control fungal genomes. This clustering approach further allowed us to provide conservative estimates of Rhizophagus spp. genome sizes. Second, we designed new putative single copy genomic markers, which we investigated by massive parallel amplicon sequencing for two Rhizophagus irregularis and one Rhizophagus sp. isolates. Most loci showed high polymorphism, with up to 103 alleles per marker. This polymorphism could be distributed within or between nuclei. However, we argue that the Rhizophagus isolates under study might be heterokaryotic, at least for the putative single copy markers we studied. Considering that genetic information is the main resource for identification of AMF, we suggest that special attention is warranted for the study of these ecologically important organisms. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. Genome-wide DNA polymorphisms in two cultivars of mei (Prunus mume sieb. et zucc.).

    PubMed

    Sun, Lidan; Zhang, Qixiang; Xu, Zongda; Yang, Weiru; Guo, Yu; Lu, Jiuxing; Pan, Huitang; Cheng, Tangren; Cai, Ming

    2013-10-06

    Mei (Prunus mume Sieb. et Zucc.) is a famous ornamental plant and fruit crop grown in East Asian countries. Limited genetic resources, especially molecular markers, have hindered the progress of mei breeding projects. Here, we performed low-depth whole-genome sequencing of Prunus mume 'Fenban' and Prunus mume 'Kouzi Yudie' to identify high-quality polymorphic markers between the two cultivars on a large scale. A total of 1464.1 Mb and 1422.1 Mb of 'Fenban' and 'Kouzi Yudie' sequencing data were uniquely mapped to the mei reference genome with about 6-fold coverage, respectively. We detected a large number of putative polymorphic markers from the 196.9 Mb of sequencing data shared by the two cultivars, which together contained 200,627 SNPs, 4,900 InDels, and 7,063 SSRs. Among these markers, 38,773 SNPs, 174 InDels, and 418 SSRs were distributed in the 22.4 Mb CDS region, and 63.0% of these marker-containing CDS sequences were assigned to GO terms. Subsequently, 670 selected SNPs were validated using Agilent's SureSelect solution phase hybridization assay. A subset of 599 SNPs was used to assess the genetic similarity of a panel of mei germplasm samples and a plum (P. salicina) cultivar, producing a set of informative diversity data. We also analyzed the frequency and distribution of detected InDels and SSRs in the mei genome and validated their usefulness as DNA markers. These markers were successfully amplified in the cultivars and in their segregating progeny. A large set of high-quality polymorphic SNPs, InDels, and SSRs was identified in parallel between 'Fenban' and 'Kouzi Yudie' using low-depth whole-genome sequencing. The study presents extensive data on these polymorphic markers, which can be useful for constructing high-resolution genetic maps, performing genome-wide association studies, and designing genomic selection strategies in mei.

  14. Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

    PubMed Central

    Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. 
Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio

    2004-01-01

    The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394

  15. SNP discovery in common bean by restriction-associated DNA (RAD) sequencing for genetic diversity and population structure analysis.

    PubMed

    Valdisser, Paula Arielle M R; Pappas, Georgios J; de Menezes, Ivandilson P P; Müller, Bárbara S F; Pereira, Wendell J; Narciso, Marcelo G; Brondani, Claudio; Souza, Thiago L P O; Borba, Tereza C O; Vianello, Rosana P

    2016-06-01

    Researchers have made great advances in the development and application of genomic approaches for common beans, creating opportunities to drive more realistic and applicable strategies for sustainable management of genetic resources in plant breeding. This work provides useful polymorphic single-nucleotide polymorphisms (SNPs) for high-throughput common bean genotyping developed by RAD (restriction site-associated DNA) sequencing. The RAD tags were generated from DNA pooled from 12 common bean genotypes, including breeding lines of different gene pools and market classes. The aligned sequences identified 23,748 putative RAD-SNPs, of which 3357 were adequate for genotyping; 1032 RAD-SNPs with the highest ADT (assay design tool) score are presented in this article. The RAD-SNPs were structurally annotated in different coding (47.00 %) and non-coding (53.00 %) sequence components of genes. A subset of 384 RAD-SNPs with broad genome distribution was used to genotype a diverse panel of 95 common bean germplasms and revealed a successful amplification rate of 96.6 %, showing 73 % of polymorphic SNPs within the Andean group and 83 % in the Mesoamerican group. A slightly increased He (0.161, n = 21) value was estimated for the Andean gene pool, compared to the Mesoamerican group (0.156, n = 74). For the linkage disequilibrium (LD) analysis, from a group of 580 SNPs (289 RAD-SNPs and 291 BARC-SNPs) genotyped for the same set of genotypes, 70.2 % were in LD, decreasing to 0.10 % in the Andean group and 0.77 % in the Mesoamerican group. Haplotype patterns spanning 310 Mb of the genome (60 %) were characterized in samples from different origins. However, the haplotype frameworks were under-represented for the Andean (7.85 %) and Mesoamerican (5.55 %) gene pools separately. In conclusion, RAD sequencing allowed the discovery of hundreds of useful SNPs for broad genetic analysis of common bean germplasm. Going forward, this approach provides an excellent panel of molecular tools for whole-genome analysis, allowing common bean breeding practices to be integrated and better explored.
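
    The He values quoted in this abstract are expected heterozygosities. As a minimal sketch, assuming He follows the usual gene-diversity definition (one minus the sum of squared allele frequencies, averaged over loci), the calculation could look like the Python below. The abstract does not state which estimator was used, and the function names and example allele frequencies are hypothetical, not taken from the study.

```python
def expected_heterozygosity(allele_freqs):
    """Gene diversity at one locus: 1 minus the sum of squared allele frequencies."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_expected_heterozygosity(loci):
    """Average He over a list of loci, each given as a list of allele frequencies."""
    if not loci:
        raise ValueError("at least one locus is required")
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Hypothetical SNP loci (assumption): one balanced, one with a rare allele, one fixed.
example_loci = [[0.5, 0.5], [0.9, 0.1], [1.0]]
mean_he = mean_expected_heterozygosity(example_loci)
```

    Under this definition a biallelic SNP cannot exceed He = 0.5, so panel-wide means such as 0.161 and 0.156 would indicate that many loci carry a rare or fixed allele within a gene pool.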

  16. Transcription Factor KLF5 Binds a Cyclin E1 Polymorphic Intronic Enhancer to Confer Increased Bladder Cancer Risk

    PubMed Central

    Pattison, Jillian M.; Posternak, Valeriya; Cole, Michael D.

    2016-01-01

    It is well established that environmental toxins, such as exposure to arsenic, are risk factors in the development of urinary bladder cancer, yet recent genome-wide association studies (GWAS) provide compelling evidence that there is a strong genetic component associated with disease predisposition. A single nucleotide polymorphism (SNP), rs8102137, was identified on chromosome 19q12, residing 6 kb upstream of the important cell cycle regulator and proto-oncogene, Cyclin E1 (CCNE1). However, the functional role of this variant in bladder cancer predisposition has been unclear since it lies within a non-coding region of the genome. Here, it is demonstrated that bladder cancer cells heterozygous for this SNP exhibit biased allelic expression of CCNE1 with 1.5-fold more transcription occurring from the risk allele. Furthermore, using chromatin immunoprecipitation assays, a novel enhancer element was identified within the first intron of CCNE1 that binds Kruppel-like Factor 5 (KLF5), a known transcriptional activator in bladder cancer. Moreover, the data reveal that the presence of rs200996365, a SNP in high linkage disequilibrium with rs8102137 residing in the center of a KLF5 motif, alters KLF5 binding to this genomic region. Through luciferase assays and CRISPR-Cas9 genome editing, a novel polymorphic intronic regulatory element controlling CCNE1 transcription is characterized. These studies uncover how a cancer-associated polymorphism mechanistically contributes to an increased predisposition for bladder cancer development. Implications: A polymorphic KLF5 binding site near the CCNE1 gene explains genetic risk identified through genome-wide association studies. PMID:27514407

  17. Comparative genome-wide polymorphic microsatellite markers in Antarctic penguins through next generation sequencing

    PubMed Central

    Vianna, Juliana A.; Noll, Daly; Mura-Jornet, Isidora; Valenzuela-Guerra, Paulina; González-Acuña, Daniel; Navarro, Cristell; Loyola, David E.; Dantas, Gisele P. M.

    2017-01-01

    Microsatellites are valuable molecular markers for evolutionary and ecological studies. Next generation sequencing is responsible for the increasing number of microsatellites for non-model species. The penguin genus Pygoscelis comprises three species: Adélie (P. adeliae), Chinstrap (P. antarcticus) and Gentoo penguin (P. papua), all distributed around Antarctica and the sub-Antarctic. The species have been affected differently by climate change, and the use of microsatellite markers will be crucial to monitor population dynamics. We characterized a large set of genome-wide microsatellites and evaluated polymorphisms in all three species. SOLiD reads were generated from the libraries of each species, identifying a large number of microsatellite loci: 33,677, 35,265 and 42,057 for P. adeliae, P. antarcticus and P. papua, respectively. A large number of dinucleotide (66,139), trinucleotide (29,490) and tetranucleotide (11,849) microsatellites are described. Microsatellite abundance, diversity and orthology were characterized in penguin genomes. We evaluated polymorphisms in 170 tetranucleotide loci, obtaining 34 polymorphic loci in at least one species and 15 polymorphic loci in all three species, which allows comparative studies to be performed. Polymorphic markers presented here enable a number of ecological, population, individual identification, parentage and evolutionary studies of Pygoscelis, with potential use in other penguin species. PMID:28898354

  18. Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.

    PubMed

    Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E

    2016-11-18

    Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms, including HLA-DQA1 and C6orf10, were clustered within the Major Histo-compatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10⁻⁸). Technical validation of top ranking non-Major Histo-compatibility Complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10⁻⁴. Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism, rs9644119, downstream of DPYSL2 showed some evidence of association, supported by findings in the replication cohort, that warrants further study. Pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histo-compatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.
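
    The genome-wide significance level quoted in this abstract matches the conventional threshold of 5 × 10⁻⁸, which is commonly motivated as a Bonferroni correction of a family-wise error rate of 0.05 across roughly one million independent tests. The abstract does not say how its threshold was chosen, so the Python sketch below rests on that assumption; the function name and the test count are illustrative only.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumption: alpha = 0.05 and about one million independent common-variant tests.
threshold = bonferroni_threshold(0.05, 1_000_000)

# p-value reported for rs1613056 in the abstract.
reported_p = 5e-8
reaches_significance = reported_p <= threshold or math.isclose(reported_p, threshold)
```

    A pooled design tests allele-frequency differences between pools rather than individual genotypes, so the effective number of tests in the study may differ from the figure assumed here.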

  19. Informative genomic microsatellite markers for efficient genotyping applications in sugarcane.

    PubMed

    Parida, Swarup K; Kalia, Sanjay K; Kaul, Sunita; Dalal, Vivek; Hemaprabha, G; Selvi, Athiappan; Pandit, Awadhesh; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; Srivastava, Prem Shankar; Singh, Nagendra K; Mohapatra, Trilochan

    2009-01-01

    Genomic microsatellite markers are capable of revealing a high degree of polymorphism. Sugarcane (Saccharum sp.), which has a complex polyploid genome, requires a larger number of such informative markers for various applications in genetics and breeding. With the objective of generating a large set of microsatellite markers designated as Sugarcane Enriched Genomic MicroSatellite (SEGMS), 6,318 clones from genomic libraries of two hybrid sugarcane cultivars enriched with 18 different microsatellite repeat-motifs were sequenced to generate 4.16 Mb of high-quality sequence. Microsatellites were identified in 1,261 of the 5,742 non-redundant clones, which accounted for 22% enrichment of the libraries. Retro-transposon association was observed for 23.1% of the identified microsatellites. The utility of the microsatellite-containing genomic sequences was demonstrated by higher primer designing potential (90%) and PCR amplification efficiency (87.4%). A total of 1,315 markers including 567 class I microsatellite markers were designed and placed in the public domain for unrestricted use. The level of polymorphism detected by these markers among sugarcane species, genera, and varieties was 88.6%, while the cross-transferability rate was 93.2% within the Saccharum complex and 25% to cereals. Cloning and sequencing of size variant amplicons revealed that the variation in the number of repeat-units was the main source of SEGMS fragment length polymorphism. The high level of polymorphism and wide range of genetic diversity (0.16-0.82 with an average of 0.44) assayed with the SEGMS markers suggested their usefulness in various genotyping applications in sugarcane.

  20. Human Xq28 Inversion Polymorphism: From Sex Linkage to Genomics--A Genetic Mother Lode

    ERIC Educational Resources Information Center

    Kirby, Cait S.; Kolber, Natalie; Salih Almohaidi, Asmaa M.; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to…

  1. Adaptive divergence in the monkey flower Mimulus guttatus is maintained by a chromosomal inversion.

    PubMed

    Twyford, Alex D; Friedman, Jannice

    2015-06-01

    Organisms exhibit an incredible diversity of life history strategies as adaptive responses to environmental variation. The establishment of novel life history strategies involves multilocus polymorphisms, which will be challenging to establish in the face of gene flow and recombination. Theory predicts that adaptive allelic combinations may be maintained and spread if they occur in genomic regions of reduced recombination, such as chromosomal inversion polymorphisms, yet empirical support for this prediction is lacking. Here, we use genomic data to investigate the evolution of divergent adaptive ecotypes of the yellow monkey flower Mimulus guttatus. We show that a large chromosomal inversion polymorphism is the major region of divergence between geographically widespread annual and perennial ecotypes. In contrast, ∼40,000 single nucleotide polymorphisms in collinear regions of the genome show no signal of life history, revealing genomic patterns of diversity have been shaped by localized homogenizing gene flow and large-scale Pleistocene range expansion. Our results provide evidence for an inversion capturing and protecting loci involved in local adaptation, while also explaining how adaptive divergence can occur with gene flow. © 2015 The Author(s). Evolution published by Wiley Periodicals, Inc. on behalf of The Society for the Study of Evolution.

  2. A global reference for human genetic variation

    PubMed Central

    2016-01-01

    The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. PMID:26432245
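
    The per-class counts in this abstract can be checked against the stated total with simple arithmetic. The Python sketch below uses only the rounded figures quoted above; the dictionary keys and variable names are illustrative.

```python
# Rounded variant counts as quoted in the abstract.
variant_counts = {
    "SNPs": 84_700_000,
    "short indels": 3_600_000,
    "structural variants": 60_000,
}

# Sum of the three classes; the abstract reports "over 88 million" in total.
total_variants = sum(variant_counts.values())

# Percentage of the total contributed by each class.
shares = {name: 100 * count / total_variants for name, count in variant_counts.items()}
```

    With these rounded inputs the classes sum to 88.36 million, consistent with the "over 88 million" total, and SNPs make up about 96% of the catalogued variants.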

  3. Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

    PubMed

    Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

    2011-01-01

    Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.

  4. Detection of a single nucleotide polymorphism in the human alpha-lactalbumin gene: implications for human milk proteins.

    PubMed

    Chowanadisai, Winyoo; Kelleher, Shannon L; Nemeth, Jennifer F; Yachetti, Stephen; Kuhlman, Charles F; Jackson, Joan G; Davis, Anne M; Lien, Eric L; Lönnerdal, Bo

    2005-05-01

    Variability in the protein composition of breast milk has been observed in many women and is believed to be due to natural variation of the human population. Single nucleotide polymorphisms (SNPs) are present throughout the entire human genome, but the impact of this variation on human milk composition and biological activity and infant nutrition and health is unclear. The goals of this study were to characterize a variant of human alpha-lactalbumin observed in milk from a Filipino population by determining the location of the polymorphism in the amino acid and genomic sequences of alpha-lactalbumin. Milk and blood samples were collected from 20 Filipino women, and milk samples were collected from an additional 450 women from nine different countries. alpha-Lactalbumin concentration was measured by high-performance liquid chromatography (HPLC), and milk samples containing the variant form of the protein were identified with both HPLC and mass spectrometry (MS). The molecular weight of the variant form was measured by MS, and the location of the polymorphism was narrowed down by protein reduction, alkylation and trypsin digestion. Genomic DNA was isolated from whole blood, and the polymorphism location and subject genotype were determined by amplifying the entire coding sequence of human alpha-lactalbumin by PCR, followed by DNA sequencing. A variant form of alpha-lactalbumin was observed in HPLC chromatograms, and the difference in molecular weight was determined by MS (wild type=14,070 Da, variant=14,056 Da). Protein reduction and digestion narrowed the polymorphism to between the 33rd and 77th amino acids of the protein. The genetic polymorphism was identified as adenine to guanine, which translates to a substitution from isoleucine to valine at amino acid 46. The frequency of variation was higher in milk from China, Japan and the Philippines, which suggests that this polymorphism is most prevalent in Asia. There are SNPs in the genome for human milk proteins, and their implications for protein bioactivity and infant nutrition need to be considered.
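
    The reported masses and the reported substitution can be cross-checked with a short calculation. Isoleucine and valine differ by one CH2 group in the side chain, so the expected mass shift is about 14 Da, and under the standard genetic code an A to G change at the first codon position turns every isoleucine codon into a valine codon. The Python sketch below illustrates both points; the abstract does not state which codon or codon position is affected, so the codon check is an assumption for illustration.

```python
# Standard average atomic masses in daltons.
CARBON = 12.011
HYDROGEN = 1.008

# Masses reported in the abstract.
wild_type_da = 14_070
variant_da = 14_056
observed_shift = wild_type_da - variant_da

# Ile side chain is C4H9 and Val side chain is C3H7, so they differ by one CH2.
ch2_mass = CARBON + 2 * HYDROGEN

# Standard genetic code, DNA alphabet.
ILE_CODONS = {"ATT", "ATC", "ATA"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}

# Replace the first base (always A in Ile codons) with G and compare with the Val codons.
first_base_a_to_g = {"G" + codon[1:] for codon in ILE_CODONS}
all_become_valine = first_base_a_to_g <= VAL_CODONS
```

    The observed shift of 14 Da agrees with the expected loss of one CH2 group (about 14.03 Da) to within the precision of the whole-dalton masses quoted.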

  5. A pipeline for the systematic identification of non-redundant full-ORF cDNAs for polymorphic and evolutionary divergent genomes: Application to the ascidian Ciona intestinalis

    DOE PAGES

    Gilchrist, Michael J.; Sobral, Daniel; Khoueiry, Pierre; ...

    2015-05-27

    Genome-wide resources, such as collections of cDNA clones encoding for complete proteins (full-ORF clones), are crucial tools for studying the evolution of gene function and genetic interactions. Non-model organisms, in particular marine organisms, provide a rich source of functional diversity. Marine organism genomes are, however, frequently highly polymorphic and encode proteins that diverge significantly from those of well-annotated model genomes. The construction of full-ORF clone collections from non-model organisms is hindered by the difficulty of predicting accurately the N-terminal ends of proteins, and distinguishing recent paralogs from highly polymorphic alleles. We also report a computational strategy that overcomes these difficulties, and allows for accurate gene level clustering of transcript data followed by the automated identification of full-ORFs with correct 5'- and 3'-ends. It is robust to polymorphism, includes paralog calling and does not require evolutionary proximity to well annotated model organisms. Here, we developed this pipeline for the ascidian Ciona intestinalis, a highly polymorphic member of the divergent sister group of the vertebrates, emerging as a powerful model organism to study chordate gene function, Gene Regulatory Networks and molecular mechanisms underlying human pathologies. Furthermore, using this pipeline we have generated the first full-ORF collection for a highly polymorphic marine invertebrate. It contains 19,163 full-ORF cDNA clones covering 60% of Ciona coding genes, and full-ORF orthologs for approximately half of curated human disease-associated genes.

  6. Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula

    PubMed Central

    Grzebelus, Dariusz; Lasota, Slawomir; Gambin, Tomasz; Kucherov, Gregory; Gambin, Anna

    2007-01-01

    Background: Transposable elements constitute a significant fraction of plant genomes. The PIF/Harbinger superfamily includes DNA transposons (class II elements) carrying terminal inverted repeats and producing a 3 bp target site duplication upon insertion. The presence of an ORF coding for the DDE/DDD transposase, required for transposition, is characteristic of the autonomous PIF/Harbinger-like elements. Based on the above features, PIF/Harbinger-like elements were identified in several plant genomes and divided into several evolutionary lineages. Availability of a significant portion of Medicago truncatula genomic sequence allowed for mining PIF/Harbinger-like elements, starting from a single previously described element MtMaster. Results: Twenty-two putative autonomous, i.e. carrying an ORF coding for TPase and complete terminal inverted repeats, and 67 non-autonomous PIF/Harbinger-like elements were found in the genome of M. truncatula. They were divided into five families, MtPH-A5, MtPH-A6, MtPH-D, MtPH-E, and MtPH-M, corresponding to three previously identified and two new lineages. The largest families, MtPH-A6 and MtPH-M, were further divided into four and three subfamilies, respectively. Non-autonomous elements were usually direct deletion derivatives of the putative autonomous element; however, other types of rearrangements, including inversions and nested insertions, were also observed. An interesting structural characteristic – the presence of 60 bp tandem repeats – was observed in a group of elements of subfamily MtPH-A6-4. Some families could be related to miniature inverted repeat elements (MITEs). The presence of empty loci (RESites), paralogous to those flanking the identified transposable elements, both autonomous and non-autonomous, as well as the presence of transposon insertion related size polymorphisms, confirmed that some of the mined elements were capable of transposition. 
Conclusion: The population of PIF/Harbinger-like elements in the genome of M. truncatula is diverse. A detailed intra-family comparison of the elements' structure proved that they proliferated in the genome generally following the model of abortive gap repair. However, the presence of tandem repeats facilitated more pronounced rearrangements of the element internal regions. The insertion polymorphism of the MtPH elements and related MITE families in different populations of M. truncatula, if further confirmed experimentally, could be used as a source of molecular markers complementary to other marker systems. PMID:17996080

  7. Exploiting Genome Structure in Association Analysis

    PubMed Central

    Kim, Seyoung

    2014-01-01

    A genome-wide association study involves examining a large number of single-nucleotide polymorphisms (SNPs) to identify SNPs that are significantly associated with the given phenotype, while trying to reduce the false positive rate. Although haplotype-based association methods have been proposed to accommodate correlation information across nearby SNPs that are in linkage disequilibrium, none of these methods directly incorporated structural information such as recombination events along the chromosome. In this paper, we propose a new approach called stochastic block lasso for association mapping that exploits prior knowledge on linkage disequilibrium structure in the genome such as recombination rates and distances between adjacent SNPs in order to increase the power of detecting true associations while reducing false positives. Following a typical linear regression framework with the genotypes as inputs and the phenotype as output, our proposed method employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markov process along the sequence of SNPs that incorporates the prior information on the linkage disequilibrium structure. The Markov-chain prior models the structural dependencies between a pair of adjacent SNPs, and allows us to look for association SNPs in a coupled manner, combining strength from multiple nearby SNPs. Our results on HapMap-simulated datasets and mouse datasets show that there is a significant advantage in incorporating the prior knowledge on linkage disequilibrium structure for marker identification under whole-genome association. PMID:21548809
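
    The Laplacian prior mentioned in this abstract is what makes the coefficient estimate sparse. As a sketch of the standard argument only (the plain lasso, without the Markov-chain coupling between adjacent SNPs that the paper adds), assume Gaussian noise with variance σ² and an independent Laplace prior with scale b on each of the J regression coefficients. The symbols y, X, β, σ², b, J and λ are notation introduced here for illustration, not the paper's.

```latex
\begin{align}
p(\mathbf{y}\mid\boldsymbol{\beta}) &\propto
  \exp\!\Big(-\tfrac{1}{2\sigma^{2}}\,\lVert \mathbf{y}-X\boldsymbol{\beta}\rVert_2^{2}\Big),
&
p(\boldsymbol{\beta}) &\propto \prod_{j=1}^{J}\exp\!\Big(-\tfrac{\lvert\beta_j\rvert}{b}\Big),
\\
\hat{\boldsymbol{\beta}}_{\mathrm{MAP}} &=
  \arg\min_{\boldsymbol{\beta}}\;
  \tfrac{1}{2}\,\lVert \mathbf{y}-X\boldsymbol{\beta}\rVert_2^{2}
  + \lambda\sum_{j=1}^{J}\lvert\beta_j\rvert,
&
\lambda &= \frac{\sigma^{2}}{b}.
\end{align}
```

    The second line follows by taking the negative log of the posterior and multiplying through by σ². A smaller prior scale b gives a larger penalty λ and therefore fewer SNPs with non-zero coefficients.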

  8. Analysis of MHC class I genes across horse MHC haplotypes

    PubMed Central

    Tallmadge, Rebecca L.; Campbell, Julie A.; Miller, Donald C.; Antczak, Douglas F.

    2010-01-01

    The genomic sequences of 15 horse Major Histocompatibility Complex (MHC) class I genes and a collection of MHC class I homozygous horses of five different haplotypes were used to investigate the genomic structure and polymorphism of the equine MHC. A combination of conserved and locus-specific primers was used to amplify horse MHC class I genes with classical and non-classical characteristics. Multiple clones from each haplotype identified three to five classical sequences per homozygous animal, and two to three non-classical sequences. Phylogenetic analysis was applied to these sequences and groups were identified which appear to be allelic series, but some sequences were left ungrouped. Sequences determined from MHC class I heterozygous horses and previously described MHC class I sequences were then added, representing a total of ten horse MHC haplotypes. These results were consistent with those obtained from the MHC homozygous horses alone, and 30 classical sequences were assigned to four previously confirmed loci and three new provisional loci. The non-classical genes had few alleles and the classical genes had higher levels of allelic polymorphism. Alleles for two classical loci with the expected pattern of polymorphism were found in the majority of haplotypes tested, but alleles at two other commonly detected loci had more variation outside of the hypervariable region than within. Our data indicate that the equine Major Histocompatibility Complex is characterized by variation in the complement of class I genes expressed in different haplotypes in addition to the expected allelic polymorphism within loci. PMID:20099063

  9. Intraspecific variation in mitochondrial genome sequence, structure, and gene content in Silene vulgaris, an angiosperm with pervasive cytoplasmic male sterility.

    PubMed

    Sloan, Daniel B; Müller, Karel; McCauley, David E; Taylor, Douglas R; Storchová, Helena

    2012-12-01

    In angiosperms, mitochondrial-encoded genes can cause cytoplasmic male sterility (CMS), resulting in the coexistence of female and hermaphroditic individuals (gynodioecy). We compared four complete mitochondrial genomes from the gynodioecious species Silene vulgaris and found unprecedented amounts of intraspecific diversity for plant mitochondrial DNA (mtDNA). Remarkably, only about half of overall sequence content is shared between any pair of genomes. The four mtDNAs range in size from 361 to 429 kb and differ in gene complement, with rpl5 and rps13 being intact in some genomes but absent or pseudogenized in others. The genomes exhibit essentially no conservation of synteny and are highly repetitive, with evidence of reciprocal recombination occurring even across short repeats (< 250 bp). Some mitochondrial genes exhibit atypically high degrees of nucleotide polymorphism, while others are invariant. The genomes also contain a variable number of small autonomously mapping chromosomes, which have only recently been identified in angiosperm mtDNA. Southern blot analysis of one of these chromosomes indicated a complex in vivo structure consisting of both monomeric circles and multimeric forms. We conclude that S. vulgaris harbors an unusually large degree of variation in mtDNA sequence and structure and discuss the extent to which this variation might be related to CMS. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust.

  10. Detecting a hierarchical genetic population structure via Multi-InDel markers on the X chromosome

    PubMed Central

    Fan, Guang Yao; Ye, Yi; Hou, Yi Ping

    2016-01-01

    Detecting population structure and estimating individual biogeographical ancestry are very important in population genetics studies, biomedical research and forensics. Single-nucleotide polymorphism (SNP) has long been considered to be a primary ancestry-informative marker (AIM), but it is constrained by complex and time-consuming genotyping protocols. Following up on our previous study, we propose that a multi-insertion-deletion polymorphism (Multi-InDel) with multiple haplotypes can be useful in ancestry inference and hierarchical genetic population structures. A validation study for the X chromosome Multi-InDel marker (X-Multi-InDel) as a novel AIM was conducted. Genetic polymorphisms and genetic distances among three Chinese populations and 14 worldwide populations obtained from the 1000 Genomes database were analyzed. A Bayesian clustering method (STRUCTURE) was used to discern the continental origins of Europe, East Asia, and Africa. A minimal panel of ten X-Multi-InDels was verified to be sufficient to distinguish human ancestries from three major continental regions with nearly the same efficiency of the earlier panel with 21 insertion-deletion AIMs. Along with the development of more X-Multi-InDels, an approach using this novel marker has the potential for broad applicability as a cost-effective tool toward more accurate determinations of individual biogeographical ancestry and population stratification. PMID:27535707

  11. Low-coverage MiSeq next generation sequencing reveals the mitochondrial genome of the Eastern Rock Lobster, Sagmariasus verreauxi.

    PubMed

    Doyle, Stephen R; Griffith, Ian S; Murphy, Nick P; Strugnell, Jan M

    2015-01-01

    The complete mitochondrial genome of the Eastern Rock lobster, Sagmariasus verreauxi, is reported for the first time. Using low-coverage, long read MiSeq next generation sequencing, we constructed and determined the mtDNA genome organization of the 15,470 bp sequence from two isolates from Eastern Tasmania, Australia and Northern New Zealand, and identified 46 polymorphic nucleotides between the two sequences. This genome sequence and its genetic polymorphisms will likely be useful in understanding the distribution and population connectivity of the Eastern Rock Lobster, and in the fisheries management of this commercially important species.

  12. Genomic signatures of selection at linked sites: unifying the disparity among species

    PubMed Central

    Cutter, Asher D.; Payseur, Bret A.

    2014-01-01

    Population genetics theory supplies powerful predictions about how natural selection interacts with genetic linkage to sculpt the genomic landscape of nucleotide polymorphism. Both the spread of beneficial mutations and removal of deleterious mutations act to depress polymorphism levels, especially in low-recombination regions. However, empiricists have documented extreme disparities among species. Here we characterize the dominant features that could drive variation in linked selection among species, including roles for selective sweeps being ‘hard’ or ‘soft’, and concealing by demography and genomic confounds. We advocate targeted studies of close relatives to unify our understanding of how selection and linkage interact to shape genome evolution. PMID:23478346

  13. The Fusarium Graminearum Genome Reveals a Link Between Localized Polymorphism and Pathogen Specialization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cuomo, Christina A.; Guldener, Ulrich; Xu, Jin Rong

    2007-09-07

    We sequenced and annotated the genome of the filamentous fungus Fusarium graminearum, a major pathogen of cultivated cereals. Very few repetitive sequences were detected, and the process of repeat-induced point mutation, in which duplicated sequences are subject to extensive mutation, may partially account for the reduced repeat content and apparent low number of paralogous (ancestrally duplicated) genes. A second strain of F. graminearum contained more than 10,000 single-nucleotide polymorphisms, which were frequently located near telomeres and within other discrete chromosomal segments. Many highly polymorphic regions contained sets of genes implicated in plant-fungus interactions and were unusually divergent, with higher rates of recombination. These regions of genome innovation may result from selection due to interactions of F. graminearum with its plant hosts.

  14. Genetic and Epigenetic Changes in Oilseed Rape (Brassica napus L.) Extracted from Intergeneric Allopolyploid and Additions with Orychophragmus.

    PubMed

    Gautam, Mayank; Dang, Yanwei; Ge, Xianhong; Shao, Yujiao; Li, Zaiyun

    2016-01-01

    Allopolyploidization with the merger of the genomes from different species has been shown to be associated with genetic and epigenetic changes. But the maintenance of such alterations related to one parental species after the genome is extracted from the allopolyploid remains to be detected. In this study, the genome of Brassica napus L. (2n = 38, genomes AACC) was extracted from its intergeneric allohexaploid (2n = 62, genomes AACCOO) with another crucifer Orychophragmus violaceus (2n = 24, genome OO), by backcrossing and development of alien addition lines. B. napus-type plants identified in the self-pollinated progenies of nine monosomic additions were analyzed by the methods of amplified fragment length polymorphism, sequence-specific amplified polymorphism, and methylation-sensitive amplified polymorphism. They showed modifications to certain extents in genomic components (loss and gain of DNA segments and transposons, introgression of alien DNA segments) and DNA methylation, compared with B. napus donor. The significant differences in the changes between the B. napus types extracted from these additions likely resulted from the different effects of individual alien chromosomes. Particularly, the additions which harbored the O. violaceus chromosome carrying dominant rRNA genes over those of B. napus tended to result in the development of plants which showed fewer changes, suggesting a role of the expression levels of alien rRNA genes in genomic stability. These results provided new cues for the genetic alterations in one parental genome that are maintained even after the genome becomes independent.

  15. Genetic and Epigenetic Changes in Oilseed Rape (Brassica napus L.) Extracted from Intergeneric Allopolyploid and Additions with Orychophragmus

    PubMed Central

    Gautam, Mayank; Dang, Yanwei; Ge, Xianhong; Shao, Yujiao; Li, Zaiyun

    2016-01-01

    Allopolyploidization with the merger of the genomes from different species has been shown to be associated with genetic and epigenetic changes. But the maintenance of such alterations related to one parental species after the genome is extracted from the allopolyploid remains to be detected. In this study, the genome of Brassica napus L. (2n = 38, genomes AACC) was extracted from its intergeneric allohexaploid (2n = 62, genomes AACCOO) with another crucifer Orychophragmus violaceus (2n = 24, genome OO), by backcrossing and development of alien addition lines. B. napus-type plants identified in the self-pollinated progenies of nine monosomic additions were analyzed by the methods of amplified fragment length polymorphism, sequence-specific amplified polymorphism, and methylation-sensitive amplified polymorphism. They showed modifications to certain extents in genomic components (loss and gain of DNA segments and transposons, introgression of alien DNA segments) and DNA methylation, compared with B. napus donor. The significant differences in the changes between the B. napus types extracted from these additions likely resulted from the different effects of individual alien chromosomes. Particularly, the additions which harbored the O. violaceus chromosome carrying dominant rRNA genes over those of B. napus tended to result in the development of plants which showed fewer changes, suggesting a role of the expression levels of alien rRNA genes in genomic stability. These results provided new cues for the genetic alterations in one parental genome that are maintained even after the genome becomes independent. PMID:27148282

  16. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata

    PubMed Central

    2017-01-01

    The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discover that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection; however, there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species. PMID:29181279

  17. Activation of individual L1 retrotransposon instances is restricted to cell-type dependent permissive loci

    PubMed Central

    Philippe, Claude; Vargas-Landin, Dulce B; Doucet, Aurélien J; van Essen, Dominic; Vera-Otarola, Jorge; Kuciak, Monika; Corbin, Antoine; Nigumann, Pilvi; Cristofari, Gaël

    2016-01-01

    LINE-1 (L1) retrotransposons represent approximately one sixth of the human genome, but only the human-specific L1HS-Ta subfamily acts as an endogenous mutagen in modern humans, reshaping both somatic and germline genomes. Due to their high levels of sequence identity and the existence of many polymorphic insertions absent from the reference genome, the transcriptional activation of individual genomic L1HS-Ta copies remains poorly understood. Here we comprehensively mapped fixed and polymorphic L1HS-Ta copies in 12 commonly-used somatic cell lines, and identified transcriptional and epigenetic signatures allowing the unambiguous identification of active L1HS-Ta copies in their genomic context. Strikingly, only a very restricted subset of L1HS-Ta loci - some being polymorphic among individuals - significantly contributes to the bulk of L1 expression, and these loci are differentially regulated among distinct cell lines. Thus, our data support a local model of L1 transcriptional activation in somatic cells, governed by individual-, locus-, and cell-type-specific determinants. DOI: http://dx.doi.org/10.7554/eLife.13926.001 PMID:27016617

  18. Genomic and genotyping characterization of haplotype-based polymorphic microsatellites in Prunus

    USDA-ARS's Scientific Manuscript database

    Efficient utilization of microsatellites in genetic studies remains impeded largely due to the unknown status of their primer reliability, chromosomal location, and allele polymorphism. Discovery and characterization of microsatellite polymorphisms in a taxon will disclose the unknowns and gain new ...

  19. PeanutDB: an integrated bioinformatics web portal for Arachis hypogaea transcriptomics

    PubMed Central

    2012-01-01

    Background: The peanut (Arachis hypogaea) is an important crop cultivated worldwide for oil production and food sources. Its complex genetic architecture (e.g., the large and tetraploid genome possibly due to unique cross of wild diploid relatives and subsequent chromosome duplication: 2n = 4x = 40, AABB, 2800 Mb) presents a major challenge for its genome sequencing and makes it a less-studied crop. Without a doubt, transcriptome sequencing is the most effective way to harness the genome structure and gene expression dynamics of this non-model species that has a limited genomic resource. Description: With the development of next generation sequencing technologies such as 454 pyro-sequencing and Illumina sequencing by synthesis, the transcriptomics data of peanut is rapidly accumulated in both the public databases and private sectors. Integrating 187,636 Sanger reads (103,685,419 bases), 1,165,168 Roche 454 reads (333,862,593 bases) and 57,135,995 Illumina reads (4,073,740,115 bases), we generated the first release of our peanut transcriptome assembly that contains 32,619 contigs. We provided EC, KEGG and GO functional annotations to these contigs and detected SSRs, SNPs and other genetic polymorphisms for each contig. Based on both open-source and our in-house tools, PeanutDB presents many seamlessly integrated web interfaces that allow users to search, filter, navigate and visualize easily the whole transcript assembly, its annotations and detected polymorphisms and simple sequence repeats. For each contig, sequence alignment is presented in both bird’s-eye view and nucleotide level resolution, with colorfully highlighted regions of mismatches, indels and repeats that facilitate close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors. Conclusion: As a public genomic database that integrates peanut transcriptome data from different sources, PeanutDB (http://bioinfolab.muohio.edu/txid3818v1) provides the Peanut research community with an easy-to-use web portal that will definitely facilitate genomics research and molecular breeding in this less-studied crop. PMID:22712730

  20. Simple Sequence Repeats in Escherichia coli: Abundance, Distribution, Composition, and Polymorphism

    PubMed Central

    Gur-Arie, Riva; Cohen, Cyril J.; Eitan, Yuval; Shelef, Leora; Hallerman, Eric M.; Kashi, Yechezkel

    2000-01-01

    Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.[The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF209020–209030 and AF209508–209518.] PMID:10645951

  1. Immune and stress responses in oysters with insights on adaptation.

    PubMed

    Guo, Ximing; He, Yan; Zhang, Linlin; Lelong, Christophe; Jouaux, Aude

    2015-09-01

    Oysters are representative bivalve molluscs that are widely distributed in world oceans. As successful colonizers of estuaries and intertidal zones, oysters are remarkably resilient against harsh environmental conditions including wide fluctuations in temperature and salinity as well as prolonged air exposure. Oysters have no adaptive immunity but can thrive in microbe-rich estuaries as filter-feeders. These unique adaptations make oysters interesting models to study the evolution of host-defense systems. Recent advances in genomic studies including sequencing of the oyster genome have provided insights into oyster's immune and stress responses underlying their amazing resilience. Studies show that the oyster genomes are highly polymorphic and complex, which may be key to their resilience. The oyster genome has a large gene repertoire that is enriched for immune and stress response genes. Thousands of genes are involved in oyster's immune and stress responses, through complex interactions, with many gene families expanded showing high sequence, structural and functional diversity. The high diversity of immune receptors and effectors may provide oysters with enhanced specificity in immune recognition and response to cope with diverse pathogens in the absence of adaptive immunity. Some members of expanded immune gene families have diverged to function at different temperatures and salinities or assumed new roles in abiotic stress response. Most canonical innate immunity pathways are conserved in oysters and supported by a large number of diverse and often novel genes. The great diversity in immune and stress response genes exhibited by expanded gene families as well as high sequence and structural polymorphisms may be central to oyster's adaptation to highly stressful and widely changing environments. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Construction of a High-Density American Cranberry (Vaccinium macrocarpon Ait.) Composite Map Using Genotyping-by-Sequencing for Multi-pedigree Linkage Mapping.

    PubMed

    Schlautman, Brandon; Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Iorizzo, Massimo; Polashock, James; Grygleski, Edward; Vorsa, Nicholi; Zalapa, Juan

    2017-04-03

    The American cranberry (Vaccinium macrocarpon Ait.) is a recently domesticated, economically important, fruit crop with limited molecular resources. New genetic resources could accelerate genetic gain in cranberry through characterization of its genomic structure and by enabling molecular-assisted breeding strategies. To increase the availability of cranberry genomic resources, genotyping-by-sequencing (GBS) was used to discover and genotype thousands of single nucleotide polymorphisms (SNPs) within three interrelated cranberry full-sib populations. Additional simple sequence repeat (SSR) loci were added to the SNP datasets and used to construct bin maps for the parents of the populations, which were then merged to create the first high-density cranberry composite map containing 6073 markers (5437 SNPs and 636 SSRs) on 12 linkage groups (LGs) spanning 1124 cM. Interestingly, higher rates of recombination were observed in maternal than paternal gametes. The large number of markers in common (mean of 57.3) and the high degree of observed collinearity (mean Pair-wise Spearman rank correlations >0.99) between the LGs of the parental maps demonstrates the utility of GBS in cranberry for identifying polymorphic SNP loci that are transferable between pedigrees and populations in future trait-association studies. Furthermore, the high-density of markers anchored within the component maps allowed identification of segregation distortion regions, placement of centromeres on each of the 12 LGs, and anchoring of genomic scaffolds. Collectively, the results represent an important contribution to the current understanding of cranberry genomic structure and to the availability of molecular tools for future genetic research and breeding efforts in cranberry. Copyright © 2017 Schlautman et al.

  3. Restriction Site Tiling Analysis: accurate discovery and quantitative genotyping of genome-wide polymorphisms using nucleotide arrays

    PubMed Central

    2010-01-01

    High-throughput genotype data can be used to identify genes important for local adaptation in wild populations, phenotypes in lab stocks, or disease-related traits in human medicine. Here we advance microarray-based genotyping for population genomics with Restriction Site Tiling Analysis. The approach simultaneously discovers polymorphisms and provides quantitative genotype data at 10,000s of loci. It is highly accurate and free from ascertainment bias. We apply the approach to uncover genomic differentiation in the purple sea urchin. PMID:20403197

  4. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium

    PubMed Central

    Pajuelo, Mónica J.; Eguiluz, María; Dahlstrom, Eric; Requena, David; Guzmán, Frank; Ramirez, Manuel; Sheen, Patricia; Frace, Michael; Sammons, Scott; Cama, Vitaliano; Anzick, Sarah; Bruno, Dan; Mahanty, Siddhartha; Wilkins, Patricia; Nash, Theodore; Gonzalez, Armando; García, Héctor H.; Gilman, Robert H.; Porcella, Steve; Zimic, Mirko

    2015-01-01

    Background: Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and transmission pathways in endemic areas thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen. Methods: For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS) and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples. Results: The predicted size of the hybrid (proglottid genome combined with cyst genome) T. solium genome was 111 MB with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt) were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites. Conclusions/Significance: The availability of draft genomes for T. solium represents a significant step towards understanding the biology of the parasite. We report here a set of T. solium polymorphic microsatellite markers that appear promising for genetic epidemiology studies. PMID:26697878
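
    Illustrative sketch (not the authors' pipeline): the Results above count microsatellites with 5 or more repeats of mono- to hexa-nucleotide motifs. One simple way to scan a sequence for such perfect tandem tracts is a back-referencing regular expression for each motif length. The function name `find_ssrs` and its parameters are hypothetical, and imperfect or interrupted repeats are ignored.

```python
import re


def find_ssrs(seq, min_repeats=5, max_motif=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    seq is an uppercase DNA string; a tract is reported when a motif of
    length 1..max_motif occurs at least min_repeats times in a row.
    """
    hits = []
    for k in range(1, max_motif + 1):
        # One motif of length k followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA"),
            # since those tracts are already reported at the shorter length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: a 7-copy mononucleotide tract and a 5-copy dinucleotide tract.
toy = "GGC" + "A" * 7 + "CGT" + "AC" * 5 + "GG"
print(find_ssrs(toy))  # [(3, 'A', 7), (13, 'AC', 5)]
```

    Reported positions are 0-based offsets into the input string. Genome-scale screens typically rely on dedicated tools and also handle both strands and ambiguous bases, which this sketch does not.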

  5. Markers and mapping revisited: finding your gene.

    PubMed

    Jones, Neil; Ougham, Helen; Thomas, Howard; Pasakinskiene, Izolda

    2009-01-01

    This paper is an update of our earlier review (Jones et al., 1997, Markers and mapping: we are all geneticists now. New Phytologist 137: 165-177), which dealt with the genetics of mapping, in terms of recombination as the basis of the procedure, and covered some of the first generation of markers, including restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNA (RAPDs), simple sequence repeats (SSRs) and quantitative trait loci (QTLs). In the intervening decade there have been numerous developments in marker science with many new systems becoming available, which are herein described: cleavage amplification polymorphism (CAP), sequence-specific amplification polymorphism (S-SAP), inter-simple sequence repeat (ISSR), sequence tagged site (STS), sequence characterized amplification region (SCAR), selective amplification of microsatellite polymorphic loci (SAMPL), single nucleotide polymorphism (SNP), expressed sequence tag (EST), sequence-related amplified polymorphism (SRAP), target region amplification polymorphism (TRAP), microarrays, diversity arrays technology (DArT), single-strand conformation polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE) and methylation-sensitive PCR. In addition there has been an explosion of knowledge and databases in the area of genomics and bioinformatics. The number of flowering plant ESTs is c. 19 million and counting, with all the opportunity that this provides for gene-hunting, while the survey of bioinformatics and computer resources points to a rapid growth point for future activities in unravelling and applying the burst of new information on plant genomes. A case study is presented on tracking down a specific gene (stay-green (SGR), a post-transcriptional senescence regulator) using the full suite of mapping tools and comparative mapping resources. 
    We end with a brief speculation on how genome analysis may progress into the future of this highly dynamic arena of plant science.

  6. Meta-analysis of genome-wide association studies identifies common susceptibility polymorphisms for colorectal and endometrial cancer near SH2B3 and TSHZ1

    PubMed Central

    Cheng, Timothy HT; Thompson, Deborah; Painter, Jodie; O’Mara, Tracy; Gorman, Maggie; Martin, Lynn; Palles, Claire; Jones, Angela; Buchanan, Daniel D.; Ko Win, Aung; Hopper, John; Jenkins, Mark; Lindor, Noralane M.; Newcomb, Polly A.; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Giles, Graham G; Pharoah, Paul; Peto, Julian; Cox, Angela; Swerdlow, Anthony; Couch, Fergus; Cunningham, Julie M; Goode, Ellen L; Winham, Stacey J; Lambrechts, Diether; Fasching, Peter; Burwinkel, Barbara; Brenner, Hermann; Brauch, Hiltrud; Chang-Claude, Jenny; Salvesen, Helga B.; Kristensen, Vessela; Darabi, Hatef; Li, Jingmei; Liu, Tao; Lindblom, Annika; Hall, Per; de Polanco, Magdalena Echeverry; Sans, Monica; Carracedo, Angel; Castellvi-Bel, Sergi; Rojas-Martinez, Augusto; Aguiar Jnr, Samuel; Teixeira, Manuel R.; Dunning, Alison M; Dennis, Joe; Otton, Geoffrey; Proietto, Tony; Holliday, Elizabeth; Attia, John; Ashton, Katie; Scott, Rodney J; McEvoy, Mark; Dowdy, Sean C; Fridley, Brooke L; Werner, Henrica MJ; Trovik, Jone; Njolstad, Tormund S; Tham, Emma; Mints, Miriam; Runnebaum, Ingo; Hillemanns, Peter; Dörk, Thilo; Amant, Frederic; Schrauwen, Stefanie; Hein, Alexander; Beckmann, Matthias W; Ekici, Arif; Czene, Kamila; Meindl, Alfons; Bolla, Manjeet K; Michailidou, Kyriaki; Tyrer, Jonathan P; Wang, Qin; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Annibali, Daniela; Depreeuw, Jeroen; Al-Tassan, Nada A.; Harris, Rebecca; Meyer, Brian F.; Whiffin, Nicola; Hosking, Fay J; Kinnersley, Ben; Farrington, Susan M.; Timofeeva, Maria; Tenesa, Albert; Campbell, Harry; Haile, Robert W.; Hodgson, Shirley; Carvajal-Carmona, Luis; Cheadle, Jeremy P.; Easton, Douglas; Dunlop, Malcolm; Houlston, Richard; Spurdle, Amanda; Tomlinson, Ian

    2015-01-01

    High-risk mutations in several genes predispose to both colorectal cancer (CRC) and endometrial cancer (EC). We therefore hypothesised that some lower-risk genetic variants might also predispose to both CRC and EC. Using CRC and EC genome-wide association series, totalling 13,265 cancer cases and 40,245 controls, we found that the protective allele [G] at one previously-identified CRC polymorphism, rs2736100 near TERT, was associated with EC risk (odds ratio (OR) = 1.08, P = 0.000167); this polymorphism influences the risk of several other cancers. A further CRC polymorphism near TERC also showed evidence of association with EC (OR = 0.92; P = 0.03). Overall, however, there was no good evidence that the set of CRC polymorphisms was associated with EC risk, and neither of two previously-reported EC polymorphisms was associated with CRC risk. A combined analysis revealed one genome-wide significant polymorphism, rs3184504, on chromosome 12q24 (OR = 1.10, P = 7.23 × 10^−9) with shared effects on CRC and EC risk. This polymorphism, a missense variant in the gene SH2B3, is also associated with haematological and autoimmune disorders, suggesting that it influences cancer risk through the immune response. Another polymorphism, rs12970291 near gene TSHZ1, was associated with both CRC and EC (OR = 1.26, P = 4.82 × 10^−8), with the alleles showing opposite effects on the risks of the two cancers. PMID:26621817

  7. A genomic landscape of mitochondrial DNA insertions in the pig nuclear genome provides evolutionary signatures of interspecies admixture.

    PubMed

    Schiavo, Giuseppina; Hoffmann, Orsolya Ivett; Ribani, Anisa; Utzeri, Valerio Joe; Ghionda, Marco Ciro; Bertolini, Francesca; Geraci, Claudia; Bovo, Samuele; Fontanesi, Luca

    2017-10-01

    Nuclear DNA sequences of mitochondrial origin (numts) are derived by insertion of mitochondrial DNA (mtDNA) into the nuclear genome. In this study, we provide, for the first time, a genome picture of numts inserted in the pig nuclear genome. The Sus scrofa reference nuclear genome (Sscrofa10.2) was aligned with circularized and consensus mtDNA sequences using LAST software. A total of 430 numt sequences that may represent 246 different numt integration events (57 numt regions determined by at least two numt sequences and 189 singletons) were identified, covering about 0.0078% of the nuclear genome. Numt integration events were correlated (0.99) to the chromosome length. The longest numt sequence (about 11 kbp) was located on SSC2. Six numts were sequenced and PCR amplified in pigs of European commercial and local pig breeds, of the Chinese Meishan breed and in European wild boars. Three of them were polymorphic for the presence or absence of the insertion. Surprisingly, the estimated age of insertion of two of the three polymorphic numts was more ancient than that of the speciation time of the Sus scrofa, supporting that these polymorphic sites originated from interspecies admixture that contributed to shape the pig genome. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  8. A survey of copy number variation in the porcine genome detected from whole-genome sequence

    USDA-ARS's Scientific Manuscript database

    An important challenge to post-genomic biology is relating observed phenotypic variation to the underlying genotypic variation. Genome-wide association studies (GWAS) have made thousands of connections between single nucleotide polymorphisms (SNPs) and phenotypes, implicating regions of the genome t...

  9. Genomic Prediction for Quantitative Traits Is Improved by Mapping Variants to Gene Ontology Categories in Drosophila melanogaster

    PubMed Central

    Edwards, Stefan M.; Sørensen, Izel F.; Sarup, Pernille; Mackay, Trudy F. C.; Sørensen, Peter

    2016-01-01

    Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, plant and animal breeding, and adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weighs the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits. PMID:27235308

  10. Impact of gamma rays on the Phaffia rhodozyma genome revealed by RAPD-PCR.

    PubMed

    Najafi, N; Hosseini, Ramin; Ahmadi, Ar

    2011-12-01

    Phaffia rhodozyma is a red yeast which produces astaxanthin as the major carotenoid pigment. Astaxanthin is thought to reduce the incidence of cancer and degenerative diseases in man. It also enhances the immune response and acts as a free-radical quencher, a precursor of vitamin A, or a pigment involved in the visual attraction of animals as mating partners. The impact of gamma irradiation was studied on the Phaffia rhodozyma genome. Ten mutant strains, designated Gam1-Gam10, were obtained using gamma irradiation. Ten decamer random amplified polymorphic DNA (RAPD) primers were employed to assess genetic changes. Nine primers revealed scorable polymorphisms and a total of 95 band positions were scored; amongst which 38 bands (37.5%) were polymorphic. Primer F with 3 bands and primer J20 with 13 bands produced the lowest and the highest number of bands, respectively. Primer A16 produced the highest number of polymorphic bands (70% polymorphism) and primer F showed the lowest number of polymorphic bands (0% polymorphism). Genetic distances were calculated using Jaccard's coefficient and the UPGMA method. A dendrogram was created using SPSS (version 11.5) and the strains were clustered into four groups. RAPD markers could distinguish between the parental and the mutant strains of P. rhodozyma. RAPD technique showed that some changes had occurred in the genome of the mutated strains. This technique demonstrated the capability to differentiate between the parental and the mutant strains.
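
    The band-scoring arithmetic mentioned in this abstract (percentage of polymorphic bands, Jaccard's coefficient) can be sketched as follows. This is a minimal illustration only: the presence/absence profiles are invented for the example and are not data from the study, and the function names are hypothetical.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient for two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same band positions")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # With no bands in either profile the coefficient is undefined;
    # returning 0.0 here is an arbitrary convention for this sketch.
    return shared / either if either else 0.0


def percent_polymorphic(profiles):
    """Percentage of band positions that differ between at least two strains."""
    n_positions = len(profiles[0])
    polymorphic = sum(
        1 for i in range(n_positions) if len({p[i] for p in profiles}) > 1
    )
    return 100.0 * polymorphic / n_positions


# Invented profiles: 1 = band present, 0 = band absent.
parent = [1, 1, 0, 1, 1]
mutant_a = [1, 0, 0, 1, 1]
mutant_b = [1, 1, 1, 1, 0]

sim_a = jaccard_similarity(parent, mutant_a)
sim_b = jaccard_similarity(parent, mutant_b)
poly = percent_polymorphic([parent, mutant_a, mutant_b])
```

    A genetic distance is commonly taken as one minus the similarity; a clustering step such as UPGMA would then operate on the resulting distance matrix.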

  11. The Joint Effects of Background Selection and Genetic Recombination on Local Gene Genealogies

    PubMed Central

    Zeng, Kai; Charlesworth, Brian

    2011-01-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data. PMID:21705759

  12. Investigating intra-host and intra-herd sequence diversity of foot-and-mouth disease virus.

    PubMed

    King, David J; Freimanis, Graham L; Orton, Richard J; Waters, Ryan A; Haydon, Daniel T; King, Donald P

    2016-10-01

    Due to the poor fidelity of the enzymes involved in RNA genome replication, foot-and-mouth disease (FMD) virus samples comprise unique polymorphic populations. In this study, deep sequencing was utilised to characterise the diversity of FMD virus (FMDV) populations in 6 infected cattle present on a single farm during the series of outbreaks in the UK in 2007. A novel RT-PCR method was developed to amplify a 7.6 kb nucleotide fragment encompassing the polyprotein coding region of the FMDV genome. Illumina sequencing of each sample identified the fine polymorphic structures at each nucleotide position, from consensus level changes to variants present at a 0.24% frequency. These data were used to investigate population dynamics of FMDV at both herd and host levels, evaluate the impact of host on the viral swarm structure and to identify transmission links with viruses recovered from other farms in the same series of outbreaks. In 7 samples, from 6 different animals, a total of 5 consensus level variants were identified, in addition to 104 sub-consensus variants of which 22 were shared between 2 or more animals. Further analysis revealed differences in swarm structures from samples derived from the same animal suggesting the presence of distinct viral populations evolving independently at different lesion sites within the same infected animal. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  13. The joint effects of background selection and genetic recombination on local gene genealogies.

    PubMed

    Zeng, Kai; Charlesworth, Brian

    2011-09-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.

  14. Parallel evolution of Batesian mimicry supergene in two Papilio butterflies, P. polytes and P. memnon

    PubMed Central

    Itoh, Takehiko

    2018-01-01

    Batesian mimicry protects animals from predators when mimics resemble distasteful models. The female-limited Batesian mimicry in Papilio butterflies is controlled by a supergene locus switching mimetic and nonmimetic forms. In Papilio polytes, recent studies revealed that a highly diversified region (HDR) containing doublesex (dsx-HDR) constitutes the supergene with dimorphic alleles and is likely maintained by a chromosomal inversion. In the closely related Papilio memnon, which exhibits a similar mimicry polymorphism, we performed whole-genome sequence analyses in 11 butterflies, which revealed a nearly identical dsx-HDR containing three genes (dsx, Nach-like, and UXT) with dimorphic sequences strictly associated with the mimetic/nonmimetic phenotypes. In addition, expression of these genes, except that of Nach-like in female hind wings, showed differences correlated with phenotype. The dimorphic dsx-HDR in P. memnon is maintained without a chromosomal inversion, suggesting that a separate mechanism causes and maintains allelic divergence in these genes. More abundant accumulation of transposable elements and repetitive sequences in the dsx-HDR than in other genomic regions may contribute to the suppression of chromosomal recombination. Gene trees for Dsx, Nach-like, and UXT indicated that mimetic alleles evolved independently in the two Papilio species. These results suggest that the genomic region involving the above three genes has repeatedly diverged so that two allelic sequences of this region function as developmental switches for mimicry polymorphism in the two Papilio species. The supergene structures revealed here suggest that independent evolutionary processes with different genetic mechanisms have led to parallel evolution of similar female-limited polymorphisms underlying Batesian mimicry in Papilio butterflies. PMID:29675466

  15. Methylation-sensitive amplified polymorphism-based genome-wide analysis of cytosine methylation profiles in Nicotiana tabacum cultivars.

    PubMed

    Jiao, J; Wu, J; Lv, Z; Sun, C; Gao, L; Yan, X; Cui, L; Tang, Z; Yan, B; Jia, Y

    2015-11-26

    This study aimed to investigate cytosine methylation profiles in different tobacco (Nicotiana tabacum) cultivars grown in China. Methylation-sensitive amplified polymorphism was used to analyze genome-wide global methylation profiles in four tobacco cultivars (Yunyan 85, NC89, K326, and Yunyan 87). Amplicons with methylated C motifs were cloned by reamplified polymerase chain reaction, sequenced, and analyzed. The results show that geographical location had a greater effect on methylation patterns in the tobacco genome than did sampling time. Analysis of the CG dinucleotide distribution in methylation-sensitive polymorphic restriction fragments suggested that a CpG dinucleotide cluster-enriched area is a possible site of cytosine methylation in the tobacco genome. The sequence alignments of the Nia1 gene (that encodes nitrate reductase) in Yunyan 87 in different regions indicate that a C-T transition might be responsible for the tobacco phenotype. T-C nucleotide replacement might also be responsible for the tobacco phenotype and may be influenced by geographical location.

  16. Genome-wide cross-amplification of domestic sheep microsatellites in bighorn sheep and mountain goats.

    PubMed

    Poissant, J; Shafer, A B A; Davis, C S; Mainguy, J; Hogg, J T; Côté, S D; Coltman, D W

    2009-07-01

    We tested for cross-species amplification of microsatellite loci located throughout the domestic sheep (Ovis aries) genome in two North American mountain ungulates (bighorn sheep, Ovis canadensis, and mountain goats, Oreamnos americanus). We identified 247 new polymorphic markers in bighorn sheep (≥ 3 alleles in one of two study populations) and 149 in mountain goats (≥ 2 alleles in a single study population) using 648 and 576 primer pairs, respectively. Our efforts increased the number of available polymorphic microsatellite markers to 327 for bighorn sheep and 180 for mountain goats. The average distance between successive polymorphic bighorn sheep and mountain goat markers inferred from the Australian domestic sheep genome linkage map (mean ± 1 SD) was 11.9 ± 9.2 and 15.8 ± 13.8 centimorgans, respectively. The development of genomic resources in these wildlife species enables future studies of the genetic architecture of trait variation. © 2009 Blackwell Publishing Ltd.

  17. [Analysis on genetic polymorphism of 5 STR loci selected from X chromosome].

    PubMed

    Liu, Qi-ji; Gong, Yao-qin; Zhang, Xi-yu; Gao, Gui-min; Li, Jiang-xia; Guo, Yi-shou

    2005-02-01

    To select short tandem repeats (STRs) from the X chromosome. An STR is a universal genetic marker that is highly polymorphic and stably inherited in the human genome. It is a specific DNA segment whose core repeat unit is 2-6 base pairs long, and it is an ideal DNA marker for linkage analysis and gene mapping. In this study, 8 short tandem repeats were selected from two genomic clones on the X chromosome by using BCM Search Launcher. Primers amplifying the STR loci were designed by using Primer 3.0 according to the unique sequences flanking the STRs. Polymorphisms of the short tandem repeats in a Chinese population were evaluated by PCR amplification and PAGE. Five of these STRs were polymorphic. A chi-square test indicated that the distribution of genotypes agreed with Hardy-Weinberg equilibrium (P>0.05). Five polymorphic short tandem repeats have been identified on chromosome X and will be useful for linkage analysis and gene mapping.
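
    A chi-square test of genotype counts against Hardy-Weinberg expectations, of the kind this abstract reports, can be sketched as below. The sketch assumes diploid genotypes at a multi-allelic locus (for an X-linked marker that would mean female samples); the genotype counts and the function name are invented for illustration and are not taken from the study.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotype_counts):
    """Return (chi-square statistic, degrees of freedom) for a Hardy-Weinberg test.

    genotype_counts maps an unordered allele pair, e.g. ("A", "B"), to the
    number of individuals observed with that genotype.
    """
    n = sum(genotype_counts.values())
    allele_counts = Counter()
    observed = Counter()
    for (a1, a2), count in genotype_counts.items():
        allele_counts[a1] += count
        allele_counts[a2] += count
        observed[tuple(sorted((a1, a2)))] += count

    alleles = sorted(allele_counts)
    freq = {a: allele_counts[a] / (2 * n) for a in alleles}

    chi2 = 0.0
    for a1, a2 in combinations_with_replacement(alleles, 2):
        p = freq[a1] * freq[a2]
        # Homozygotes are expected at p_i^2, heterozygotes at 2 * p_i * p_j.
        expected = n * (p if a1 == a2 else 2 * p)
        chi2 += (observed[(a1, a2)] - expected) ** 2 / expected

    k = len(alleles)
    # Genotype classes minus one, minus the (k - 1) estimated allele frequencies.
    return chi2, k * (k - 1) // 2


# Invented counts for a two-allele locus.
chi2_stat, dof = hwe_chi_square({("A", "A"): 30, ("A", "B"): 40, ("B", "B"): 30})
```

    With one degree of freedom, the 5% critical value of the chi-square distribution is about 3.84, so the invented counts above (statistic 4.0) would depart from equilibrium, whereas the loci in the abstract did not (P>0.05).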

  18. Development and Evaluation of a Genome-Wide 6K SNP Array for Diploid Sweet Cherry and Tetraploid Sour Cherry

    PubMed Central

    Peace, Cameron; Bassil, Nahla; Main, Dorrie; Ficklin, Stephen; Rosyara, Umesh R.; Stegmeir, Travis; Sebolt, Audrey; Gilmore, Barbara; Lawley, Cindy; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Iezzoni, Amy

    2012-01-01

    High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence from which were identified genome-wide SNPs for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry. 
    This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome structure investigation, and genetic diversity assessment in this diploid-tetraploid crop group. PMID:23284615

  19. Genome-wide identification, phylogenetic classification, and exon-intron structure characterisation of the tubulin and actin genes in flax (Linum usitatissimum).

    PubMed

    Pydiura, Nikolay; Pirko, Yaroslav; Galinousky, Dmitry; Postovoitova, Anastasiia; Yemets, Alla; Kilchevsky, Aleksandr; Blume, Yaroslav

    2018-06-08

    Flax (Linum usitatissimum L.) is a valuable food and fiber crop cultivated for its quality fiber and seed oil. α-, β-, γ-tubulins and actins are the main structural proteins of the cytoskeleton. α- and γ-tubulin and actin genes have not been characterized yet in the flax genome. In this study, we have identified 6 α-tubulin genes, 13 β-tubulin genes, 2 γ-tubulin genes, and 15 actin genes in the flax genome and analysed the phylogenetic relationships between flax and A. thaliana tubulin and actin genes. Six α-tubulin genes are represented by 3 paralogous pairs; among the 13 β-tubulin genes, 7 different isotypes can be distinguished, 6 of which are encoded by two paralogous genes each. γ-tubulin is represented by a paralogous pair of genes, one of which may not be functional. Fifteen actin genes represent 7 paralogous pairs - 7 actin isotypes and a sequentially duplicated copy of one of the genes of one of the isotypes. Exon-intron structure analysis has shown intron length polymorphism within the β-tubulin genes and intron number variation among the α-tubulin genes: 3 or 4 introns are found in two or four genes, respectively. Intron positioning occurs at conservative sites, as observed in numerous other plant species. Flax actin genes show both intron length polymorphisms and variation in the number of introns, which may be 2 or 3. These data will be useful to support further studies on the specificity, functioning, regulation and evolution of the flax cytoskeleton proteins. This article is protected by copyright. All rights reserved.

  20. Topological impact of noncanonical DNA structures on Klenow fragment of DNA polymerase.

    PubMed

    Takahashi, Shuntaro; Brazier, John A; Sugimoto, Naoki

    2017-09-05

    Noncanonical DNA structures that stall DNA replication can cause errors in genomic DNA. Here, we investigated how the noncanonical structures formed by sequences in genes associated with a number of diseases impacted DNA polymerization by the Klenow fragment of DNA polymerase. Replication of a DNA sequence forming an i-motif from a telomere, hypoxia-induced transcription factor, and an insulin-linked polymorphic region was effectively inhibited. On the other hand, replication of a mixed-type G-quadruplex (G4) from a telomere was less inhibited than that of the antiparallel type or parallel type. Interestingly, the i-motif was a better inhibitor of replication than were mixed-type G4s or hairpin structures, even though all had similar thermodynamic stabilities. These results indicate that both the stability and topology of structures formed in DNA templates impact the processivity of a DNA polymerase. This suggests that i-motif formation may trigger genomic instability by stalling the replication of DNA, causing intractable diseases.

  1. Topological impact of noncanonical DNA structures on Klenow fragment of DNA polymerase

    PubMed Central

    Takahashi, Shuntaro; Brazier, John A.; Sugimoto, Naoki

    2017-01-01

    Noncanonical DNA structures that stall DNA replication can cause errors in genomic DNA. Here, we investigated how the noncanonical structures formed by sequences in genes associated with a number of diseases impacted DNA polymerization by the Klenow fragment of DNA polymerase. Replication of a DNA sequence forming an i-motif from a telomere, hypoxia-induced transcription factor, and an insulin-linked polymorphic region was effectively inhibited. On the other hand, replication of a mixed-type G-quadruplex (G4) from a telomere was less inhibited than that of the antiparallel type or parallel type. Interestingly, the i-motif was a better inhibitor of replication than were mixed-type G4s or hairpin structures, even though all had similar thermodynamic stabilities. These results indicate that both the stability and topology of structures formed in DNA templates impact the processivity of a DNA polymerase. This suggests that i-motif formation may trigger genomic instability by stalling the replication of DNA, causing intractable diseases. PMID:28827350

  2. Exploiting genotyping by sequencing to characterize the genomic structure of the American cranberry through high-density linkage mapping.

    PubMed

    Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Schlautman, Brandon; Deutsch, Joseph; Salazar, Walter; Hernandez-Ochoa, Miguel; Grygleski, Edward; Steffan, Shawn; Iorizzo, Massimo; Polashock, James; Vorsa, Nicholi; Zalapa, Juan

    2016-06-13

    The application of genotyping by sequencing (GBS) approaches, combined with data imputation methodologies, is narrowing the genetic knowledge gap between major and understudied, minor crops. GBS is an excellent tool to characterize the genomic structure of recently domesticated (~200 years) and understudied species, such as cranberry (Vaccinium macrocarpon Ait.), by generating large numbers of markers for genomic studies such as genetic mapping. We identified 10842 potentially mappable single nucleotide polymorphisms (SNPs) in a cranberry pseudo-testcross population wherein 5477 SNPs and 211 short sequence repeats (SSRs) were used to construct a high density linkage map in cranberry of which a total of 4849 markers were mapped. Recombination frequency, linkage disequilibrium (LD), and segregation distortion at the genomic level in the parental and integrated linkage maps were characterized for the first time in cranberry. SSR markers, used as the backbone in the map, revealed high collinearity with previously published linkage maps. The 4849 point map consisted of twelve linkage groups spanning 1112 cM, which anchored 2381 nuclear scaffolds accounting for ~13 Mb of the estimated 470 Mb cranberry genome. Bin mapping identified 592 and 672 unique bins in the parentals and a total of 1676 unique marker positions in the integrated map. Synteny analyses comparing the order of anchored cranberry scaffolds to their homologous positions in kiwifruit, grape, and coffee genomes provided initial evidence of homology between cranberry and closely related species. GBS data was used to rapidly saturate the cranberry genome with markers in a pseudo-testcross population. Collinearity between the present saturated genetic map and previous cranberry SSR maps suggests that the SNP locations represent accurate marker order and chromosome structure of the cranberry genome. SNPs greatly improved current marker genome coverage, which allowed for genome-wide structure investigations such as segregation distortion, recombination, linkage disequilibrium, and synteny analyses. In the future, GBS can be used to accelerate cranberry molecular breeding through QTL mapping and genome-wide association studies (GWAS).

  3. Effects of functional polymorphisms on beef carcass merit

    USDA-ARS's Scientific Manuscript database

    To develop a resource to identify polymorphisms present in common beef cattle breeds, and relate those polymorphisms to phenotypic differences, low-coverage genomic sequence was obtained on 186 purebred bulls from 15 predominant breeds in the United States, and 84 crossbred sons of these bulls. The...

  4. Genomic relations among 31 species of Mammillaria haworth (Cactaceae) using random amplified polymorphic DNA.

    PubMed

    Mattagajasingh, Ilwola; Mukherjee, Arup Kumar; Das, Premananda

    2006-01-01

    Thirty-one species of Mammillaria were selected to study the molecular phylogeny using random amplified polymorphic DNA (RAPD) markers. The high amount of mucilage (gelling polysaccharides) present in Mammillaria was a major obstacle in isolating good quality genomic DNA. The CTAB (cetyl trimethyl ammonium bromide) method was modified to obtain good quality genomic DNA. Twenty-two random decamer primers resulted in 621 bands, all of which were polymorphic. The similarity matrix value varied from 0.109 to 0.622, indicating wide variability among the studied species. The dendrogram obtained from the unweighted pair group method using arithmetic averages (UPGMA) analysis revealed that some of the species did not follow the conventional classification. The present work shows the usefulness of RAPD markers for genetic characterization to establish phylogenetic relations among Mammillaria species.

  5. Meraculous2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2014-06-01

    meraculous2 is a whole genome shotgun assembler for short-reads that is capable of assembling large, polymorphic genomes with modest computational requirements. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (deBruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. Additional features include (1) handling of allelic variation using "bubble" structures within the deBruijn graph, (2) gap closing of repetitive and low quality regions using localized assemblies, and (3) an improved scaffolding algorithm that produces more complete assemblies without compromising on scaffolding accuracy.
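
    The idea of restricting a k-mer graph to k-mers with unique extensions can be illustrated with a toy sketch. This is not Meraculous code: it ignores base quality scores, reverse complements and count thresholds, and the function name and reads are hypothetical, chosen only to show how a fork (such as the start of an allelic "bubble") is excluded.

```python
from collections import defaultdict


def unique_extension_kmers(reads, k):
    """Map each k-mer that has exactly one observed preceding base and
    exactly one observed following base to that (left, right) pair."""
    left = defaultdict(set)
    right = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if i > 0:
                left[kmer].add(read[i - 1])
            if i + k < len(read):
                right[kmer].add(read[i + k])
    return {
        kmer: (next(iter(left[kmer])), next(iter(right[kmer])))
        for kmer in set(left) & set(right)
        if len(left[kmer]) == 1 and len(right[kmer]) == 1
    }


# Two invented reads that differ at one base, mimicking two alleles.
reads = ["ACGTAC", "ACGTTC"]
kept = unique_extension_kmers(reads, 3)
```

    In this toy input the k-mer CGT is followed by A in one read and by T in the other, so it is dropped, while GTA and GTT each have a single extension on both sides and are kept.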

  6. The variant call format and VCFtools.

    PubMed

    Danecek, Petr; Auton, Adam; Abecasis, Goncalo; Albers, Cornelis A; Banks, Eric; DePristo, Mark A; Handsaker, Robert E; Lunter, Gerton; Marth, Gabor T; Sherry, Stephen T; McVean, Gilean; Durbin, Richard

    2011-08-01

    The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API. http://vcftools.sourceforge.net
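
    To make the record layout concrete, the sketch below parses the eight fixed, tab-separated columns of a VCF data line (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) in plain Python. It does not use VCFtools or its Perl API; the function names are hypothetical, the example lines are illustrative, and header lines beginning with "#" are assumed to have been skipped by the caller.

```python
def parse_vcf_line(line):
    """Parse the eight fixed columns of one VCF data line into a dict."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]

    info_dict = {}
    if info != ".":
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            # Keys without "=" are flags; store them as True.
            info_dict[key] = value if sep else True

    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }


def is_snp(record):
    """True if REF and every ALT allele are single nucleotides."""
    return len(record["ref"]) == 1 and all(
        len(a) == 1 and a in "ACGTN" for a in record["alt"]
    )


snp = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB")
indel = parse_vcf_line("20\t1234567\t.\tGTC\tG,GTCT\t50\tPASS\tNS=3")
```

    INFO values are kept as strings here; a fuller parser would use the types declared in the file's header lines.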

  7. Impact of recombination on polymorphism of genes encoding Kunitz-type protease inhibitors in the genus Solanum.

    PubMed

    Speranskaya, Anna S; Krinitsina, Anastasia A; Kudryavtseva, Anna V; Poltronieri, Palmiro; Santino, Angelo; Oparina, Nina Y; Dmitriev, Alexey A; Belenikin, Maxim S; Guseva, Marina A; Shevelev, Alexei B

    2012-08-01

    The group of Kunitz-type protease inhibitors (KPI) from potato is encoded by a polymorphic family of multiple allelic and non-allelic genes. The previous explanations of the KPI variability were based on the hypothesis of random mutagenesis as a key factor of KPI polymorphism. KPI-A genes from the genomes of Solanum tuberosum cv. Istrinskii and the wild species Solanum palustre were amplified by PCR with subsequent cloning in plasmids. True KPI sequences were derived from comparison of the cloned copies. "Hot spots" of recombination in KPI genes were independently identified by DnaSP 4.0 and TOPALi v2.5 software. The KPI-A sequence from potato cv. Istrinskii was found to be 100% identical to the gene from Solanum nigrum. This fact illustrates a high degree of similarity of KPI genes in the genus Solanum. Pairwise comparison of KPI A and B genes unambiguously showed a non-uniform extent of polymorphism at different nt positions. Moreover, the occurrence of substitutions was not random along the strand. Taken together, these facts contradict the traditional hypothesis of random mutagenesis as a principal source of KPI gene polymorphism. The experimentally found mosaic structure of KPI genes in both plants studied is consistent with the hypothesis suggesting recombination of ancestral genes. The same mechanism was proposed earlier for other resistance-conferring genes in the nightshade family (Solanaceae). Based on the data obtained, we searched for potential motifs of site-specific binding with plant DNA recombinases. During this work, we analyzed the sequencing data reported by the Potato Genome Sequencing Consortium (PGSC), 2011 and found considerable inconsistency in their data concerning the number, location, and orientation of KPI genes of groups A and B. The key role of recombination rather than random point mutagenesis in KPI polymorphism was demonstrated for the first time. Copyright © 2012 Elsevier Masson SAS. All rights reserved.

  8. Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations.

    PubMed

    Feusier, Julie; Witherspoon, David J; Scott Watkins, W; Goubert, Clément; Sasani, Thomas A; Jorde, Lynn B

    2017-01-01

    Polymorphic human Alu elements are excellent tools for assessing population structure, and new retrotransposition events can contribute to disease. Next-generation sequencing has greatly increased the potential to discover Alu elements in human populations, and various sequencing and bioinformatics methods have been designed to tackle the problem of detecting these highly repetitive elements. However, current techniques for Alu discovery may miss rare, polymorphic Alu elements. Combining multiple discovery approaches may provide a better profile of the polymorphic Alu mobilome. Alu Yb8/9 elements have been a focus of our recent studies as they are young subfamilies (~2.3 million years old) that contribute ~30% of recent polymorphic Alu retrotransposition events. Here, we update our ME-Scan methods for detecting Alu elements and apply these methods to discover new insertions in a large set of individuals with diverse ancestral backgrounds. We identified 5,288 putative Alu insertion events, including several hundred novel Alu Yb8/9 elements from 213 individuals from 18 diverse human populations. Hundreds of these loci were specific to continental populations, and 23 non-reference population-specific loci were validated by PCR. We provide high-quality sequence information for 68 rare Alu Yb8/9 elements, of which 11 have hallmarks of an active source element. Our subfamily distribution of rare Alu Yb8/9 elements is consistent with previous datasets, and may be representative of rare loci. We also find that while ME-Scan and low-coverage, whole-genome sequencing (WGS) detect different Alu elements in 41 1000 Genomes individuals, the two methods yield similar population structure results. Current in-silico methods for Alu discovery may miss rare, polymorphic Alu elements. Therefore, using multiple techniques can provide a more accurate profile of Alu elements in individuals and populations. We improved our false-negative rate as an indicator of sample quality for future ME-Scan experiments. In conclusion, we demonstrate that ME-Scan is a good supplement for next-generation sequencing methods and is well-suited for population-level analyses.

  9. [Research advances of genomic GYP coding MNS blood group antigens].

    PubMed

    Liu, Chang-Li; Zhao, Wei-Jun

    2012-02-01

    The MNS blood group system includes more than 40 antigens, of which M, N, S and s are the most significant. The antigenic determinants of the M and N antigens lie at the top of glycophorin A (GPA) on the surface of red blood cells, while those of the S and s antigens lie at the top of glycophorin B (GPB). The GYPA gene encoding GPA and the GYPB gene encoding GPB are located on the long arm of chromosome 4 and share 95% sequence homology; both genes lie close to the GYPE gene, which has no expressed product. Together these three genes form the "GYPA-GYPB-GYPE" structure referred to as the GYP genome. This review focuses on the molecular basis of genomic GYP and on how variation in the GYP genome gives rise to the diversity of MNS blood group antigens. The molecular basis of Miltenberger hybrid glycophorin polymorphism is discussed in particular.

  10. Computational intelligence in bioinformatics: SNP/haplotype data in genetic association study for common diseases.

    PubMed

    Kelemen, Arpad; Vasilakos, Athanasios V; Liang, Yulan

    2009-09-01

    Comprehensive evaluation of common genetic variation through association of single-nucleotide polymorphism (SNP) structure with common complex disease on a genome-wide scale is currently an active area of human genome research, owing to the recent development of the Human Genome Project and HapMap Project. Computational science, which includes computational intelligence (CI), has recently become the third method of scientific enquiry besides theory and experimentation. Interest in developing and applying CI to disease mapping with SNP and haplotype data has grown rapidly. Some recent studies have demonstrated the promise and importance of CI for common complex diseases in genomic association studies using SNP/haplotype data, especially for tackling challenges such as gene-gene and gene-environment interactions and the notorious "curse of dimensionality" problem. This review covers recent developments in CI approaches for complex diseases in genetic association studies with SNP/haplotype data.

  11. A Genomics-Based Model for Prediction of Severe Bioprosthetic Mitral Valve Calcification.

    PubMed

    Ponasenko, Anastasia V; Khutornaya, Maria V; Kutikhin, Anton G; Rutkovskaya, Natalia V; Tsepokina, Anna V; Kondyukova, Natalia V; Yuzhalin, Arseniy E; Barbarash, Leonid S

    2016-08-31

    Severe bioprosthetic mitral valve calcification is a significant problem in cardiovascular surgery. Unfortunately, clinical markers did not demonstrate efficacy in prediction of severe bioprosthetic mitral valve calcification. Here, we examined whether a genomics-based approach is efficient in predicting the risk of severe bioprosthetic mitral valve calcification. A total of 124 consecutive Russian patients who underwent mitral valve replacement surgery were recruited. We investigated the associations of the inherited variation in innate immunity, lipid metabolism and calcium metabolism genes with severe bioprosthetic mitral valve calcification. Genotyping was conducted utilizing the TaqMan assay. Eight gene polymorphisms were significantly associated with severe bioprosthetic mitral valve calcification and were therefore included into stepwise logistic regression which identified male gender, the T/T genotype of the rs3775073 polymorphism within the TLR6 gene, the C/T genotype of the rs2229238 polymorphism within the IL6R gene, and the A/A genotype of the rs10455872 polymorphism within the LPA gene as independent predictors of severe bioprosthetic mitral valve calcification. The developed genomics-based model had fair predictive value with area under the receiver operating characteristic (ROC) curve of 0.73. In conclusion, our genomics-based approach is efficient for the prediction of severe bioprosthetic mitral valve calcification.
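
    The area under the ROC curve quoted above can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The following minimal Python sketch computes that quantity directly; the function name and the toy scores are hypothetical and are not taken from the study.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count as half."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks, for illustration only.
toy_cases = [0.9, 0.6, 0.4]
toy_controls = [0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_cases, toy_controls)
print(round(toy_auc, 3))
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why 0.73 is described as fair.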

  12. A Genomics-Based Model for Prediction of Severe Bioprosthetic Mitral Valve Calcification

    PubMed Central

    Ponasenko, Anastasia V.; Khutornaya, Maria V.; Kutikhin, Anton G.; Rutkovskaya, Natalia V.; Tsepokina, Anna V.; Kondyukova, Natalia V.; Yuzhalin, Arseniy E.; Barbarash, Leonid S.

    2016-01-01

    Severe bioprosthetic mitral valve calcification is a significant problem in cardiovascular surgery. Unfortunately, clinical markers did not demonstrate efficacy in prediction of severe bioprosthetic mitral valve calcification. Here, we examined whether a genomics-based approach is efficient in predicting the risk of severe bioprosthetic mitral valve calcification. A total of 124 consecutive Russian patients who underwent mitral valve replacement surgery were recruited. We investigated the associations of the inherited variation in innate immunity, lipid metabolism and calcium metabolism genes with severe bioprosthetic mitral valve calcification. Genotyping was conducted utilizing the TaqMan assay. Eight gene polymorphisms were significantly associated with severe bioprosthetic mitral valve calcification and were therefore included into stepwise logistic regression which identified male gender, the T/T genotype of the rs3775073 polymorphism within the TLR6 gene, the C/T genotype of the rs2229238 polymorphism within the IL6R gene, and the A/A genotype of the rs10455872 polymorphism within the LPA gene as independent predictors of severe bioprosthetic mitral valve calcification. The developed genomics-based model had fair predictive value with area under the receiver operating characteristic (ROC) curve of 0.73. In conclusion, our genomics-based approach is efficient for the prediction of severe bioprosthetic mitral valve calcification. PMID:27589735

  13. Genome-wide single-nucleotide polymorphism arrays demonstrate high fidelity of multiple displacement-based whole-genome amplification.

    PubMed

    Tzvetkov, Mladen V; Becker, Christian; Kulle, Bettina; Nürnberg, Peter; Brockmöller, Jürgen; Wojnowski, Leszek

    2005-02-01

    Whole-genome DNA amplification by multiple displacement (MD-WGA) is a promising tool to obtain sufficient DNA amounts from samples of limited quantity. Using Affymetrix' GeneChip Human Mapping 10K Arrays, we investigated the accuracy and allele amplification bias in DNA samples subjected to MD-WGA. We observed an excellent concordance (99.95%) between single-nucleotide polymorphisms (SNPs) called both in the nonamplified and the corresponding amplified DNA. This concordance was only 0.01% lower than the intra-assay reproducibility of the genotyping technique used. However, MD-WGA failed to amplify an estimated 7% of polymorphic loci. Due to the algorithm used to call genotypes, this was detected only for heterozygous loci. We achieved a 4.3-fold reduction of noncalled SNPs by combining the results from two independent MD-WGA reactions. This indicated that inter-reaction variations rather than specific chromosomal loci reduced the efficiency of MD-WGA. Consistently, we detected no regions of reduced amplification, with the exception of several SNPs located near chromosomal ends. Altogether, despite a substantial loss of polymorphic sites, MD-WGA appears to be the current method of choice to amplify genomic DNA for array-based SNP analyses. The number of nonamplified loci can be substantially reduced by amplifying each DNA sample in duplicate.
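
    The duplicate-reaction strategy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the "NC" no-call code, the function names, the conflict rule and the toy genotypes are all hypothetical and do not reproduce the authors' genotype-calling pipeline.

```python
NO_CALL = "NC"  # hypothetical code for a SNP that yielded no genotype


def concordance(reference, test):
    """Fraction of identical genotypes among SNPs called in both samples."""
    pairs = [(r, t) for r, t in zip(reference, test) if NO_CALL not in (r, t)]
    if not pairs:
        raise ValueError("no SNPs were called in both samples")
    return sum(r == t for r, t in pairs) / len(pairs)


def merge_duplicates(run_a, run_b):
    """Fill a no-call from the other reaction; treat conflicting calls as no-calls."""
    merged = []
    for a, b in zip(run_a, run_b):
        if a == NO_CALL:
            merged.append(b)
        elif b == NO_CALL or a == b:
            merged.append(a)
        else:
            merged.append(NO_CALL)
    return merged


def no_call_rate(calls):
    """Proportion of SNPs without a genotype call."""
    return calls.count(NO_CALL) / len(calls)


# Toy genotypes for one non-amplified sample and two amplified replicates.
native = ["AA", "AB", "BB", "AB", "AA", "BB"]
wga_1 = ["AA", "NC", "BB", "AB", "AA", "NC"]
wga_2 = ["AA", "AB", "BB", "NC", "AA", "NC"]
merged = merge_duplicates(wga_1, wga_2)
print(no_call_rate(wga_1), no_call_rate(merged), concordance(native, merged))
```

    In this toy example a locus is lost only when both replicates fail, which is the behaviour expected if failures vary between reactions rather than being tied to specific loci.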

  14. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    PubMed

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided includes genome sequence, gene models, functional annotation, and polymorphic loci. Additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT projects ( http://www.transplantdb.org ).

  15. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    PubMed

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. Copyright © 2016 Teng et al.

  16. Characterization of a highly polymorphic region 5′ to JH in the human immunoglobulin heavy chain

    PubMed Central

    Silva, Alcino J.; Johnson, John P.; White, Raymond L.

    1987-01-01

    A cloned DNA segment 1.25 kilobases (kb) upstream from the joining segments of the human heavy chain immunoglobulin gene revealed extensive polymorphic variation at this locus, and the polymorphic pattern was stably transmitted to the next generation. Genomic restriction analysis showed that the polymorphism was caused by insertions/deletions within an MspI/BamHI fragment. Sequencing of one allele, 848 base pairs (bp) long, revealed eleven 50-base-pair tandem repeats. A second allele, 648 bp long, was cloned from a human genomic cosmid library, sequenced, and found to contain four fewer repeats than the first allele. A survey of 186 chromosomes from unrelated individuals of primarily northern European descent revealed at least six alleles. PMID:2884636
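
    The allele sizes quoted above are consistent with a difference of a whole number of 50-bp repeat units. A small Python check of that arithmetic follows; the function name is hypothetical, while the lengths and the repeat size are the figures given in the abstract.

```python
REPEAT_UNIT_BP = 50  # repeat length reported in the abstract


def repeat_difference(allele_a_bp, allele_b_bp, unit_bp=REPEAT_UNIT_BP):
    """Number of whole repeat units separating two allele lengths."""
    diff = abs(allele_a_bp - allele_b_bp)
    if diff % unit_bp != 0:
        raise ValueError("length difference is not a whole number of repeat units")
    return diff // unit_bp


fewer_repeats = repeat_difference(848, 648)
print(fewer_repeats)  # 4, matching the four fewer repeats reported
```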

  17. Genomic Diversity of Erwinia carotovora subsp. carotovora and Its Correlation with Virulence

    PubMed Central

    Yap, Mee-Ngan; Barak, Jeri D.; Charkowski, Amy O.

    2004-01-01

    We used genetic and biochemical methods to examine the genomic diversity of the enterobacterial plant pathogen Erwinia carotovora subsp. carotovora. The results obtained with each method showed that E. carotovora subsp. carotovora strains isolated from one ecological niche, potato plants, are surprisingly diverse compared to related pathogens. A comparison of 23 partial mdh sequences revealed a maximum pairwise difference of 10.49% and an average pairwise difference of 2.13%, values which are much greater than the maximum variation (1.81%) and average variation (0.75%) previously reported for Escherichia coli. Pulsed-field gel electrophoresis analysis of I-CeuI-digested genomic DNA revealed seven rrn operons in all E. carotovora subsp. carotovora strains examined except strain WPP17, which had only six copies. We identified 26 I-CeuI restriction fragment length polymorphism patterns and observed significant polymorphism in fragment sizes ranging from 100 to 450 kb for all strains. We detected large plasmids in two strains, including the model strain E. carotovora subsp. carotovora 71. The two least virulent strains had an unusual chromosomal structure, suggesting that a particular pulsotype is correlated with virulence. To compare chromosomal organization of multiple enterobacterial genomes, several genes were mapped onto I-CeuI fragments. We identified portions of the genome that appear to be conserved across enterobacteria and portions that have undergone genome rearrangements. We found that the least virulent strain, WPP17, failed to oxidize cellobiose and was missing several hrp and hrc genes. The unexpected variability among isolates obtained from clonal hosts in one region and in one season suggests that factors other than the host plant, potato, drive the evolution of this common environmental bacterium and key plant pathogen. PMID:15128563
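
    The maximum and average pairwise differences reported for the mdh sequences are summary statistics over all pairs of aligned sequences. A minimal Python sketch of that calculation is shown below, assuming equal-length, gap-free alignments; the function names and the three toy sequences are hypothetical and are not the study's data.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


def pairwise_summary(sequences):
    """Return (maximum, mean) percent difference over all sequence pairs."""
    diffs = [percent_difference(a, b) for a, b in combinations(sequences, 2)]
    return max(diffs), sum(diffs) / len(diffs)


# Toy aligned fragments, for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
max_diff, mean_diff = pairwise_summary(toy_alignment)
print(max_diff, round(mean_diff, 2))
```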

  18. Impact of genomic polymorphism on arterial hypertension after aortic coarctation repair.

    PubMed

    Hager, Alfred; Bildau, Judith; Kreuder, Joachim; Kaemmerer, Harald; Hess, John

    2011-08-18

    Even after repair of aortic coarctation without restenosis there is a high incidence of arterial hypertension. This study was performed to assess the contribution of several inherited gene polymorphisms that are known to be related to essential hypertension. We investigated 122 patients (46 women) aged 17-72 years, 2-27 years after repair of isolated aortic coarctation without restenosis. Genomic polymorphisms of angiotensin converting enzyme (ACE I/D), angiotensinogen (AGT, c.704C>T), angiotensin II receptor type 1 (AGTR1, c.1166A>C), aldosterone synthase (CYP11B2, c.-344C>T), endothelin 1 (EDN1, EDN1/ex5-c.5665G>T), G protein (GNB3, c.825C>T), G protein-coupled receptor kinase 4 (GRK4, c.679C>T), fibrillin 1 (FBN1, VNTR(TAAA)) and two polymorphisms each of the β1 adrenoreceptor (ADRB1, c.145G>A and c.1165C>G), β2 adrenoreceptor (ADRB2, c.46A>G and c.79C>G), and endothelial NO synthase (NOS3, intron 4 I/D and NOS3, c.894G>T) were determined by PCR amplification and fragment length analysis. Patients were classified as "normotensive" if they were not on antihypertensive drugs and showed normal blood pressure on both ambulatory measurement and exercise testing. None of the investigated genomic polymorphisms could be related to hypertension. The only associations found were that patients with the ACE I/I genotype had less pronounced nocturnal dipping and patients with an ADRB1 c.1165 C/C genotype had higher systolic and mean blood pressure at night. Development of late hypertension after aortic coarctation repair could not be related to the investigated genomic polymorphisms. The correlation of the ACE I/D and ADRB1 c.1165C>G polymorphisms with nocturnal dipping and night-time blood pressure needs further confirmation. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  19. Diversity Arrays Technology (DArT) for whole-genome profiling of barley

    PubMed Central

    Wenzl, Peter; Carling, Jason; Kudrna, David; Jaccoud, Damian; Huttner, Eric; Kleinhofs, Andris; Kilian, Andrzej

    2004-01-01

    Diversity Arrays Technology (DArT) can detect and type DNA variation at several hundred genomic loci in parallel without relying on sequence information. Here we show that it can be effectively applied to genetic mapping and diversity analyses of barley, a species with a 5,000-Mbp genome. We tested several complexity reduction methods and selected two that generated the most polymorphic genomic representations. Arrays containing individual fragments from these representations generated DArT fingerprints with a genotype call rate of 98.0% and a scoring reproducibility of at least 99.8%. The fingerprints grouped barley lines according to known genetic relationships. To validate the Mendelian behavior of DArT markers, we constructed a genetic map for a cross between cultivars Steptoe and Morex. Nearly all polymorphic array features could be incorporated into one of seven linkage groups (98.8%). The resulting map comprised ≈385 unique DArT markers and spanned 1,137 centimorgans. A comparison with the restriction fragment length polymorphism-based framework map indicated that the quality of the DArT map was equivalent, if not superior, to that of the framework map. These results highlight the potential of DArT as a generic technique for genome profiling in the context of molecular breeding and genomics. PMID:15192146

  20. Sequence Polymorphisms and Structural Variations among Four Grapevine (Vitis vinifera L.) Cultivars Representing Sardinian Agriculture

    PubMed Central

    Mercenaro, Luca; Nieddu, Giovanni; Porceddu, Andrea; Pezzotti, Mario; Camiolo, Salvatore

    2017-01-01

    The genetic diversity among grapevine (Vitis vinifera L.) cultivars that underlies differences in agronomic performance and wine quality reflects the accumulation of single nucleotide polymorphisms (SNPs) and small indels as well as larger genomic variations. A combination of high throughput sequencing and mapping against the grapevine reference genome allows the creation of comprehensive sequence variation maps. We used next generation sequencing and bioinformatics to generate an inventory of SNPs and small indels in four widely cultivated Sardinian grape cultivars (Bovale sardo, Cannonau, Carignano and Vermentino). More than 3,200,000 SNPs were identified with high statistical confidence. Some of the SNPs caused the appearance of premature stop codons and thus identified putative pseudogenes. The analysis of SNP distribution along chromosomes led to the identification of large genomic regions with uninterrupted series of homozygous SNPs. We used a digital comparative genomic hybridization approach to identify 6526 genomic regions with significant differences in copy number among the four cultivars compared to the reference sequence, including 81 regions shared between all four cultivars and 4953 specific to single cultivars (representing 1.2 and 75.9% of total copy number variation, respectively). Reads mapping at a distance that was not compatible with the insert size were used to identify a dataset of putative large deletions with cultivar Cannonau revealing the highest number. The analysis of genes mapping to these regions provided a list of candidates that may explain some of the phenotypic differences among the Bovale sardo, Cannonau, Carignano and Vermentino cultivars. PMID:28775732
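
    The use of read pairs mapping at a distance incompatible with the insert size can be illustrated with a short Python sketch. The cutoff rule (mean plus a number of standard deviations), the function name and the toy coordinates are assumptions made for illustration; the abstract does not state the criteria actually used.

```python
def discordant_pairs(pairs, mean_insert, sd_insert, n_sd=3.0):
    """Return read pairs whose mapped span exceeds the expected insert size.

    pairs: (leftmost_start, rightmost_end) reference coordinates per read pair.
    A span far larger than the library insert size suggests that the sample
    lacks sequence present in the reference, i.e. a putative deletion.
    """
    cutoff = mean_insert + n_sd * sd_insert
    return [(left, right) for left, right in pairs if (right - left) > cutoff]


# Toy library: mean insert 300 bp, standard deviation 30 bp (hypothetical values).
toy_pairs = [(1000, 1290), (2000, 2310), (5000, 7400), (5050, 7460)]
candidates = discordant_pairs(toy_pairs, mean_insert=300, sd_insert=30)
print(candidates)
```

    Several discordant pairs spanning the same interval, as in the last two toy pairs, give stronger support for a deletion than a single pair.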

  1. Single nucleotide polymorphism discovery via genotyping by sequencing to assess population genetic structure and recurrent polyploidization in Andropogon gerardii.

    PubMed

    McAllister, Christine A; Miller, Allison J

    2016-07-01

    Autopolyploidy, genome duplication within a single lineage, can result in multiple cytotypes within a species. Geographic distributions of cytotypes may reflect the evolutionary history of autopolyploid formation and subsequent population dynamics including stochastic (drift) and deterministic (differential selection among cytotypes) processes. Here, we used a population genomic approach to investigate whether autopolyploidy occurred once or multiple times in Andropogon gerardii, a widespread, North American grass with two predominant cytotypes. Genotyping by sequencing was used to identify single nucleotide polymorphisms (SNPs) in individuals collected from across the geographic range of A. gerardii. Two independent approaches to SNP calling were used: the reference-free UNEAK pipeline and a reference-guided approach based on the sequenced Sorghum bicolor genome. SNPs generated using these pipelines were analyzed independently with genetic distance and clustering. Analyses of the two SNP data sets showed very similar patterns of population-level clustering of A. gerardii individuals: a cluster of A. gerardii individuals from the southern Plains, a northern Plains cluster, and a western cluster. Groupings of individuals corresponded to geographic localities regardless of cytotype: 6x and 9x individuals from the same geographic area clustered together. SNPs generated using reference-guided and reference-free pipelines in A. gerardii yielded unique subsets of genomic data. Both data sets suggest that the 9x cytotype in A. gerardii likely evolved multiple times from 6x progenitors across the range of the species. Genomic approaches like GBS and diverse bioinformatics pipelines used here facilitate evolutionary analyses of complex systems with multiple ploidy levels. © 2016 Botanical Society of America.

  2. Similar Efficacies of Selection Shape Mitochondrial and Nuclear Genes in Both Drosophila melanogaster and Homo sapiens.

    PubMed

    Cooper, Brandon S; Burrus, Chad R; Ji, Chao; Hahn, Matthew W; Montooth, Kristi L

    2015-08-21

    Deleterious mutations contribute to polymorphism even when selection effectively prevents their fixation. The efficacy of selection in removing deleterious mitochondrial mutations from populations depends on the effective population size (Ne) of the mitochondrial DNA and the degree to which a lack of recombination magnifies the effects of linked selection. Using complete mitochondrial genomes from Drosophila melanogaster and nuclear data available from the same samples, we reexamine the hypothesis that nonrecombining animal mitochondrial DNA harbors an excess of deleterious polymorphisms relative to the nuclear genome. We find no evidence of recombination in the mitochondrial genome, and the much-reduced level of mitochondrial synonymous polymorphism relative to nuclear genes is consistent with a reduction in Ne. Nevertheless, we find that the neutrality index, a measure of the excess of nonsynonymous polymorphism relative to the neutral expectation, is only weakly significantly different between mitochondrial and nuclear loci. This difference is likely the result of the larger proportion of beneficial mutations in X-linked relative to autosomal loci, and we find little to no difference between mitochondrial and autosomal neutrality indices. Reanalysis of published data from Homo sapiens reveals a similar lack of a difference between the two genomes, although previous studies have suggested a strong difference in both species. Thus, despite a smaller Ne, mitochondrial loci of both flies and humans appear to experience similar efficacies of purifying selection as do loci in the recombining nuclear genome. Copyright © 2015 Cooper et al.
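
    The abstract does not spell out how the neutrality index is computed. Assuming the standard definition from a McDonald-Kreitman table, NI = (Pn/Ps) / (Dn/Ds), a minimal Python sketch looks like this; the function name and the counts are hypothetical.

```python
def neutrality_index(pn, ps, dn, ds):
    """Neutrality index from McDonald-Kreitman counts (assumed standard definition).

    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    dn, ds: nonsynonymous and synonymous fixed differences between species.
    NI > 1 indicates an excess of nonsynonymous polymorphism relative to the
    neutral expectation; NI < 1 indicates an excess of nonsynonymous divergence.
    """
    if ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    # (pn / ps) / (dn / ds), rearranged to use a single division
    return (pn * ds) / (ps * dn)


# Hypothetical counts, for illustration only.
toy_ni = neutrality_index(pn=10, ps=20, dn=5, ds=20)
print(toy_ni)  # 2.0
```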

  3. Assessment of genetic diversity, population structure and relationships in Indian and non-Indian genotypes of finger millet (Eleusine coracana (L.) Gaertn) using genomic SSR markers.

    PubMed

    Ramakrishnan, M; Antony Ceasar, S; Duraipandiyan, V; Al-Dhabi, N A; Ignacimuthu, S

    2016-01-01

    We evaluated the genetic variation and population structure in Indian and non-Indian genotypes of finger millet using 87 genomic SSR primers. A total of 128 finger millet genotypes were collected and genomic DNA was isolated. Eighty-seven genomic SSR primers with 60-70% GC content were used for PCR analysis of the 128 finger millet genotypes. The PCR products were separated and visualized on a 6% polyacrylamide gel followed by silver staining. The data were used to estimate major allele frequency using PowerMarker v3.0. Dendrograms were constructed based on Jaccard's similarity coefficient. Statistical fitness and population structure analyses were performed to assess the genetic diversity. The mean major allele frequency was 0.92; the mean number of polymorphic alleles was 2.13 per primer and 1.45 per genotype; the average polymorphism was 59.94% per primer and the average PIC value was 0.44 per primer. Indian genotypes produced 0.21 more alleles than non-Indian genotypes. Gene diversity ranged from 0.02 to 0.35. The average heterozygosity was 0.11, close to 100% homozygosity. The highest inbreeding coefficient was observed with SSR marker UGEP67. Jaccard's similarity coefficient ranged from 0.011 to 0.836, with the highest value (0.836) between genotypes DPI009-04 and GPU-45. Indian genotypes were placed in Eleusine coracana major cluster (EcMC) 1 along with 6 non-Indian genotypes. AMOVA showed that molecular variance among genotypes from various geographical regions was 4%; among populations it was 3% and within populations it was 93%. PCA scatter plot analysis showed that GPU-28, GPU-45 and DPI009-04 clustered closely along the first component axis. In structural analysis, the genotypes were divided into three subpopulations (SP1, SP2 and SP3). All three subpopulations had an admixture of alleles and no pure line was observed. These analyses confirmed that all the genotypes were genetically diverse and had been grouped based on their geographic regions.
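
    Jaccard's similarity coefficient, used above for the dendrograms, compares two genotypes by the alleles (bands) they share. A minimal Python sketch follows, assuming each genotype is scored as a 0/1 presence vector over the same set of alleles; the function name and the toy vectors are hypothetical.

```python
def jaccard_similarity(bands_a, bands_b):
    """Jaccard coefficient for two 0/1 allele-presence vectors.

    Shared presences divided by positions where at least one genotype shows
    the allele; joint absences are ignored.
    """
    if len(bands_a) != len(bands_b):
        raise ValueError("both genotypes must be scored for the same alleles")
    shared = sum(1 for a, b in zip(bands_a, bands_b) if a and b)
    either = sum(1 for a, b in zip(bands_a, bands_b) if a or b)
    if either == 0:
        raise ValueError("no allele is present in either genotype")
    return shared / either


# Toy presence/absence scores for two genotypes across six alleles.
genotype_1 = [1, 1, 0, 1, 0, 1]
genotype_2 = [1, 0, 0, 1, 1, 1]
toy_similarity = jaccard_similarity(genotype_1, genotype_2)
print(toy_similarity)
```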

  4. On the allopolyploid origin and genome structure of the closely related species Hordeum secalinum and Hordeum capense inferred by molecular karyotyping.

    PubMed

    Cuadrado, Ángeles; de Bustos, Alfredo; Jouve, Nicolás

    2017-08-01

    To provide additional information to the many phylogenetic analyses conducted within Hordeum, here the origin and interspecific affinities of the allotetraploids Hordeum secalinum and Hordeum capense were analysed by molecular karyotyping. Karyotypes were determined using genomic in situ hybridization (GISH) to distinguish the sub-genomes and , plus fluorescence in situ hybridization (FISH)/non-denaturing (ND)-FISH to determine the distribution of ten tandem repetitive DNA sequences and thus provide chromosome markers. Each chromosome pair in the six accessions analysed was identified, allowing the establishment of homologous and putative homeologous relationships. The low-level polymorphism observed among the H. secalinum accessions contrasted with the divergence recorded for the sub-genome of the H. capense accessions. Although accession H335 carries an intergenomic translocation, its chromosome structure was indistinguishable from that of H. secalinum. Hordeum secalinum and H. capense accession H335 share a hybrid origin involving Hordeum marinum subsp. gussoneanum as the genome donor and an unidentified genome progenitor. Hordeum capense accession BCC2062 either diverged, with remodelling of the sub-genome, or its genome was donated by a now extinct ancestor. A scheme of probable evolution shows the intricate pattern of relationships among the Hordeum species carrying the genome (including all H. marinum taxa and the hexaploid Hordeum brachyantherum). © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.

  5. Genome-wide survey and analysis of microsatellites in giant panda (Ailuropoda melanoleuca), with a focus on the applications of a novel microsatellite marker system.

    PubMed

    Huang, Jie; Li, Yu-Zhi; Du, Lian-Ming; Yang, Bo; Shen, Fu-Jun; Zhang, He-Min; Zhang, Zhi-He; Zhang, Xiu-Yue; Yue, Bi-Song

    2015-02-07

    The giant panda (Ailuropoda melanoleuca) is a critically endangered species endemic to China. Microsatellites are among the most popular molecular markers and have proven effective for estimating population size, paternity testing, and assessing genetic diversity in critically endangered species. The availability of the complete giant panda genome sequence provides the opportunity to carry out genome-wide scans for all types of microsatellite markers, opening the way for the analysis and development of microsatellites in giant panda. By in silico mining of the whole genome sequence, we identified microsatellites in the giant panda genome and analyzed their frequency and distribution in different genomic regions. Based on our search criteria, a repertoire of 855,058 SSRs was detected, with mono-nucleotides being the most abundant. SSRs were found in all genomic regions and were more abundant in non-coding regions than in coding regions. A total of 160 primer pairs were designed to screen for polymorphic microsatellites using the selected tetranucleotide microsatellite sequences. Fifty-one novel polymorphic tetranucleotide microsatellite loci were discovered by genotyping blood DNA from 22 captive giant pandas. Finally, a total of 15 markers, which showed good polymorphism, stability, and repeatability in faecal samples, were used to establish a novel microsatellite marker system for giant panda. A genotyping database for Chengdu captive giant pandas (n = 57) was set up using this standardized system. In addition, a universal individual identification method was established and genetic diversity was analysed as applications of this marker system. The microsatellite abundance and diversity were characterized in the giant panda genome. A total of 154,677 tetranucleotide microsatellites were identified, and 15 of them were found to be polymorphic and stable loci. The individual identification method and the genetic diversity analysis method in this study provide adequate material for future studies of giant panda.
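
    In silico microsatellite mining of the kind described above amounts to scanning a sequence for short motifs repeated in tandem. The Python sketch below uses a regular expression with a backreference to do this for one motif length; the minimum of five repeats, the function names and the toy sequence are assumptions, since the abstract does not give the search criteria used.

```python
import re


def has_shorter_period(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. AAAA or ATAT)."""
    size = len(motif)
    return any(
        size % d == 0 and motif == motif[:d] * (size // d) for d in range(1, size)
    )


def find_ssrs(sequence, motif_len=4, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats of one motif length."""
    # ([ACGT]{k}) captures a motif; \1{n-1,} requires it to repeat in tandem.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if has_shorter_period(motif):
            continue  # belongs to a shorter repeat class, not this motif length
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Toy sequence: a tetranucleotide repeat followed by a mononucleotide run.
toy_sequence = "GG" + "AGAT" * 6 + "CCTT" + "A" * 24 + "GC"
toy_hits = find_ssrs(toy_sequence)
print(toy_hits)
```

    Dedicated SSR-mining tools also handle imperfect and compound repeats, which this sketch deliberately ignores.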

  6. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing.

    PubMed

    Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I

    2018-01-01

    Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.

  7. Genomic diversity of cercarial clones of Himasthla elongata (Trematoda, Echinostomatidae) determined with AFLP technique.

    PubMed

    Galaktionov, N K; Podgornaya, O I; Strelkov, P P; Galaktionov, K V

    2016-12-01

    The aim of this study was to reveal genomic diversity formed during parthenogenetic reproduction of rediae of the trematode Himasthla elongata in its molluskan host Littorina littorea. We applied amplified fragment length polymorphism (AFLP) to determine the genomic diversity of individual cercariae within the clone, that is, the infrapopulation of parthenogenetic progeny in a single molluskan host. The level of genomic diversity of particular cercariae isolates from a single clone, detected with EcoRI/MseI AFLP reaction, was significantly lower than the variability of cercariae from different clones. The presence of intraclonal genomic diversity indicates a nonsexual shuffle of alleles during parthenogenesis in the rediae of H. elongata. The obtained polymorphic AFLP fragments were long enough to detect the sequences that may be responsible for clonal genomic variability. Based on this, AFLP can be recommended as a tool for the study of genetic mechanisms of this variability.
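
The within-clone versus between-clone comparison described in this record can be illustrated with a minimal sketch. The abstract does not name the distance measure the authors used, so the Jaccard distance on presence/absence band profiles is an assumption here, and the band profiles below are hypothetical values made up for illustration, not data from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 AFLP band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("band profiles must have the same number of loci")
    shared = sum(1 for x, y in zip(a, b) if x and y)   # bands present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # bands present in at least one
    return 0.0 if either == 0 else 1.0 - shared / either


def mean_pairwise(group_a, group_b=None):
    """Mean distance within one group, or between two groups if group_b is given."""
    if group_b is None:
        pairs = list(combinations(group_a, 2))
    else:
        pairs = [(x, y) for x in group_a for y in group_b]
    if not pairs:
        raise ValueError("need at least one pair of profiles")
    return sum(jaccard_distance(x, y) for x, y in pairs) / len(pairs)


# Hypothetical band profiles (1 = band present) for cercariae from two clones.
clone_1 = [[1, 1, 0, 1, 0, 1], [1, 1, 0, 1, 0, 1], [1, 1, 0, 1, 1, 1]]
clone_2 = [[0, 1, 1, 0, 1, 1], [0, 1, 1, 0, 1, 0], [0, 1, 1, 0, 1, 1]]

within = (mean_pairwise(clone_1) + mean_pairwise(clone_2)) / 2
between = mean_pairwise(clone_1, clone_2)
print(f"mean within-clone distance:  {within:.3f}")
print(f"mean between-clone distance: {between:.3f}")
```

With real data, a permutation test or AMOVA would be needed to judge whether the within-clone and between-clone distances differ significantly; this sketch only computes the two means.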

  8. M13-Tailed Simple Sequence Repeat (SSR) Markers in Studies of Genetic Diversity and Population Structure of Common Oat Germplasm.

    PubMed

    Onyśk, Agnieszka; Boczkowska, Maja

    2017-01-01

    Simple Sequence Repeat (SSR) markers are one of the most frequently used molecular markers in studies of crop diversity and population structure. This is due to their uniform distribution in the genome, the high polymorphism, reproducibility, and codominant character. Additional advantages are the possibility of automatic analysis and simple interpretation of the results. The M13 tagged PCR reaction significantly reduces the costs of analysis by the automatic genetic analyzers. Here, we also disclose a short protocol of SSR data analysis.

  9. From conservation genetics to conservation genomics: a genome-wide assessment of blue whales (Balaenoptera musculus) in Australian feeding aggregations

    PubMed Central

    Sandoval-Castillo, Jonathan; Jenner, K. Curt S.; Gill, Peter C.; Jenner, Micheline-Nicole M.; Morrice, Margaret G.

    2018-01-01

    Genetic datasets of tens of markers have been superseded through next-generation sequencing technology with genome-wide datasets of thousands of markers. Genomic datasets improve our power to detect low population structure and identify adaptive divergence. The increased population-level knowledge can inform the conservation management of endangered species, such as the blue whale (Balaenoptera musculus). In Australia, there are two known feeding aggregations of the pygmy blue whale (B. m. brevicauda) which have shown no evidence of genetic structure based on a small dataset of 10 microsatellites and mtDNA. Here, we develop and implement a high-resolution dataset of 8294 genome-wide filtered single nucleotide polymorphisms, the first of its kind for blue whales. We use these data to assess whether the Australian feeding aggregations constitute one population and to test for the first time whether there is adaptive divergence between the feeding aggregations. We found no evidence of neutral population structure and negligible evidence of adaptive divergence. We propose that individuals likely travel widely between feeding areas and to breeding areas, which would require them to be adapted to a wide range of environmental conditions. This has important implications for their conservation as this blue whale population is likely vulnerable to a range of anthropogenic threats both off Australia and elsewhere. PMID:29410806

  10. Genomic survey of pathogenicity determinants and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. Manihotis strain CIO151.

    PubMed

    Arrieta-Ortiz, Mario L; Rodríguez-R, Luis M; Pérez-Quintero, Álvaro L; Poulin, Lucie; Díaz, Ana C; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D; Ortiz Quiñones, Juan F; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P; Tabima, Javier; Urrego Morales, Oscar G; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo; Koebnik, Ralf; Bernal, Adriana

    2013-01-01

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. 
A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis scheme for epidemiological surveillance of this disease.

  11. Genomic Survey of Pathogenicity Determinants and VNTR Markers in the Cassava Bacterial Pathogen Xanthomonas axonopodis pv. Manihotis Strain CIO151

    PubMed Central

    Arrieta-Ortiz, Mario L.; Rodríguez-R, Luis M.; Pérez-Quintero, Álvaro L.; Poulin, Lucie; Díaz, Ana C.; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D.; Ortiz Quiñones, Juan F.; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B.; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P.; Tabima, Javier; Urrego Morales, Oscar G.; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo

    2013-01-01

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. 
A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis scheme for epidemiological surveillance of this disease. PMID:24278159

  12. Canine candidate genes for dilated cardiomyopathy: annotation of and polymorphic markers for 14 genes

    PubMed Central

    Wiersma, Anje C; Leegwater, Peter AJ; van Oost, Bernard A; Ollier, William E; Dukes-McEwan, Joanna

    2007-01-01

    Background: Dilated cardiomyopathy is a myocardial disease occurring in humans and domestic animals and is characterized by dilatation of the left ventricle, reduced systolic function and increased sphericity of the left ventricle. Dilated cardiomyopathy has been observed in several, mostly large and giant, dog breeds, such as the Dobermann and the Great Dane. A number of genes have been identified, which are associated with dilated cardiomyopathy in the human, mouse and hamster. These genes mainly encode structural proteins of the cardiac myocyte. Results: We present the annotation of, and marker development for, 14 of these genes of the dog genome, i.e. α-cardiac actin, caveolin 1, cysteine-rich protein 3, desmin, lamin A/C, LIM-domain binding factor 3, myosin heavy polypeptide 7, phospholamban, sarcoglycan δ, titin cap, α-tropomyosin, troponin I, troponin T and vinculin. A total of 33 Single Nucleotide Polymorphisms were identified for these canine genes and 11 polymorphic microsatellite repeats were developed. Conclusion: The presented polymorphisms provide a tool to investigate the role of the corresponding genes in canine Dilated Cardiomyopathy by linkage analysis or association studies. PMID:17949487

  13. Canine candidate genes for dilated cardiomyopathy: annotation of and polymorphic markers for 14 genes.

    PubMed

    Wiersma, Anje C; Leegwater, Peter AJ; van Oost, Bernard A; Ollier, William E; Dukes-McEwan, Joanna

    2007-10-19

    Dilated cardiomyopathy is a myocardial disease occurring in humans and domestic animals and is characterized by dilatation of the left ventricle, reduced systolic function and increased sphericity of the left ventricle. Dilated cardiomyopathy has been observed in several, mostly large and giant, dog breeds, such as the Dobermann and the Great Dane. A number of genes have been identified, which are associated with dilated cardiomyopathy in the human, mouse and hamster. These genes mainly encode structural proteins of the cardiac myocyte. We present the annotation of, and marker development for, 14 of these genes of the dog genome, i.e. alpha-cardiac actin, caveolin 1, cysteine-rich protein 3, desmin, lamin A/C, LIM-domain binding factor 3, myosin heavy polypeptide 7, phospholamban, sarcoglycan delta, titin cap, alpha-tropomyosin, troponin I, troponin T and vinculin. A total of 33 Single Nucleotide Polymorphisms were identified for these canine genes and 11 polymorphic microsatellite repeats were developed. The presented polymorphisms provide a tool to investigate the role of the corresponding genes in canine Dilated Cardiomyopathy by linkage analysis or association studies.

  14. Genetic Polymorphism in Wine Yeasts: Mechanisms and Methods for Its Detection

    PubMed Central

    Guillamón, José M.; Barrio, Eladio

    2017-01-01

    The selection of yeasts for use as wine fermentation starters has revealed great phenotypic diversity at both the interspecific and intraspecific levels, which is explained by corresponding genetic variation among yeast isolates. Thus, the mechanisms involved in promoting these genetic changes are the main engine generating yeast biodiversity. Currently, an important task for understanding the biodiversity, population structure and evolutionary history of wine yeasts is the study of the molecular mechanisms involved in yeast adaptation to wine fermentation and in remodeling the genomic features of wine yeasts, which have been unconsciously selected since the advent of winemaking. Moreover, the availability of rapid and simple molecular techniques that show genetic polymorphisms at species and strain levels has enabled the study of yeast diversity during wine fermentation. This review will summarize the mechanisms involved in generating genetic polymorphisms in yeasts, the molecular methods used to unveil genetic variation, and the utility of these polymorphisms to differentiate strains, populations, and species in order to infer the evolutionary history and the adaptive evolution of wine yeasts, and to identify their influence on their biotechnological and sensorial properties. PMID:28522998

  15. Genetic discovery in Xylella fastidiosa through sequence analysis of selected randomly amplified polymorphic DNAs.

    PubMed

    Chen, Jianchi; Civerolo, Edwin L; Jarret, Robert L; Van Sluys, Marie-Anne; de Oliveira, Mariana C

    2005-02-01

    Xylella fastidiosa causes many important plant diseases including Pierce's disease (PD) in grape and almond leaf scorch disease (ALSD). DNA-based methodologies, such as randomly amplified polymorphic DNA (RAPD) analysis, have been playing key roles in genetic information collection of the bacterium. This study further analyzed the nucleotide sequences of selected RAPDs from X. fastidiosa strains in conjunction with the available genome sequence databases and unveiled several previously unknown novel genetic traits. These include a sequence highly similar to those in the phage family of Podoviridae. Genome comparisons among X. fastidiosa strains suggested that the "phage" is currently active. Two other RAPDs were also related to horizontal gene transfer: one was part of a broadly distributed cryptic plasmid and the other was associated with conjugal transfer. One RAPD inferred a genomic rearrangement event among X. fastidiosa PD strains and another identified a single nucleotide polymorphism of evolutionary value.

  16. Marsupials and monotremes possess a novel family of MHC class I genes that is lost from the eutherian lineage.

    PubMed

    Papenfuss, Anthony T; Feng, Zhi-Ping; Krasnec, Katina; Deakin, Janine E; Baker, Michelle L; Miller, Robert D

    2015-07-22

    Major histocompatibility complex (MHC) class I genes are found in the genomes of all jawed vertebrates. The evolution of this gene family is closely tied to the evolution of the vertebrate genome. Family members are frequently found in four paralogous regions, which were formed in two rounds of genome duplication in the early vertebrates, but in some species class Is have been subject to additional duplication or translocation, creating additional clusters. The gene family is traditionally grouped into two subtypes: classical MHC class I genes that are usually MHC-linked, highly polymorphic, expressed in a broad range of tissues and present endogenously-derived peptides to cytotoxic T-cells; and non-classical MHC class I genes generally have lower polymorphism, may have tissue-specific expression and have evolved to perform immune-related or non-immune functions. As immune genes can evolve rapidly and are subject to different selection pressure, we hypothesised that there may be divergent, as yet unannotated or uncharacterised class I genes. Application of a novel method of sensitive genome searching of available vertebrate genome sequences revealed a new, extensive sub-family of divergent MHC class I genes, denoted as UT, which has not previously been characterized. These class I genes are found in both American and Australian marsupials, and in monotremes, at an evolutionary chromosomal breakpoint, but are not present in non-mammalian genomes and have been lost from the eutherian lineage. We show that UT family members are expressed in the thymus of the gray short-tailed opossum and in other immune tissues of several Australian marsupials. Structural homology modelling shows that the proteins encoded by this family are predicted to have an open, though short, antigen-binding groove. We have identified a novel sub-family of putatively non-classical MHC class I genes that are specific to marsupials and monotremes. 
This family was present in the ancestral mammal and is found in extant marsupials and monotremes, but has been lost from the eutherian lineage. The function of this family is as yet unknown, however, their predicted structure may be consistent with presentation of antigens to T-cells.

  17. Genome skimming identifies polymorphism in tern populations and species

    PubMed Central

    2012-01-01

    Background: Terns (Charadriiformes: Sterninae) are a lineage of cosmopolitan shorebirds with a disputed evolutionary history that comprises several species of conservation concern. As a non-model system in genetics, previous study has left most of the nuclear genome unexplored, and population-level studies are limited to only 15% of the world's species of terns and noddies. Screening of polymorphic nuclear sequence markers is needed to enhance genetic resolution because of supposed low mitochondrial mutation rate, documentation of nuclear insertion of hypervariable mitochondrial regions, and limited success of microsatellite enrichment in terns. Here, we investigated the phylogenetic and population genetic utility for terns and relatives of a variety of nuclear markers previously developed for other birds and spanning the nuclear genome. Markers displaying a variety of mutation rates from both the nuclear and mitochondrial genome were tested and prioritized according to optimal cross-species amplification and extent of genetic polymorphism between (1) the main tern clades and (2) individual Royal Terns (Thalasseus maxima) breeding on the US East Coast. Results: Results from this genome skimming effort yielded four new nuclear sequence-based markers for tern phylogenetics and 11 intra-specific polymorphic markers. Further, comparison between the two genomes indicated a phylogenetic conflict at the base of terns, involving the inclusion (mitochondrial) or exclusion (nuclear) of the Angel Tern (Gygis alba). Although limited mitochondrial variation was confirmed, both nuclear markers and a short tandem repeat in the mitochondrial control region indicated the presence of considerable genetic variation in Royal Terns at a regional scale. Conclusions: These data document the value of intronic markers to the study of terns and allies. We expect that these and additional markers attained through next-generation sequencing methods will accurately map the genetic origin and species history of this group of birds. PMID:22333071

  18. Substitutions of short heterologous DNA segments of intragenomic or extragenomic origins produce clustered genomic polymorphisms

    PubMed Central

    Harms, Klaus; Lunnan, Asbjørn; Hülter, Nils; Mourier, Tobias; Vinner, Lasse; Andam, Cheryl P.; Marttinen, Pekka; Fridholm, Helena; Hansen, Anders Johannes; Hanage, William P.; Nielsen, Kaare Magne; Willerslev, Eske; Johnsen, Pål Jarle

    2016-01-01

    In a screen for unexplained mutation events we identified a previously unrecognized mechanism generating clustered DNA polymorphisms such as microindels and cumulative SNPs. The mechanism, short-patch double illegitimate recombination (SPDIR), facilitates short single-stranded DNA molecules to invade and replace genomic DNA through two joint illegitimate recombination events. SPDIR is controlled by key components of the cellular genome maintenance machinery in the gram-negative bacterium Acinetobacter baylyi. The source DNA is primarily intragenomic but can also be acquired through horizontal gene transfer. The DNA replacements are nonreciprocal and locus independent. Bioinformatic approaches reveal occurrence of SPDIR events in the gram-positive human pathogen Streptococcus pneumoniae and in the human genome. PMID:27956618

  19. Detection and validation of single feature polymorphisms using RNA expression data from a rice genome array

    USDA-ARS's Scientific Manuscript database

    A large number of genetic variations have been identified in rice. Such variations must in many cases control phenotypic differences in abiotic stress tolerance and other traits. A single feature polymorphism (SFP) is an oligonucleotide array-based polymorphism which can be used for identification o...

  20. Genomic polymorphism, recombination, and linkage disequilibrium in human major histocompatibility complex-encoded antigen-processing genes.

    PubMed Central

    van Endert, P M; Lopez, M T; Patel, S D; Monaco, J J; McDevitt, H O

    1992-01-01

    Recently, two subunits of a large cytosolic protease and two putative peptide transporter proteins were found to be encoded by genes within the class II region of the major histocompatibility complex (MHC). These genes have been suggested to be involved in the processing of antigenic proteins for presentation by MHC class I molecules. Because of the high degree of polymorphism in MHC genes, and previous evidence for both functional and polypeptide sequence polymorphism in the proteins encoded by the antigen-processing genes, we tested DNA from 27 consanguineous human cell lines for genomic polymorphism by restriction fragment length polymorphism (RFLP) analysis. These studies demonstrate a strong linkage disequilibrium between TAP1 and LMP2 RFLPs. Moreover, RFLPs, as well as a polymorphic stop codon in the telomeric TAP2 gene, appear to be in linkage disequilibrium with HLA-DR alleles and RFLPs in the HLA-DO gene. A high rate of recombination, however, seems to occur in the center of the complex, between the TAP1 and TAP2 genes. PMID:1360671
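
As a hedged illustration of what linkage disequilibrium between two biallelic markers (such as a pair of RFLPs) means quantitatively, the sketch below computes the standard statistics D, |D'| and r² from haplotype counts. The function name `ld_stats` and the counts used are hypothetical; they are not taken from this record, and the abstract does not state which LD statistic the authors reported.

```python
def ld_stats(n_ab, n_aB, n_Ab, n_AB):
    """Return (D, |D'|, r^2) for two biallelic loci from haplotype counts.

    The four counts are the numbers of haplotypes carrying the allele pairs
    (a, b), (a, B), (A, b) and (A, B) at locus 1 and locus 2.
    """
    total = n_ab + n_aB + n_Ab + n_AB
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total          # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total          # frequency of allele B at locus 2
    d = p_AB - p_A * p_B                 # coefficient of linkage disequilibrium
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("a locus is monomorphic; LD is undefined")
    # D is bounded by the allele frequencies; the bound depends on its sign.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return d, abs(d) / d_max, d * d / denom


# Hypothetical counts: the two markers are almost always inherited together.
d, d_prime, r2 = ld_stats(n_ab=24, n_aB=2, n_Ab=1, n_AB=27)
print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

Values of r² near 1 indicate that alleles at the two markers are strongly associated, while values near 0 are what frequent recombination between the markers would tend to produce.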

  1. Typing and comparative genome analysis of Brucella melitensis isolated from Lebanon.

    PubMed

    Abou Zaki, Natalia; Salloum, Tamara; Osman, Marwan; Rafei, Rayane; Hamze, Monzer; Tokajian, Sima

    2017-10-16

    Brucella melitensis is the main causative agent of the zoonotic disease brucellosis. This study aimed at typing and characterizing genetic variation in 33 Brucella isolates recovered from patients in Lebanon. Bruce-ladder multiplex PCR and PCR-RFLP of omp31, omp2a and omp2b were performed. Sixteen representative isolates were chosen for draft-genome sequencing and analyzed to determine variations in virulence, resistance, genomic islands, prophages and insertion sequences. Comparative whole-genome single nucleotide polymorphism analysis was also performed. The isolates were confirmed to be B. melitensis. Genome analysis revealed multiple virulence determinants and efflux pumps. Genome comparisons and single nucleotide polymorphisms divided the isolates based on geographical distribution but revealed high levels of similarity between the strains. Sequence divergence in B. melitensis was mainly due to lateral gene transfer of mobile elements. This is the first report of an in-depth genomic characterization of B. melitensis in Lebanon.

  2. Purification of polymorphic components of complex genomes

    DOEpatents

    Stodolsky, Marvin

    1991-01-01

    A method is disclosed for processing related subject and reference macromolecule populations composed of complementary strands into their respective subject and reference populations of representative fragments and effectuating purification of unique polymorphic subject fragments.

  3. Purification of polymorphic components of complex genomes

    DOEpatents

    Stodolsky, M.

    1988-01-21

    A method for processing related subject and reference macromolecule populations composed of complementary strands into their respective subject and reference populations of representative fragments and effectuating purification of unique polymorphic subject fragments. 1 fig.

  4. DangerTrack: A scoring system to detect difficult-to-assess regions.

    PubMed

    Dolgalev, Igor; Sedlazeck, Fritz; Busby, Ben

    2017-01-01

    Over recent years, multiple groups have shown that a large number of structural variants, repeats, or problems with the underlying genome assembly have dramatic effects on the mapping, calling, and overall reliability of single nucleotide polymorphism calls. This project endeavored to develop an easy-to-use track for looking at structural variant and repeat regions. This track, DangerTrack, can be displayed alongside the existing Genome Reference Consortium assembly tracks to warn clinicians and biologists when variants of interest may be incorrectly called, of dubious quality, or on an insertion or copy number expansion. While mapping and variant calling can be automated, it is our opinion that when these regions are of interest to a particular clinical or research group, they warrant a careful examination, potentially involving localized reassembly. DangerTrack is available at https://github.com/DCGenomics/DangerTrack.

  5. Cadmium-induced genomic instability in Arabidopsis: Molecular toxicological biomarkers for early diagnosis of cadmium stress.

    PubMed

    Wang, Hetong; He, Lei; Song, Jie; Cui, Weina; Zhang, Yanzhao; Jia, Chunyun; Francis, Dennis; Rogers, Hilary J; Sun, Lizong; Tai, Peidong; Hui, Xiujuan; Yang, Yuesuo; Liu, Wan

    2016-05-01

    Microsatellite instability (MSI) analysis, random-amplified polymorphic DNA (RAPD), and methylation-sensitive arbitrarily primed PCR (MSAP-PCR) are methods to evaluate the toxicity of environmental pollutants in stress-treated plants and human cancer cells. Here, we evaluate these techniques to screen for genetic and epigenetic alterations of Arabidopsis plantlets exposed to 0-5.0 mg L(-1) cadmium (Cd) for 15 d. There was a substantial increase in RAPD polymorphism of 24.5, and in genomic methylation polymorphism of 30.5-34.5 at CpG and of 14.5-20 at CHG sites under Cd stress of 5.0 mg L(-1) by RAPD and of 0.25-5.0 mg L(-1) by MSAP-PCR, respectively. However, only a tiny increase of 1.5 loci by RAPD occurred under Cd stress of 4.0 mg L(-1), and an additional high dose (8.0 mg L(-1)) resulted in one repeat by MSI analysis. MSAP-PCR detected the most significant epigenetic modifications in plantlets exposed to Cd stress, and the patterns of hypermethylation and polymorphisms were consistent with inverted U-shaped dose responses. The presence of genomic methylation polymorphism in Cd-treated seedlings, prior to the onset of RAPD polymorphism, MSI and obvious growth effects, suggests that these altered DNA methylation loci are the most sensitive biomarkers for early diagnosis and risk assessment of genotoxic effects of Cd pollution in ecotoxicology.

  6. Fitness consequences of polymorphic inversions in the zebra finch genome.

    PubMed

    Knief, Ulrich; Hemmrich-Stanisak, Georg; Wittig, Michael; Franke, Andre; Griffith, Simon C; Kempenaers, Bart; Forstmeier, Wolfgang

    2016-09-29

    Inversion polymorphisms constitute an evolutionary puzzle: they should increase embryo mortality in heterokaryotypic individuals but still they are widespread in some taxa. Some insect species have evolved mechanisms to reduce the cost of embryo mortality but humans have not. In birds, a detailed analysis is missing although intraspecific inversion polymorphisms are regarded as common. In Australian zebra finches (Taeniopygia guttata), two polymorphic inversions are known cytogenetically and we set out to detect these two and potentially additional inversions using genomic tools and study their effects on embryo mortality and other fitness-related and morphological traits. Using whole-genome SNP data, we screened 948 wild zebra finches for polymorphic inversions and describe four large (12-63 Mb) intraspecific inversion polymorphisms with allele frequencies close to 50 %. Using additional data from 5229 birds and 9764 eggs from wild and three captive zebra finch populations, we show that only the largest inversions increase embryo mortality in heterokaryotypic males, with surprisingly small effect sizes. We test for a heterozygote advantage on other fitness components but find no evidence for heterosis for any of the inversions. Yet, we find strong additive effects on several morphological traits. The mechanism that has carried the derived inversion haplotypes to such high allele frequencies remains elusive. It appears that selection has effectively minimized the costs associated with inversions in zebra finches. The highly skewed distribution of recombination events towards the chromosome ends in zebra finches and other estrildid species may function to minimize crossovers in the inverted regions.

  7. Neutral polymorphisms in putative housekeeping genes and tandem repeats unravels the population genetics and evolutionary history of Plasmodium vivax in India.

    PubMed

    Prajapati, Surendra K; Joshi, Hema; Carlton, Jane M; Rizvi, M Alam

    2013-01-01

    The evolutionary history and age of Plasmodium vivax has been inferred as both recent and ancient by several studies, mainly using mitochondrial genome diversity. Here we address the age of P. vivax on the Indian subcontinent using selectively neutral housekeeping genes and tandem repeat loci. Analysis of ten housekeeping genes revealed a substantial number of SNPs (n = 75) from 100 P. vivax isolates collected from five geographical regions of India. Neutrality tests showed a majority of the housekeeping genes were selectively neutral, confirming the suitability of housekeeping genes for inferring the evolutionary history of P. vivax. In addition, a genetic differentiation test using housekeeping gene polymorphism data showed a lack of geographical structuring between the five regions of India. The coalescence analysis of the time to the most recent common ancestor estimate yielded an ancient TMRCA (232,228 to 303,030 years) and long-term population history (79,235 to 104,008) of extant P. vivax on the Indian subcontinent. Analysis of 18 tandem repeat loci polymorphisms showed substantial allelic diversity and heterozygosity per locus, and analysis of potential bottlenecks revealed the signature of a stable P. vivax population, further corroborating our ancient age estimates. For the first time we report a comparable evolutionary history of P. vivax inferred by nuclear genetic markers (putative housekeeping genes) to that inferred from mitochondrial genome diversity.

  8. Short loop length and high thermal stability determine genomic instability induced by G-quadruplex-forming minisatellites

    PubMed Central

    Piazza, Aurèle; Adrian, Michael; Samazan, Frédéric; Heddi, Brahim; Hamon, Florian; Serero, Alexandre; Lopes, Judith; Teulade-Fichou, Marie-Paule; Phan, Anh Tuân; Nicolas, Alain

    2015-01-01

    G-quadruplexes (G4) are polymorphic four-stranded structures formed by certain G-rich nucleic acids, with various biological roles. However, structural features dictating their formation and/or function in vivo are unknown. In S. cerevisiae, the pathological persistency of G4 within the CEB1 minisatellite induces its rearrangement during leading-strand replication. We now show that several other G4-forming sequences remain stable. Extensive mutagenesis of the CEB25 minisatellite motif reveals that only variants with very short (≤ 4 nt) G4 loops preferentially containing pyrimidine bases trigger genomic instability. Parallel biophysical analyses demonstrate that shortening loop length does not change the monomorphic G4 structure of CEB25 variants but drastically increases its thermal stability, in correlation with the in vivo instability. Finally, bioinformatics analyses reveal that the threat for genomic stability posed by G4 bearing short pyrimidine loops is conserved in C. elegans and humans. This work provides a framework explanation for the heterogeneous instability behavior of G4-forming sequences in vivo, highlights the importance of structure thermal stability, and questions the prevailing assumption that G4 structures with short or longer loops are as likely to form in vivo. PMID:25956747

  9. Nanomanipulation of Single RNA Molecules by Optical Tweezers

    PubMed Central

    Stephenson, William; Wan, Gorby; Tenenbaum, Scott A.; Li, Pan T. X.

    2014-01-01

    A large portion of the human genome is transcribed but not translated. In this post genomic era, regulatory functions of RNA have been shown to be increasingly important. As RNA function often depends on its ability to adopt alternative structures, it is difficult to predict RNA three-dimensional structures directly from sequence. Single-molecule approaches show potentials to solve the problem of RNA structural polymorphism by monitoring molecular structures one molecule at a time. This work presents a method to precisely manipulate the folding and structure of single RNA molecules using optical tweezers. First, methods to synthesize molecules suitable for single-molecule mechanical work are described. Next, various calibration procedures to ensure the proper operations of the optical tweezers are discussed. Next, various experiments are explained. To demonstrate the utility of the technique, results of mechanically unfolding RNA hairpins and a single RNA kissing complex are used as evidence. In these examples, the nanomanipulation technique was used to study folding of each structural domain, including secondary and tertiary, independently. Lastly, the limitations and future applications of the method are discussed. PMID:25177917

  10. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomic Data.

    PubMed

    Bolser, Dan M; Staines, Daniel M; Perry, Emily; Kersey, Paul J

    2017-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for 39 sequenced plant species. Available data includes genome sequence, gene models, functional annotation, and polymorphic loci; for the latter, additional information including population structure, individual genotypes, linkage, and phenotype data is available for some species. Comparative data is also available, including genomic alignments and "gene trees," which show the inferred evolutionary history of each gene family represented in the resource. Access to the data is provided through a genome browser, which incorporates many specialist interfaces for different data types, through a variety of programmatic interfaces, and via a specialist data mining tool supporting rapid filtering and retrieval of bulk data. Genomic data from many non-plant species, including those of plant pathogens, pests, and pollinators, is also available via the same interfaces through other divisions of Ensembl. Ensembl Plants is updated 4-6 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT projects ( http://www.transplantdb.eu ).

  11. Singapore Genome Variation Project: a haplotype map of three Southeast Asian populations.

    PubMed

    Teo, Yik-Ying; Sim, Xueling; Ong, Rick T H; Tan, Adrian K S; Chen, Jieming; Tantoso, Erwin; Small, Kerrin S; Ku, Chee-Seng; Lee, Edmund J D; Seielstad, Mark; Chia, Kee-Seng

    2009-11-01

    The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs information and summaries on genotype and phased haplotype data, including allele frequencies, assessment of linkage disequilibrium (LD), and recombination rates in a format similar to the International HapMap Project. Here, we introduce this resource and describe the analysis of human genomic variation upon agglomerating data from the HapMap and the Human Genome Diversity Project, providing useful insights into the population structure of the three major population groups in Asia. In addition, this resource also surveyed across the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and browsing through a web browser modeled with the Generic Genome Browser.

  12. Singapore Genome Variation Project: A haplotype map of three Southeast Asian populations

    PubMed Central

    Teo, Yik-Ying; Sim, Xueling; Ong, Rick T.H.; Tan, Adrian K.S.; Chen, Jieming; Tantoso, Erwin; Small, Kerrin S.; Ku, Chee-Seng; Lee, Edmund J.D.; Seielstad, Mark; Chia, Kee-Seng

    2009-01-01

    The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs information and summaries on genotype and phased haplotype data, including allele frequencies, assessment of linkage disequilibrium (LD), and recombination rates in a format similar to the International HapMap Project. Here, we introduce this resource and describe the analysis of human genomic variation upon agglomerating data from the HapMap and the Human Genome Diversity Project, providing useful insights into the population structure of the three major population groups in Asia. In addition, this resource also surveyed across the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and browsing through a web browser modeled with the Generic Genome Browser. PMID:19700652

  13. Copy Number Variations in Tilapia Genomes.

    PubMed

    Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

    2017-02-01

    Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated to population types (R² > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.
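    A quick consistency check on the figures quoted in this abstract: if the CNVs span 12.411 Mb and that is 1.9% of the reference, the implied reference size follows by division. The sketch below is illustrative only; the variable names are hypothetical, and the rounding interval assumes the 1.9% was rounded to one decimal place.

```python
# Figures quoted in the abstract above.
cnv_span_mb = 12.411      # total length covered by CNVs, in Mb
cnv_fraction = 0.019      # 1.9% of the reference genome

# Point estimate of the reference genome size implied by those two numbers.
implied_genome_mb = cnv_span_mb / cnv_fraction

# Assumption: "1.9%" was rounded to one decimal, so the true share lies in [1.85%, 1.95%).
genome_low_mb = cnv_span_mb / 0.0195
genome_high_mb = cnv_span_mb / 0.0185

print(f"implied reference size: {implied_genome_mb:.1f} Mb "
      f"(range {genome_low_mb:.1f} to {genome_high_mb:.1f} Mb)")
```

    The result depends only on the two quoted values; it says nothing about which assembly version was used.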

  14. Chromosomal Rearrangements as Barriers to Genetic Homogenization between Archaic and Modern Humans

    PubMed Central

    Rogers, Rebekah L.

    2015-01-01

    Chromosomal rearrangements, which shuffle DNA throughout the genome, are an important source of divergence across taxa. Using a paired-end read approach with Illumina sequence data for archaic humans, I identify changes in genome structure that occurred recently in human evolution. Hundreds of rearrangements indicate genomic trafficking between the sex chromosomes and autosomes, raising the possibility of sex-specific changes. Additionally, genes adjacent to genome structure changes in Neanderthals are associated with testis-specific expression, consistent with evolutionary theory that new genes commonly form with expression in the testes. I identify one case of new-gene creation through transposition from the Y chromosome to chromosome 10 that combines the 5′-end of the testis-specific gene Fank1 with previously untranscribed sequence. This new transcript experienced copy number expansion in archaic genomes, indicating rapid genomic change. Among rearrangements identified in Neanderthals, 13% are transposition of selfish genetic elements, whereas 32% appear to be ectopic exchange between repeats. In Denisovan, the pattern is similar but numbers are significantly higher with 18% of rearrangements reflecting transposition and 40% ectopic exchange between distantly related repeats. There is an excess of divergent rearrangements relative to polymorphism in Denisovan, which might result from nonuniform rates of mutation, possibly reflecting a burst of transposable element activity in the lineage that led to Denisovan. Finally, loci containing genome structure changes show diminished rates of introgression from Neanderthals into modern humans, consistent with the hypothesis that rearrangements serve as barriers to gene flow during hybridization. Together, these results suggest that this previously unidentified source of genomic variation has important biological consequences in human evolution. PMID:26399483

  15. An Efficient Strategy Combining SSR Markers- and Advanced QTL-seq-driven QTL Mapping Unravels Candidate Genes Regulating Grain Weight in Rice

    PubMed Central

    Daware, Anurag; Das, Sweta; Srivastava, Rishi; Badoni, Saurabh; Singh, Ashok K.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    Development and use of genome-wide informative simple sequence repeat (SSR) markers and novel integrated genomic strategies are vital to drive genomics-assisted breeding applications and for efficient dissection of quantitative trait loci (QTLs) underlying complex traits in rice. The present study developed 6244 genome-wide informative SSR markers exhibiting in silico fragment length polymorphism based on repeat-unit variations among genomic sequences of 11 indica, japonica, aus, and wild rice accessions. These markers were mapped on diverse coding and non-coding sequence components of known cloned/candidate genes annotated from 12 chromosomes and revealed a much higher amplification (97%) and polymorphic potential (88%) along with wider genetic/functional diversity level (16–74% with a mean 53%) especially among accessions belonging to indica cultivar group, suggesting their utility in large-scale genomics-assisted breeding applications in rice. A high-density 3791 SSR markers-anchored genetic linkage map (IR 64 × Sonasal) spanning 2060 cM total map-length with an average inter-marker distance of 0.54 cM was generated. This reference genetic map identified six major genomic regions harboring robust QTLs (31% combined phenotypic variation explained with a 5.7–8.7 LOD) governing grain weight on six rice chromosomes. One strong grain weight major QTL region (OsqGW5.1) was narrowed-down by integrating traditional QTL mapping with high-resolution QTL region-specific integrated SSR and single nucleotide polymorphism markers-based QTL-seq analysis and differential expression profiling. This led us to delineate two natural allelic variants in two known cis-regulatory elements (RAV1AAT and CARGCW8GAT) of glycosyl hydrolase and serine carboxypeptidase genes exhibiting pronounced seed-specific differential regulation in low (Sonasal) and high (IR 64) grain weight mapping parental accessions. 
    Our genome-wide SSR marker resource (polymorphic within/between diverse cultivar groups) and integrated genomic strategy can efficiently scan functionally relevant potential molecular tags (markers, candidate genes and alleles) regulating complex agronomic traits (grain weight) and expedite marker-assisted genetic enhancement in rice. PMID:27833617
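    The average inter-marker distance quoted in this abstract can be reproduced from the map length and marker count. This is a minimal sketch with hypothetical variable names. It assumes the authors divided total map length by the number of markers; the abstract does not state the exact formula.

```python
# Figures quoted in the abstract above.
map_length_cm = 2060      # total genetic map length, in cM
n_markers = 3791          # SSR markers anchored on the map

# Assumed formula: total map length divided by marker count.
avg_spacing_cm = map_length_cm / n_markers

print(f"average inter-marker distance: {avg_spacing_cm:.2f} cM")
```

    Dividing by the number of intervals per chromosome instead of the marker count would give a slightly larger value, so the match with the reported 0.54 cM supports the simpler formula without confirming it.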

  16. High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

    PubMed Central

    2011-01-01

    Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and interspecific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434

  17. Single Nucleotide Variations of the Human GR Gene Manifested as Pathologic Mutations or Polymorphisms.

    PubMed

    Kino, Tomoshige

    2018-05-11

    The human genome contains numerous single nucleotide variations (SNVs), and the human GR gene harbors ∼450 of these genetic changes. Among them, extremely rare non-synonymous variants known as pathologic GR gene mutations cause a characteristic pathologic condition, familial/sporadic generalized glucocorticoid resistance syndrome, by replacing amino acids critical for GR protein structure and function, whereas others known as pathologic polymorphisms cause mild manifestations, recognized mainly at the population level, by changing GR activity slightly. Recent progress in the structural analysis of the GR protein and subsequent computer-based structural simulation has revealed details of the molecular defects caused by such pathologic GR gene mutations, including their impact on the receptor's interaction with ligands, nuclear receptor coactivators (NCoAs) or DNA glucocorticoid response elements (GREs). Indeed, those found in the GR ligand-binding domain significantly damage the protein structure of the ligand-binding pocket and/or the activation function-2 transactivation domain and change their molecular interaction with glucocorticoids or the LxxLL signature motif of NCoAs. Two mutations found in the GR DBD also affect interaction of the mutant receptors with GRE DNA by affecting the amino acid critical for the interaction or by changing the local hydrophobic environment. In this review, we discuss recent findings on the structural simulation of the pathologic GR mutants in connection with their functional and clinical impacts, along with a brief explanation of recent research achievements on the GR polymorphisms.

  18. A universal method for automated gene mapping

    PubMed Central

    Zipperlen, Peder; Nairz, Knud; Rimann, Ivo; Basler, Konrad; Hafen, Ernst; Hengartner, Michael; Hajnal, Alex

    2005-01-01

    Small insertions or deletions (InDels) constitute a ubiquitous class of sequence polymorphisms found in eukaryotic genomes. Here, we present an automated high-throughput genotyping method that relies on the detection of fragment-length polymorphisms (FLPs) caused by InDels. The protocol utilizes standard sequencers and genotyping software. We have established genome-wide FLP maps for both Caenorhabditis elegans and Drosophila melanogaster that facilitate genetic mapping with a minimum of manual input and at comparatively low cost. PMID:15693948

  19. The polymorphisms of the chromatin fiber

    NASA Astrophysics Data System (ADS)

    Boulé, Jean-Baptiste; Mozziconacci, Julien; Lavelle, Christophe

    2015-01-01

    In eukaryotes, the genome is packed into chromosomes, each consisting of large polymeric fibers made of DNA bound with proteins (mainly histones) and RNA molecules. The nature and precise 3D organization of this fiber has been a matter of intense speculations and debates. In the emerging picture, the local chromatin state plays a critical role in all fundamental DNA transactions, such as transcriptional control, DNA replication or repair. However, the molecular and structural mechanisms involved remain elusive. The purpose of this review is to give an overview of the tremendous efforts that have been made for almost 40 years to build physiologically relevant models of chromatin structure. The motivation behind building such models was to shift our representation and understanding of DNA transactions from a too simplistic ‘naked DNA’ view to a more realistic ‘coated DNA’ view, as a step towards a better framework in which to interpret mechanistically the control of genetic expression and other DNA metabolic processes. The field has evolved from a speculative point of view towards in vitro biochemistry and in silico modeling, but is still longing for experimental in vivo validations of the proposed structures or even proof of concept experiments demonstrating a clear role of a given structure in a metabolic transaction. The mere existence of a chromatin fiber as a relevant biological entity in vivo has been put into serious questioning. Current research is suggesting a possible reconciliation between theoretical studies and experiments, pointing towards a view where the polymorphic and dynamic nature of the chromatin fiber is essential to support its function in genome metabolism.

  20. Detection of DNA "fingerprints" of cultivated rice by hybridization with a human minisatellite DNA probe.

    PubMed

    Dallas, J F

    1988-09-01

    A human minisatellite DNA probe detects several restriction fragment length polymorphisms in cultivars of Asian and African rice. Certain fragments appear to be inherited in a Mendelian fashion and may represent unlinked loci. The hybridization patterns appear to be cultivar-specific and largely unchanged after the regeneration of plants from tissue culture. The results suggest that these regions of the rice genome may be used to generate cultivar-specific DNA fingerprints. The demonstration of similarity between a human minisatellite sequence and polymorphic regions in the rice genome suggests that such regions also occur in the genomes of many other plant species.

  1. The 8p23 inversion polymorphism determines local recombination heterogeneity across human populations.

    PubMed

    Alves, Joao M; Chikhi, Lounès; Amorim, António; Lopes, Alexandra M

    2014-04-01

    For decades, chromosomal inversions have been regarded as fascinating evolutionary elements as they are expected to suppress recombination between chromosomes with opposite orientations, leading to the accumulation of genetic differences between the two configurations over time. Here, making use of publicly available population genotype data for the largest polymorphic inversion in the human genome (8p23-inv), we assessed whether this inhibitory effect of inversion rearrangements led to significant differences in the recombination landscape of two homologous DNA segments with opposite orientation. Our analysis revealed that the accumulation of genetic differentiation is positively correlated with the variation in recombination profiles. The observed recombination dissimilarity between inversion types is consistent across all populations analyzed and surpasses the effects of geographic structure, suggesting that both structures (orientations) have been evolving independently over an extended period of time, despite being subjected to the very same demographic history. Aside from this mainly independent evolution, we also identified a short segment (350 kb, <10% of the whole inversion) in the central region of the inversion where the genetic divergence between the two structural haplotypes is diminished. Although this is difficult to demonstrate, it could be due to gene flow (possibly via double crossing-over events), which is consistent with the higher recombination rates surrounding this segment. This study demonstrates for the first time that chromosomal inversions influence the recombination landscape at a fine scale and highlights the role of these rearrangements as drivers of genome evolution.

  2. The origin, global distribution, and functional impact of the human 8p23 inversion polymorphism.

    PubMed

    Salm, Maximilian P A; Horswell, Stuart D; Hutchison, Claire E; Speedy, Helen E; Yang, Xia; Liang, Liming; Schadt, Eric E; Cookson, William O; Wierzbicki, Anthony S; Naoumova, Rossi P; Shoulders, Carol C

    2012-06-01

    Genomic inversions are an increasingly recognized source of genetic variation. However, a lack of reliable high-throughput genotyping assays for these structures has precluded a full understanding of an inversion's phylogenetic, phenotypic, and population genetic properties. We characterize these properties for one of the largest polymorphic inversions in man (the ∼4.5-Mb 8p23.1 inversion), a structure that encompasses numerous signals of natural selection and disease association. We developed and validated a flexible bioinformatics tool that utilizes SNP data to enable accurate, high-throughput genotyping of the 8p23.1 inversion. This tool was applied retrospectively to diverse genome-wide data sets, revealing significant population stratification that largely follows a clinal "serial founder effect" distribution model. Phylogenetic analyses establish the inversion's ancestral origin within the Homo lineage, indicating that 8p23.1 inversion has occurred independently in the Pan lineage. The human inversion breakpoint was localized to an inverted pair of human endogenous retrovirus elements within the large, flanking low-copy repeats; experimental validation of this breakpoint confirmed these elements as the likely intermediary substrates that sponsored inversion formation. In five data sets, mRNA levels of disease-associated genes were robustly associated with inversion genotype. Moreover, a haplotype associated with systemic lupus erythematosus was restricted to the derived inversion state. We conclude that the 8p23.1 inversion is an evolutionarily dynamic structure that can now be accommodated into the understanding of human genetic and phenotypic diversity.

  3. The origin, global distribution, and functional impact of the human 8p23 inversion polymorphism

    PubMed Central

    Salm, Maximilian P.A.; Horswell, Stuart D.; Hutchison, Claire E.; Speedy, Helen E.; Yang, Xia; Liang, Liming; Schadt, Eric E.; Cookson, William O.; Wierzbicki, Anthony S.; Naoumova, Rossi P.; Shoulders, Carol C.

    2012-01-01

    Genomic inversions are an increasingly recognized source of genetic variation. However, a lack of reliable high-throughput genotyping assays for these structures has precluded a full understanding of an inversion's phylogenetic, phenotypic, and population genetic properties. We characterize these properties for one of the largest polymorphic inversions in man (the ∼4.5-Mb 8p23.1 inversion), a structure that encompasses numerous signals of natural selection and disease association. We developed and validated a flexible bioinformatics tool that utilizes SNP data to enable accurate, high-throughput genotyping of the 8p23.1 inversion. This tool was applied retrospectively to diverse genome-wide data sets, revealing significant population stratification that largely follows a clinal “serial founder effect” distribution model. Phylogenetic analyses establish the inversion's ancestral origin within the Homo lineage, indicating that 8p23.1 inversion has occurred independently in the Pan lineage. The human inversion breakpoint was localized to an inverted pair of human endogenous retrovirus elements within the large, flanking low-copy repeats; experimental validation of this breakpoint confirmed these elements as the likely intermediary substrates that sponsored inversion formation. In five data sets, mRNA levels of disease-associated genes were robustly associated with inversion genotype. Moreover, a haplotype associated with systemic lupus erythematosus was restricted to the derived inversion state. We conclude that the 8p23.1 inversion is an evolutionarily dynamic structure that can now be accommodated into the understanding of human genetic and phenotypic diversity. PMID:22399572

  4. [Phylogenetic relationships and intraspecific variation of D-genome Aegilops L. as revealed by RAPD analysis].

    PubMed

    Goriunova, S V; Kochieva, E Z; Chikida, N N; Pukhal'skiĭ, V A

    2004-05-01

    RAPD analysis was carried out to study the genetic variation and phylogenetic relationships of polyploid Aegilops species, which contain the D genome as a component of the alloploid genome, and diploid Aegilops tauschii, which is a putative donor of the D genome for common wheat. In total, 74 accessions of six D-genome Aegilops species were examined. The highest intraspecific variation (0.03-0.21) was observed for Ae. tauschii. Intraspecific distances between accessions ranged 0.007-0.067 in Ae. cylindrica, 0.017-0.047 in Ae. vavilovii, and 0.00-0.053 in Ae. juvenalis. Likewise, Ae. ventricosa and Ae. crassa showed low intraspecific polymorphism. The among-accession difference in alloploid Ae. ventricosa (genome DvNv) was similar to that of one parental species, Ae. uniaristata (N), and substantially lower than in the other parent, Ae. tauschii (D). The among-accession difference in Ae. cylindrica (CcDc) was considerably lower than in either parent, Ae. tauschii (D) or Ae. caudata (C). With the exception of Ae. cylindrica, all D-genome species, namely Ae. tauschii (D), Ae. ventricosa (DvNv), Ae. crassa (XcrDcr1 and XcrDcr1Dcr2), Ae. juvenalis (XjDjUj), and Ae. vavilovii (XvaDvaSva), formed a single polymorphic cluster, which was distinct from clusters of other species. The only exception, Ae. cylindrica, did not group with the other D-genome species, but clustered with Ae. caudata (C), a donor of the C genome. The cluster of these two species was clearly distinct from the cluster of the other D-genome species and close to a cluster of Ae. umbellulata (genome U) and Ae. ovata (genome UgMg). Thus, RAPD analysis for the first time was used to estimate and to compare the interpopulation polymorphism and to establish the phylogenetic relationships of all diploid and alloploid D-genome Aegilops species.

  5. Purification of polymorphic components of complex genomes

    DOEpatents

    Stodolsky, M.

    1991-07-16

    A method is disclosed for processing related subject and reference macromolecule populations composed of complementary strands into their respective subject and reference populations of representative fragments and effectuating purification of unique polymorphic subject fragments. 1 figure.

  6. Genome-wide DNA polymorphism in the indica rice varieties RGD-7S and Taifeng B as revealed by whole genome re-sequencing.

    PubMed

    Fu, Chong-Yun; Liu, Wu-Ge; Liu, Di-Lin; Li, Ji-Hua; Zhu, Man-Shan; Liao, Yi-Long; Liu, Zhen-Rong; Zeng, Xue-Qin; Wang, Feng

    2016-03-01

    Next-generation sequencing technologies provide opportunities to further understand genetic variation, even within closely related cultivars. We performed whole genome resequencing of two elite indica rice varieties, RGD-7S and Taifeng B, whose F1 progeny showed hybrid weakness and hybrid vigor when grown in the early- and late-cropping seasons, respectively. Approximately 150 million 100-bp paired-end reads were generated, which covered ∼86% of the rice (Oryza sativa L. japonica 'Nipponbare') reference genome. A total of 2,758,740 polymorphic sites, including 2,408,845 SNPs and 349,895 InDels, were detected in RGD-7S and Taifeng B. Applying stringent parameters, we identified 961,791 SNPs and 46,640 InDels between RGD-7S and Taifeng B (RGD-7S/Taifeng B). The density of DNA polymorphisms was 256.8 SNPs and 12.5 InDels per 100 kb for RGD-7S/Taifeng B. Copy number variations (CNVs) were also investigated. In RGD-7S, 1989 of 2727 CNVs were overlapped in 218 genes, and 1231 of 2010 CNVs were annotated in 175 genes in Taifeng B. In addition, we verified a subset of InDels in the interval of hybrid weakness genes, Hw3 and Hw4, and obtained some polymorphic InDel markers, which will provide a sound foundation for cloning hybrid weakness genes. Analysis of genomic variations will also contribute to understanding the genetic basis of hybrid weakness and heterosis.
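    The counts and densities quoted in this abstract can be cross-checked against each other: a density per 100 kb is a count divided by genome length in 100-kb units, so each count/density pair implies a genome length. The sketch below is illustrative only and uses hypothetical variable names. It assumes both densities were computed over the same reference length.

```python
# Figures quoted in the abstract above.
total_sites = 2_758_740
snps_all, indels_all = 2_408_845, 349_895
snps_between, indels_between = 961_791, 46_640   # RGD-7S vs Taifeng B
snp_density, indel_density = 256.8, 12.5         # per 100 kb

# The SNP and InDel counts should add up to the reported total.
sites_sum = snps_all + indels_all

# Genome length implied by each count/density pair, converted to Mb
# (count / density gives the number of 100-kb units; 1 unit = 0.1 Mb).
genome_from_snps_mb = snps_between / snp_density * 0.1
genome_from_indels_mb = indels_between / indel_density * 0.1

print(f"sum of SNPs and InDels: {sites_sum:,} (reported {total_sites:,})")
print(f"implied genome length: {genome_from_snps_mb:.1f} Mb from SNPs, "
      f"{genome_from_indels_mb:.1f} Mb from InDels")
```

    The two implied lengths agree to within about half a percent, which is what one would expect if the same denominator was used and the densities were rounded to one decimal place.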

  7. Genetic Diversity of the Q Fever Agent, Coxiella burnetii, Assessed by Microarray-Based Whole-Genome Comparisons

    PubMed Central

    Beare, Paul A.; Samuel, James E.; Howe, Dale; Virtaneva, Kimmo; Porcella, Stephen F.; Heinzen, Robert A.

    2006-01-01

    Coxiella burnetii, a gram-negative obligate intracellular bacterium, causes human Q fever and is considered a potential agent of bioterrorism. Distinct genomic groups of C. burnetii are revealed by restriction fragment-length polymorphisms (RFLP). Here we comprehensively define the genetic diversity of C. burnetii by hybridizing the genomes of 20 RFLP-grouped and four ungrouped isolates from disparate sources to a high-density custom Affymetrix GeneChip containing all open reading frames (ORFs) of the Nine Mile phase I (NMI) reference isolate. We confirmed the relatedness of RFLP-grouped isolates and showed that two ungrouped isolates represent distinct genomic groups. Isolates contained up to 20 genomic polymorphisms consisting of 1 to 18 ORFs each. These were mostly complete ORF deletions, although partial deletions, point mutations, and insertions were also identified. A total of 139 chromosomal and plasmid ORFs were polymorphic among all C. burnetii isolates, representing ca. 7% of the NMI coding capacity. Approximately 67% of all deleted ORFs were hypothetical, while 9% were annotated in NMI as nonfunctional (e.g., frameshifted). The remaining deleted ORFs were associated with diverse cellular functions. The only deletions associated with isogenic NMI variants of attenuated virulence were previously described large deletions containing genes involved in lipopolysaccharide (LPS) biosynthesis, suggesting that these polymorphisms alone are responsible for the lower virulence of these variants. Interestingly, a variant of the Australia QD isolate producing truncated LPS had no detectable deletions, indicating LPS truncation can occur via small genetic changes. Our results provide new insight into the genetic diversity and virulence potential of Coxiella species. PMID:16547017

  8. Landscape of Insertion Polymorphisms in the Human Genome

    PubMed Central

    Onozawa, Masahiro; Goldberg, Liat; Aplan, Peter D.

    2015-01-01

    Nucleotide substitutions, small (<50 bp) insertions or deletions (indels), and large (>50 bp) deletions are well-known causes of genetic variation within the human genome. We recently reported a previously unrecognized form of polymorphic insertions, termed templated sequence insertion polymorphism (TSIP), in which the inserted sequence was templated from a distant genomic region, and was inserted in the genome through reverse transcription of an RNA intermediate. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; class 1 TSIPs show target site duplication, polyadenylation, and preference for insertion at a 5′-TTTT/A-3′ sequence, suggesting a LINE-1 based insertion mechanism, whereas class 2 TSIPs show features consistent with repair of a DNA double strand break by nonhomologous end joining. To gain a more complete picture of TSIPs throughout the human population, we evaluated whole-genome sequence from 52 individuals, and identified 171 TSIPs. Most individuals had 25–30 TSIPs, and common (present in >20% of individuals) TSIPs were found in individuals throughout the world, whereas rare TSIPs tended to cluster in specific geographic regions. The number of rare TSIPs was greater than the number of common TSIPs, suggesting that TSIP generation is an ongoing process. Intriguingly, mitochondrial sequences were a frequent template for class 2 insertions, used more commonly than any nuclear chromosome. Similar to single nucleotide polymorphisms and indels, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases, and can be useful in tracking historical migration of populations. PMID:25745018

  9. Genome-wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies.

    PubMed

    Willing, Eva-Maria; Bentzen, Paul; van Oosterhout, Cock; Hoffmann, Margarete; Cable, Joanne; Breden, Felix; Weigel, Detlef; Dreyer, Christine

    2010-03-01

    Adaptation of guppies (Poecilia reticulata) to contrasting upland and lowland habitats has been extensively studied with respect to behaviour, morphology and life history traits. Yet population history has not been studied at the whole-genome level. Although single nucleotide polymorphisms (SNPs) are the most abundant form of variation in many genomes and consequently very informative for a genome-wide picture of standing natural variation in populations, genome-wide SNP data are rarely available for wild vertebrates. Here we use genetically mapped SNP markers to comprehensively survey genetic variation within and among naturally occurring guppy populations from a wide geographic range in Trinidad and Venezuela. Results from three different clustering methods, Neighbor-net, principal component analysis (PCA) and Bayesian analysis, show that the population substructure agrees with geographic separation and largely with previously hypothesized patterns of historical colonization. Within major drainages (Caroni, Oropouche and Northern), populations are genetically similar, but those in different geographic regions are highly divergent from one another, with some indications of ancient shared polymorphisms. Clear genomic signatures of a previous introduction experiment were seen, and we detected additional potential admixture events. Headwater populations were significantly less heterozygous than downstream populations. Pairwise F(ST) values revealed marked differences in allele frequencies among populations from different regions, and also among populations within the same region. F(ST) outlier methods indicated some regions of the genome as being under directional selection. Overall, this study demonstrates the power of a genome-wide SNP data set to inform studies of natural variation, adaptation and evolution of wild populations.
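
    The pairwise F(ST) values mentioned above measure how far allele frequencies differ between two populations. The abstract does not say which estimator was used, so the following is only a minimal sketch of the textbook heterozygosity-based definition, F(ST) = (H_T - H_S) / H_T, for a single biallelic SNP in two equally weighted populations. The function name is hypothetical and no sample-size correction is applied.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Heterozygosity-based F_ST for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and 2.
    Assumes equal population weights and ignores sampling error.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if both populations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # site is monomorphic in both populations
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    for freqs in [(0.5, 0.5), (0.2, 0.8), (0.0, 1.0)]:
        print(freqs, round(fst_biallelic(*freqs), 3))
```

    Identical frequencies give 0 and fixed differences give 1; real analyses would normally use an estimator that corrects for sample size and combines many loci.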

  10. Characterization and compilation of polymorphic simple sequence repeat (SSR) markers of peanut from public database

    PubMed Central

    2012-01-01

    Background: There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L.) genome. There is a need to integrate various research reports of peanut DNA polymorphism into a single platform. Further, because of a lack of uniformity in the labeling of these markers across the publications, there is some confusion about the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings: We compiled 1,343 SSR markers as detecting polymorphism (14.5%) within a total of 9,274 markers. Amongst all polymorphic SSRs examined, we found that the AG motif (36.5%) was the most abundant, followed by AAG (12.1%), AAT (10.9%), and AT (10.3%). The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs, the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. The correlation of the length of SSR and the frequency of polymorphism revealed that the frequency of polymorphism decreased as motif repeat number increased. Conclusions: The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, which could also be a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement and thus would be of value to breeders. PMID:22818284
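
    To make the motif and repeat-number terminology above concrete, here is a small sketch of how di- and trinucleotide SSRs can be located in a sequence with a regular expression. The function name and the minimum-repeat thresholds are assumptions chosen for illustration; the abstract does not state the criteria used to define an SSR.

```python
import re


def find_ssrs(seq: str, motif_len: int, min_repeats: int):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    motif_len:   length of the repeated unit (2 = dinucleotide, 3 = trinucleotide)
    min_repeats: smallest number of tandem copies to report (assumed threshold)
    """
    # One captured motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if motif_len > 1 and len(set(motif)) == 1:
            continue  # e.g. "AA" repeated is a mononucleotide run, not a true SSR
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTAGAGAGAGAGCC", motif_len=2, min_repeats=4))
    print(find_ssrs("AAGAAGAAGAAG", motif_len=3, min_repeats=4))
```

    The sketch reports only perfect repeats; compound or interrupted SSRs would need a more tolerant search.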

  11. Intragenomic polymorphisms among high-copy loci: a genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae).

    PubMed

    Weitemier, Kevin; Straub, Shannon C K; Fishbein, Mark; Liston, Aaron

    2015-01-01

    Despite knowledge that concerted evolution of high-copy loci is often imperfect, studies that investigate the extent of intragenomic polymorphisms and compare them across a large number of species are rare. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual's consensus and, at each position, reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the "noncoding" ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than reported in any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming).
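
    The tallying step described above (counting, at each position, the reads that differ from the individual's consensus) can be sketched as follows. This is not the authors' Perl script, which is not shown in the abstract; it is a simplified Python illustration that assumes ungapped alignments supplied as (start, read) pairs, and the function name is hypothetical.

```python
def tally_nonconsensus(consensus: str, aligned_reads):
    """Fraction of reads differing from the consensus at each position.

    aligned_reads: iterable of (start, read_sequence) pairs, where start is the
    0-based consensus position of the read's first base (ungapped alignment
    assumed for simplicity).
    """
    depth = [0] * len(consensus)
    mismatches = [0] * len(consensus)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos >= len(consensus):
                break  # ignore bases hanging past the end of the reference
            depth[pos] += 1
            if base != consensus[pos]:
                mismatches[pos] += 1
    # Positions with no coverage are reported as 0.0 rather than dividing by zero.
    return [m / d if d else 0.0 for m, d in zip(mismatches, depth)]


if __name__ == "__main__":
    reads = [(0, "ACGT"), (0, "ACTT"), (1, "CGT"), (2, "GT")]
    print(tally_nonconsensus("ACGT", reads))
```

    A real pipeline would also need to handle indels, base qualities and a minimum depth or frequency cutoff before calling a position polymorphic.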

  12. Three chromosomal rearrangements promote genomic divergence between migratory and stationary ecotypes of Atlantic cod.

    PubMed

    Berg, Paul R; Star, Bastiaan; Pampoulie, Christophe; Sodeland, Marte; Barth, Julia M I; Knutsen, Halvor; Jakobsen, Kjetill S; Jentoft, Sissel

    2016-03-17

    Identification of genome-wide patterns of divergence provides insight on how genomes are influenced by selection and can reveal the potential for local adaptation in spatially structured populations. In Atlantic cod - historically a major marine resource - Northeast-Arctic- and Norwegian coastal cod are recognized by fundamental differences in migratory and non-migratory behavior, respectively. However, the genomic architecture underlying such behavioral ecotypes is unclear. Here, we have analyzed more than 8,000 polymorphic SNPs distributed throughout all 23 linkage groups and show that loci putatively under selection are localized within three distinct genomic regions, each several megabases long, covering approximately 4% of the Atlantic cod genome. These regions likely represent genomic inversions. The frequency of these distinct regions differs markedly between the ecotypes, which spawn in the vicinity of each other, in contrast with the low level of divergence in the rest of the genome. The observed patterns strongly suggest that these chromosomal rearrangements are instrumental in local adaptation and separation of Atlantic cod populations, leaving footprints of large genomic regions under selection. Our findings demonstrate the power of using genomic information in further understanding the population dynamics and defining management units in one of the world's most economically important marine resources.

  13. PwRn1, a novel Ty3/gypsy-like retrotransposon of Paragonimus westermani: molecular characters and its differentially preserved mobile potential according to host chromosomal polyploidy.

    PubMed

    Bae, Young-An; Ahn, Jong-Sook; Kim, Seon-Hee; Rhyu, Mun-Gan; Kong, Yoon; Cho, Seung-Yull

    2008-10-14

    Retrotransposons are known to be involved in the remodeling and evolution of host genomes. These reverse-transcribing elements, which show a complex evolutionary pathway with diverse intermediate forms, have been comprehensively analyzed from a wide range of host genomes, while the information remains limited to only a few species in the phylum Platyhelminthes. An LTR retrotransposon and its homologs with a strong phylogenetic affinity toward CsRn1 of Clonorchis sinensis were isolated from a trematode parasite, Paragonimus westermani, via a degenerate PCR method and from an insect species, Anopheles gambiae, by in silico analysis of the whole mosquito genome, respectively. These elements, designated PwRn1 and AgCR-1 - AgCR-14, conserved unique features including a t-RNATrp primer binding site and the unusual CHCC signature of Gag proteins. Their flanking LTRs displayed >97% nucleotide identities and thus, these elements were likely to have expanded recently in the trematode and insect genomes. They evolved heterogeneous expression strategies: a single fused ORF, two separate ORFs with an identical reading frame and two ORFs overlapped by -1 frameshifting. Phylogenetic analyses suggested that the elements with the separate ORFs had evolved from an ancestral form(s) with the overlapped ORFs. The mobile potential of PwRn1 was likely to be maintained differentially in association with the karyotype of host genomes, as was examined by the presence/absence of intergenomic polymorphism and mRNA transcripts. Our results on the structural diversity of CsRn1-like elements can provide a molecular tool to dissect a more detailed evolutionary episode of LTR retrotransposons. The PwRn1-associated genomic polymorphism, which is substantial in diploids, will also be informative in addressing genomic diversification following inter-/intra-specific hybridization in P. westermani populations.

  14. Comparison of the genomic sequence of the microminipig, a novel breed of swine, with the genomic database for conventional pig.

    PubMed

    Miura, Naoki; Kucho, Ken-Ichi; Noguchi, Michiko; Miyoshi, Noriaki; Uchiumi, Toshiki; Kawaguchi, Hiroaki; Tanimoto, Akihide

    2014-01-01

    The microminipig, which weighs less than 10 kg at an early stage of maturity, has been reported as a potential experimental model animal. Its extremely small size and other distinct characteristics suggest the possibility of a number of differences between the genome of the microminipig and that of conventional pigs. In this study, we analyzed the genomes of two healthy microminipigs using a next-generation sequencer, the SOLiD™ system. We then compared the obtained genomic sequences with a genomic database for the domestic pig (Sus scrofa). The mapping coverage of sequenced tags from the microminipig to conventional pig genomic sequences was greater than 96% and we detected no clear, substantial genomic variance from these data. The results may indicate that the distinct characteristics of the microminipig derive from small-scale alterations in the genome, such as single nucleotide polymorphisms or translational modifications, rather than large-scale deletion or insertion polymorphisms. Further investigation of the entire genomic sequence of the microminipig with methods enabling deeper coverage is required to elucidate the genetic basis of its distinct phenotypic traits. Copyright © 2014 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.

  15. Genome-wide DNA polymorphisms in Kavuni, a traditional rice cultivar with nutritional and therapeutic properties.

    PubMed

    Rathinasabapathi, Pasupathi; Purushothaman, Natarajan; Parani, Madasamy

    2016-05-01

    Although the rice genome was sequenced in 2002, efforts to resequence the large number of available accessions, landraces, traditional cultivars, and improved varieties of this important food crop are limited. We have initiated resequencing of the traditional cultivars from India. Kavuni is an important traditional rice cultivar from South India that attracts a premium price for its nutritional and therapeutic properties. Whole-genome sequencing of Kavuni using the Illumina platform and SNP analysis using the Nipponbare reference genome identified 1 150 711 SNPs, of which 377 381 SNPs were located in the genic regions. Non-synonymous SNPs (62 708) were distributed in 19 251 genes, and their number varied between 1 and 115 per gene. Large-effect DNA polymorphisms (7769) were present in 3475 genes. Pathway mapping of these polymorphisms revealed the involvement of genes related to carbohydrate metabolism, translation, protein-folding, and cell death. Analysis of the starch biosynthesis related genes revealed that the granule-bound starch synthase I gene had T/G SNPs at the first intron/exon junction and a two-nucleotide combination, which were reported to favour high amylose content and low glycemic index. The present study provided a valuable genomics resource to study the rice varieties with nutritional and medicinal properties.

  16. Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

    PubMed Central

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

    2015-01-01

    We report complete sequences of the chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low coverage NGS sequence of each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphism in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692

  17. Changes in Malaria Parasite Drug Resistance in an Endemic Population Over a 25-Year Period With Resulting Genomic Evidence of Selection

    PubMed Central

    Nwakanma, Davis C.; Duffy, Craig W.; Amambua-Ngwa, Alfred; Oriero, Eniyou C.; Bojang, Kalifa A.; Pinder, Margaret; Drakeley, Chris J.; Sutherland, Colin J.; Milligan, Paul J.; MacInnis, Bronwyn; Kwiatkowski, Dominic P.; Clark, Taane G.; Greenwood, Brian M.; Conway, David J.

    2014-01-01

    Background. Analysis of genome-wide polymorphism in many organisms has potential to identify genes under recent selection. However, data on historical allele frequency changes are rarely available for direct confirmation. Methods. We genotyped single nucleotide polymorphisms (SNPs) in 4 Plasmodium falciparum drug resistance genes in 668 archived parasite-positive blood samples of a Gambian population between 1984 and 2008. This covered a period before antimalarial resistance was detected locally, through subsequent failure of multiple drugs until introduction of artemisinin combination therapy. We separately performed genome-wide sequence analysis of 52 clinical isolates from 2008 to prospect for loci under recent directional selection. Results. Resistance alleles increased from very low frequencies, peaking in 2000 for chloroquine resistance-associated crt and mdr1 genes and at the end of the survey period for dhfr and dhps genes respectively associated with pyrimethamine and sulfadoxine resistance. Temporal changes fit a model incorporating likely selection coefficients over the period. Three of the drug resistance loci were in the top 4 regions under strong selection implicated by the genome-wide analysis. Conclusions. Genome-wide polymorphism analysis of an endemic population sample robustly identifies loci with detailed documentation of recent selection, demonstrating power to prospectively detect emerging drug resistance genes. PMID:24265439

  18. Accumulation of slightly deleterious mutations in the mitochondrial genome: a hallmark of animal domestication.

    PubMed

    Hughes, Austin L

    2013-02-15

    The hypothesis that domestication leads to a relaxation of purifying selection on mitochondrial (mt) genomes was tested by comparative analysis of mt genes from dog, pig, chicken, and silkworm. The three vertebrate species showed mt genome phylogenies in which domestic and wild isolates were intermingled, whereas the domestic silkworm (Bombyx mori) formed a distinct cluster nested within its closest wild relative (Bombyx mandarina). In spite of these differences in phylogenetic pattern, significantly greater proportions of nonsynonymous SNPs than of synonymous SNPs were unique to the domestic populations of all four species. Likewise, in all four species, significantly greater proportions of RNA-encoding SNPs than of synonymous SNPs were unique to the domestic populations. Thus, domestic populations were characterized by an excess of unique polymorphisms in two categories generally subject to purifying selection: nonsynonymous sites and RNA-encoding sites. Many of these unique polymorphisms thus seem likely to be slightly deleterious; the latter hypothesis was supported by the generally lower gene diversities of polymorphisms unique to domestic populations in comparison to those of polymorphisms shared by domestic and wild populations. Copyright © 2012 Elsevier B.V. All rights reserved.

  19. Molecular Characterization of Herpes Simplex Virus 2 Strains by Analysis of Microsatellite Polymorphism

    PubMed Central

    Ait-Arkoub, Zaïna; Voujon, Delphine; Deback, Claire; Abrao, Emiliana P.; Agut, Henri; Boutolleau, David

    2013-01-01

    The complete 154-kbp linear double-stranded genomic DNA sequence of herpes simplex virus 2 (HSV-2), consisting of two extended regions of unique sequences bounded by a pair of inverted repeat elements, was published in 1998 and since then has been employed in a wide range of studies. Scattered throughout the HSV-2 genome are 150 microsatellites (also referred to as short tandem repeats) of 1- to 6-nucleotide motifs, mainly distributed in noncoding regions. Microsatellites are considered reliable markers for genetic mapping to differentiate herpesvirus strains, as shown for cytomegalovirus and HSV-1. The aim of this work was to characterize 12 polymorphic microsatellites within the HSV-2 genome by use of 3 multiplex PCR assays in combination with length polymorphism analysis for the rapid genetic differentiation of 56 HSV-2 clinical isolates and 2 HSV-2 laboratory strains (gHSV-2 and MS). This new system was applied to a specific new HSV-2 variant recently identified in HIV-1-infected patients originating from West Africa. Our results confirm that microsatellite polymorphism analysis is an accurate tool for studying the epidemiology of HSV-2 infections. PMID:23966512

  20. Structural and sequence diversity of the transposon Galileo in the Drosophila willistoni genome.

    PubMed

    Gonçalves, Juliana W; Valiati, Victor Hugo; Delprat, Alejandra; Valente, Vera L S; Ruiz, Alfredo

    2014-09-13

    Galileo is one of three members of the P superfamily of DNA transposons. It was originally discovered in Drosophila buzzatii, in which three segregating chromosomal inversions were shown to have been generated by ectopic recombination between Galileo copies. Subsequently, Galileo was identified in six of 12 sequenced Drosophila genomes, indicating its widespread distribution within this genus. Galileo is strikingly abundant in Drosophila willistoni, a neotropical species that is highly polymorphic for chromosomal inversions, suggesting a role for this transposon in the evolution of its genome. We carried out a detailed characterization of all Galileo copies present in the D. willistoni genome. A total of 191 copies, including 133 with two terminal inverted repeats (TIRs), were classified according to structure in six groups. The TIRs exhibited remarkable variation in their length and structure compared to the most complete copy. Three copies showed extended TIRs due to internal tandem repeats, the insertion of other transposable elements (TEs), or the incorporation of non-TIR sequences into the TIRs. Phylogenetic analyses of the transposase (TPase)-encoding and TIR segments yielded two divergent clades, which we termed Galileo subfamilies V and W. Target-site duplications (TSDs) in D. willistoni Galileo copies were 7- or 8-bp in length, with the consensus sequence GTATTAC. Analysis of the region around the TSDs revealed a target site motif (TSM) with a 15-bp palindrome that may give rise to a stem-loop secondary structure. There is a remarkable abundance and diversity of Galileo copies in the D. willistoni genome, although no functional copies were found. The TIRs in particular have a dynamic structure and extend in different ways, but their ends (required for transposition) are more conserved than the rest of the element. The D. willistoni genome harbors two Galileo subfamilies (V and W) that diverged ~9 million years ago and may have descended from an ancestral element in the genome. Galileo shows a significant insertion preference for a 15-bp palindromic TSM.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilchrist, Michael J.; Sobral, Daniel; Khoueiry, Pierre

    Genome-wide resources, such as collections of cDNA clones encoding for complete proteins (full-ORF clones), are crucial tools for studying the evolution of gene function and genetic interactions. Non-model organisms, in particular marine organisms, provide a rich source of functional diversity. Marine organism genomes are, however, frequently highly polymorphic and encode proteins that diverge significantly from those of well-annotated model genomes. The construction of full-ORF clone collections from non-model organisms is hindered by the difficulty of predicting accurately the N-terminal ends of proteins, and distinguishing recent paralogs from highly polymorphic alleles. We also report a computational strategy that overcomes these difficulties, and allows for accurate gene level clustering of transcript data followed by the automated identification of full-ORFs with correct 5'- and 3'-ends. It is robust to polymorphism, includes paralog calling and does not require evolutionary proximity to well annotated model organisms. Here, we developed this pipeline for the ascidian Ciona intestinalis, a highly polymorphic member of the divergent sister group of the vertebrates, emerging as a powerful model organism to study chordate gene function, Gene Regulatory Networks and molecular mechanisms underlying human pathologies. Furthermore, using this pipeline we have generated the first full-ORF collection for a highly polymorphic marine invertebrate. It contains 19,163 full-ORF cDNA clones covering 60% of Ciona coding genes, and full-ORF orthologs for approximately half of curated human disease-associated genes.

  2. Development of cleaved amplified polymorphic sequence markers and a CAPS-based genetic linkage map in watermelon (Citrullus lanatus [Thunb.] Matsum. and Nakai) constructed using whole-genome re-sequencing data

    PubMed Central

    Liu, Shi; Gao, Peng; Zhu, Qianglong; Luan, Feishi; Davis, Angela R.; Wang, Xiaolu

    2016-01-01

    Cleaved amplified polymorphic sequence (CAPS) markers are useful tools for detecting single nucleotide polymorphisms (SNPs). This study detected and converted SNP sites into CAPS markers based on high-throughput re-sequencing data in watermelon, for linkage map construction and quantitative trait locus (QTL) analysis. Two inbred lines, Cream of Saskatchewan (COS) and LSW-177, were re-sequenced and analyzed with a custom Perl script for CAPS marker development. Of the assembled sequences of the two parental materials, 88.7% and 78.5%, respectively, could be mapped to the reference watermelon genome. Comparative analysis of the assembled genome data provided 225,693 and 19,268 SNPs and indels between the two materials. A total of 532 pairs of CAPS markers were designed with 16 restriction enzymes, among which 271 pairs of primers gave distinct bands of the expected length and polymorphic bands, via PCR and enzyme digestion, with a polymorphic rate of 50.94%. Using the new CAPS markers, an initial CAPS-based genetic linkage map was constructed with the F2 population, spanning 1836.51 cM with 11 linkage groups and 301 markers. Twelve QTLs were detected related to fruit flesh color, length, width, shape index, and brix content. These new CAPS markers will be a valuable resource for breeding programs and genetic studies of watermelon. PMID:27162496
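
    The idea behind converting a SNP into a CAPS marker is that the SNP creates or destroys a restriction site, so only one parental allele is cut after PCR. The sketch below illustrates that check for two short allele sequences. It is not the study's Perl script: the function name is hypothetical, and the three enzymes (EcoRI, HindIII, BamHI, with their standard recognition sites) are illustrative choices, since the abstract does not name the 16 enzymes used.

```python
# Standard recognition sequences of three common restriction enzymes.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def caps_enzymes(allele_a: str, allele_b: str, enzymes=ENZYMES):
    """Return enzymes whose site is present in exactly one of the two alleles.

    Such enzymes would cut the amplicon from one parent but not the other,
    which is what makes the SNP usable as a CAPS marker.
    """
    a, b = allele_a.upper(), allele_b.upper()
    return sorted(
        name for name, site in enzymes.items() if (site in a) != (site in b)
    )


if __name__ == "__main__":
    # An A/G SNP in the second allele abolishes the EcoRI site.
    print(caps_enzymes("TTGAATTCAA", "TTGAGTTCAA"))
```

    For scale, the reported polymorphic rate is consistent with the counts given: 271 of 532 primer pairs is 50.94%.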

  3. Complete sequence of the first chimera genome constructed by cloning the whole genome of Synechocystis strain PCC6803 into the Bacillus subtilis 168 genome.

    PubMed

    Watanabe, Satoru; Shiwa, Yuh; Itaya, Mitsuhiro; Yoshikawa, Hirofumi

    2012-12-01

    Genome synthesis of existing or designed genomes is made feasible by the first successful cloning of a cyanobacterium, Synechocystis PCC6803, in Gram-positive, endospore-forming Bacillus subtilis. Whole-genome sequence analysis of the isolate and parental B. subtilis strains provides clues for identifying single nucleotide polymorphisms (SNPs) in the 2 complete bacterial genomes in one cell.

  4. Development of a set of SNP markers present in expressed genes of the apple.

    PubMed

    Chagné, David; Gasic, Ksenija; Crowhurst, Ross N; Han, Yuepeng; Bassett, Heather C; Bowatte, Deepa R; Lawrence, Timothy J; Rikkerink, Erik H A; Gardiner, Susan E; Korban, Schuyler S

    2008-11-01

    Molecular markers associated with gene coding regions are useful tools for bridging functional and structural genomics. Due to their high abundance in plant genomes, single nucleotide polymorphisms (SNPs) are present within virtually all genomic regions, including most coding sequences. The objective of this study was to develop a set of SNPs for the apple by taking advantage of the wealth of genomics resources available for the apple, including a large collection of expressed sequence tags (ESTs). Using bioinformatics tools, a search for SNPs within an EST database of approximately 350,000 sequences developed from a variety of apple accessions was conducted. This resulted in the identification of a total of 71,482 putative SNPs. As the apple genome is reported to be an ancient polyploid, attempts were made to verify whether those SNPs detected in silico were attributable either to allelic polymorphisms or to gene duplication or paralogous or homeologous sequence variations. To this end, a set of 464 PCR primer pairs was designed, PCR amplification was performed using two subsets of plants, and the PCR products were sequenced. The SNPs retrieved from these sequences were then mapped onto apple genetic maps, including a newly constructed map of a Royal Gala x A689-24 cross and a Malling 9 x Robusta 5 map, using a bin mapping strategy. The SNP genotyping was performed using the high-resolution melting (HRM) technique. A total of 93 new markers containing 210 coding SNPs were successfully mapped. This new set of SNP markers for the apple offers new opportunities for understanding the genetic control of important horticultural traits using quantitative trait loci (QTL) or linkage disequilibrium analysis. These also serve as useful markers for aligning physical and genetic maps, and as potential transferable markers across the Rosaceae family.

  5. Non-Canonical G-quadruplexes cause the hCEB1 minisatellite instability in Saccharomyces cerevisiae

    PubMed Central

    Piazza, Aurèle; Cui, Xiaojie; Adrian, Michael; Samazan, Frédéric; Heddi, Brahim; Phan, Anh-Tuan; Nicolas, Alain G

    2017-01-01

    G-quadruplexes (G4) are polymorphic four-stranded structures formed by certain G-rich nucleic acids in vitro, but the sequence and structural features dictating their formation and function in vivo remain uncertain. Here we report a structure-function analysis of the complex hCEB1 G4-forming sequence. We isolated four G4 conformations in vitro, all of which bear unusual structural features: Form 1 bears a V-shaped loop and a snapback guanine; Form 2 contains a terminal G-triad; Form 3 bears a zero-nucleotide loop; and Form 4 is a zero-nucleotide loop monomer or an interlocked dimer. In vivo, Form 1 and Form 2 differently account for two-thirds of the genomic instability of hCEB1 in two G4-stabilizing conditions. Form 3 and an unidentified form contribute to the remaining instability, while Form 4 has no detectable effect. This work underscores the structural polymorphisms originating from a single highly G-rich sequence and demonstrates the existence of non-canonical G4s in cells, thus broadening the definition of G4-forming sequences. DOI: http://dx.doi.org/10.7554/eLife.26884.001 PMID:28661396

  6. Genetic predisposition to neuroblastoma mediated by a LMO1 super-enhancer polymorphism

    Cancer.gov

    Neuroblastoma is a paediatric malignancy that typically arises in early childhood, and is derived from the developing sympathetic nervous system. Clinical phenotypes range from localized tumours with excellent outcomes to widely metastatic disease in which long-term survival is approximately 40% despite intensive therapy. A previous genome-wide association study identified common polymorphisms at the LMO1 gene locus that are highly associated with neuroblastoma susceptibility and oncogenic addiction to LMO1 in the tumour cells.

  7. AFLP fragment isolation technique as a method to produce random sequences for single nucleotide polymorphism discovery in the green turtle, Chelonia mydas.

    PubMed

    Roden, Suzanne E; Dutton, Peter H; Morin, Phillip A

    2009-01-01

    The green sea turtle, Chelonia mydas, was used as a case study for single nucleotide polymorphism (SNP) discovery in a species that has little genetic sequence information available. As green turtles have a complex population structure, additional nuclear markers other than microsatellites could add to our understanding of their complex life history. Amplified fragment length polymorphism technique was used to generate sets of random fragments of genomic DNA, which were then electrophoretically separated with precast gels, stained with SYBR green, excised, and directly sequenced. It was possible to perform this method without the use of polyacrylamide gels, radioactive or fluorescent labeled primers, or hybridization methods, reducing the time, expense, and safety hazards of SNP discovery. Within 13 loci, 2547 base pairs were screened, resulting in the discovery of 35 SNPs. Using this method, it was possible to yield a sufficient number of loci to screen for SNP markers without the availability of prior sequence information.

  8. A unified genetic association test robust to latent population structure for a count phenotype.

    PubMed

    Song, Minsun

    2018-06-04

    Confounding caused by latent population structure in genome-wide association studies has been a big concern despite the success of genome-wide association studies at identifying genetic variants associated with complex diseases. In particular, because of the growing interest in association mapping using count phenotype data, it would be interesting to develop a testing framework for genetic associations that is immune to population structure when phenotype data consist of count measurements. Here, I propose a solution for testing associations between single nucleotide polymorphisms and a count phenotype in the presence of an arbitrary population structure. I consider a classical range of models for count phenotype data. Under these models, a unified test for genetic associations that protects against confounding was derived. An algorithm was developed to efficiently estimate the parameters that are required to fit the proposed model. I illustrate the proposed approach using simulation studies and an empirical study. Both simulated and real-data examples suggest that the proposed method successfully corrects population structure. Copyright © 2018 John Wiley & Sons, Ltd.

  9. A New Single Nucleotide Polymorphism Database for Rainbow Trout Generated Through Whole Genome Resequencing.

    PubMed

    Gao, Guangtu; Nome, Torfinn; Pearse, Devon E; Moen, Thomas; Naish, Kerry A; Thorgaard, Gary H; Lien, Sigbjørn; Palti, Yniv

    2018-01-01

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss), SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL) and RNA sequencing. Recently we have performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway) that we previously used for SNP discovery. Of the 49 new samples, 11 were doubled haploid lines from Washington State University (WSU) and 38 represented wild and hatchery populations from a wide range of geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1) which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth on the locus, and number of genotyped samples. Results from the two variant calling programs were compared and genotypes of the doubled haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs) and multi-sequence variants (MSVs). Overall, 30,302,087 SNPs were identified on the 29 chromosomes of the rainbow trout genome and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25). The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. Intra-population polymorphism assessment revealed a high level of polymorphism and heterozygosity within each population. We also provide functional annotation based on the genome position of each SNP and evaluate the use of clonal lines for filtering of PSVs and MSVs. These SNPs form a new database, which provides an important resource for a new high density SNP array design and for other SNP genotyping platforms used for genetic and genomics studies of this iconic salmonid fish species.
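
    The density and MAF figures in this record are simple ratios. The sketch below shows the unit conversion behind "one SNP per 64 bp, or 15.6 SNPs per 1 kb" and one common way to compute a minor allele frequency for a biallelic site. The function names and the example allele counts are hypothetical and are not taken from the study's pipeline.

```python
def snps_per_kb(bp_per_snp):
    """Convert 'one SNP per N bp' into SNPs per 1,000 bp."""
    return 1000 / bp_per_snp


def minor_allele_frequency(alt_allele_count, total_alleles):
    """Frequency of the rarer allele at a biallelic site."""
    alt_freq = alt_allele_count / total_alleles
    return min(alt_freq, 1 - alt_freq)


# Figure from the abstract: one SNP per 64 bp on the chromosomes.
density = snps_per_kb(64)
print(f"{density:.3f} SNPs per kb")

# Hypothetical site: 122 called alleles, 40 of them the alternate allele.
maf = minor_allele_frequency(40, 122)
print(f"MAF = {maf:.3f}, passes MAF > 0.25 filter: {maf > 0.25}")
```

    Taking the minimum of the two allele frequencies makes the result independent of which allele is labelled "alternate", which is why the filter can be applied without knowing the ancestral state.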

  10. A phasing and imputation method for pedigreed populations that results in a single-stage genomic evaluation

    PubMed Central

    2012-01-01

    Background Efficient, robust, and accurate genotype imputation algorithms make large-scale application of genomic selection cost effective. An algorithm that imputes alleles or allele probabilities for all animals in the pedigree and for all genotyped single nucleotide polymorphisms (SNP) provides a framework to combine all pedigree, genomic, and phenotypic information into a single-stage genomic evaluation. Methods An algorithm was developed for imputation of genotypes in pedigreed populations that allows imputation for completely ungenotyped animals and for low-density genotyped animals, accommodates a wide variety of pedigree structures for genotyped animals, imputes unmapped SNP, and works for large datasets. The method involves simple phasing rules, long-range phasing and haplotype library imputation and segregation analysis. Results Imputation accuracy was high and computational cost was feasible for datasets with pedigrees of up to 25 000 animals. The resulting single-stage genomic evaluation increased the accuracy of estimated genomic breeding values compared to a scenario in which phenotypes on relatives that were not genotyped were ignored. Conclusions The developed imputation algorithm and software and the resulting single-stage genomic evaluation method provide powerful new ways to exploit imputation and to obtain more accurate genetic evaluations. PMID:22462519

  11. Direct Detection of Insertion/Deletion Polymorphisms in an Autosomal Region by Analyzing High-Density Markers in Individual Spermatozoa

    PubMed Central

    Pramanik, Sreemanta; Li, Honghua

    2002-01-01

    Direct polymerase chain reaction (PCR) detection of insertion/deletion (indel) polymorphisms requires sample homozygosity. For the indel polymorphisms that have the deletion allele with a relatively low frequency in the autosomal regions, direct PCR detection becomes difficult or impossible. The present study is, to our knowledge, the first designed to directly detect indel polymorphisms in a human autosomal region (i.e., the immunoglobulin VH region), through use of single haploid sperm cells as subjects. Unique marker sequences (n=32), spaced at ∼5-kb intervals, were selected near the 3′ end of the VH region. A two-round multiplex PCR protocol was used to amplify these sequences from single sperm samples from nine unrelated healthy donors. The parental haplotypes of the donors were determined by examining the presence or absence of these markers. Seven clustered markers in 6 of the 18 haplotypes were missing and likely represented a 35–40-kb indel polymorphism. The genotypes of the donors, with respect to this polymorphism, perfectly matched the expectation under Hardy-Weinberg equilibrium. Three VH gene segments, of which two are functional, are affected by this polymorphism. According to these results, >10% of individuals in the human population may not have these gene segments in their genome, and ∼44% may have only one copy of these gene segments. The biological impact of this polymorphism would be very interesting to study. The approach used in the present study could be applied to understand the physical structure and diversity of all other autosomal regions. PMID:12442231
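
    The population estimates quoted in this record (>10% with no copy, ∼44% with one copy) are consistent with Hardy-Weinberg proportions if the deletion allele frequency is taken as the observed 6 of 18 haplotypes. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the assumption that the authors derived their figures this way is ours.

```python
from fractions import Fraction


def hardy_weinberg(q):
    """Expected genotype frequencies for a biallelic locus.

    q is the deletion-allele frequency; returns
    (homozygous deletion, heterozygous, homozygous presence).
    """
    p = 1 - q
    return q * q, 2 * p * q, p * p


# From the abstract: 6 of the 18 parental haplotypes lacked the markers.
q = Fraction(6, 18)
hom_del, het, hom_present = hardy_weinberg(q)

print(f"deletion allele frequency q: {float(q):.3f}")
print(f"no copy of the segments  (q^2): {float(hom_del):.1%}")
print(f"one copy of the segments (2pq): {float(het):.1%}")
print(f"two copies               (p^2): {float(hom_present):.1%}")
```

    Exact fractions are used so that the three genotype classes sum to exactly 1, which makes the check against the abstract's rounded percentages unambiguous.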

  12. The History of Bordetella pertussis Genome Evolution Includes Structural Rearrangement

    PubMed Central

    Peng, Yanhui; Loparev, Vladimir; Batra, Dhwani; Bowden, Katherine E.; Burroughs, Mark; Cassiday, Pamela K.; Davis, Jamie K.; Johnson, Taccara; Juieng, Phalasy; Knipe, Kristen; Mathis, Marsenia H.; Pruitt, Andrea M.; Rowe, Lori; Sheth, Mili; Tondella, M. Lucia; Williams, Margaret M.

    2017-01-01

    ABSTRACT Despite high pertussis vaccine coverage, reported cases of whooping cough (pertussis) have increased over the last decade in the United States and other developed countries. Although Bordetella pertussis is well known for its limited gene sequence variation, recent advances in long-read sequencing technology have begun to reveal genomic structural heterogeneity among otherwise indistinguishable isolates, even within geographically or temporally defined epidemics. We have compared rearrangements among complete genome assemblies from 257 B. pertussis isolates to examine the potential evolution of the chromosomal structure in a pathogen with minimal gene nucleotide sequence diversity. Discrete changes in gene order were identified that differentiated genomes from vaccine reference strains and clinical isolates of various genotypes, frequently along phylogenetic boundaries defined by single nucleotide polymorphisms. The observed rearrangements were primarily large inversions centered on the replication origin or terminus and flanked by IS481, a mobile genetic element with >240 copies per genome and previously suspected to mediate rearrangements and deletions by homologous recombination. These data illustrate that structural genome evolution in B. pertussis is not limited to reduction but also includes rearrangement. Therefore, although genomes of clinical isolates are structurally diverse, specific changes in gene order are conserved, perhaps due to positive selection, providing novel information for investigating disease resurgence and molecular epidemiology. IMPORTANCE Whooping cough, primarily caused by Bordetella pertussis, has resurged in the United States even though the coverage with pertussis-containing vaccines remains high. The rise in reported cases has included increased disease rates among all vaccinated age groups, provoking questions about the pathogen's evolution. The chromosome of B. pertussis includes a large number of repetitive mobile genetic elements that obstruct genome analysis. However, these mobile elements facilitate large rearrangements that alter the order and orientation of essential protein-encoding genes, which otherwise exhibit little nucleotide sequence diversity. By comparing the complete genome assemblies from 257 isolates, we show that specific rearrangements have been conserved throughout recent evolutionary history, perhaps by eliciting changes in gene expression, which may also provide useful information for molecular epidemiology. PMID:28167525

  13. Genome structure of a Saccharomyces cerevisiae strain widely used in bioethanol production

    PubMed Central

    Argueso, Juan Lucas; Carazzolle, Marcelo F.; Mieczkowski, Piotr A.; Duarte, Fabiana M.; Netto, Osmar V.C.; Missawa, Silvia K.; Galzerani, Felipe; Costa, Gustavo G.L.; Vidal, Ramon O.; Noronha, Melline F.; Dominska, Margaret; Andrietta, Maria G.S.; Andrietta, Sílvio R.; Cunha, Anderson F.; Gomes, Luiz H.; Tavares, Flavio C.A.; Alcarde, André R.; Dietrich, Fred S.; McCusker, John H.; Petes, Thomas D.; Pereira, Gonçalo A.G.

    2009-01-01

    Bioethanol is a biofuel produced mainly from the fermentation of carbohydrates derived from agricultural feedstocks by the yeast Saccharomyces cerevisiae. One of the most widely adopted strains is PE-2, a heterothallic diploid naturally adapted to the sugar cane fermentation process used in Brazil. Here we report the molecular genetic analysis of a PE-2 derived diploid (JAY270), and the complete genome sequence of a haploid derivative (JAY291). The JAY270 genome is highly heterozygous (∼2 SNPs/kb) and has several structural polymorphisms between homologous chromosomes. These chromosomal rearrangements are confined to the peripheral regions of the chromosomes, with breakpoints within repetitive DNA sequences. Despite its complex karyotype, this diploid, when sporulated, had a high frequency of viable spores. Hybrid diploids formed by outcrossing with the laboratory strain S288c also displayed good spore viability. Thus, the rearrangements that exist near the ends of chromosomes do not impair meiosis, as they do not span regions that contain essential genes. This observation is consistent with a model in which the peripheral regions of chromosomes represent plastic domains of the genome that are free to recombine ectopically and experiment with alternative structures. We also explored features of the JAY270 and JAY291 genomes that help explain their high adaptation to industrial environments, exhibiting desirable phenotypes such as high ethanol and cell mass production and high temperature and oxidative stress tolerance. The genomic manipulation of such strains could enable the creation of a new generation of industrial organisms, ideally suited for use as delivery vehicles for future bioenergy technologies. PMID:19812109

  14. Functional variability of glutathione S-transferases in Basque populations.

    PubMed

    Iorio, Andrea; Piacentini, Sara; Polimanti, Renato; De Angelis, Flavio; Calderon, Rosario; Fuciarelli, Maria

    2014-01-01

    Glutathione S-transferases (GSTs) are enzymes involved in Phase II reactions. They play a key role in cellular detoxification. Various studies have shown that genes coding for the GSTs are highly polymorphic and that some of these variants are directly associated with a decrease of enzyme activity, making individuals more susceptible to different clinical phenotypes. The aim of this study is to investigate the genetic variability of GST genes among human populations. We have focused our attention on the polymorphic variants of the GSTA1, GSTM1, GSTO1, GSTO2, GSTP1, GSTT1, and GSTT2B genes. These polymorphisms were analyzed in a whole sample of 151 individuals: 112 autochthonous Navarrese Basques, and 39 non-autochthonous Navarrese Basques. DNA extraction from plasma was performed by using the phenol:chloroform:isoamyl alcohol method. Genotyping of the gene polymorphisms was performed by multiplex PCR and the PCR-RFLP method. We applied correspondence analysis and built frequency maps to compare the genetic structure in worldwide populations. Our results were compared with data available on the Human Genome Diversity Project (HGDP) and on the 1000 Genomes Project to obtain information on the functional variability of GSTs in Basques. Our data indicated that Basque communities showed a higher differentiation of certain functional GST variants (i.e., GSTM1-positive/null genotype, GSTP1*I105V, and GSTT2B*1/0) than other European and Mediterranean populations. This might account for epidemiological differences in the predisposition to diseases and drug response among Basques and could be used to design and interpret genetic association studies for this particular population. Copyright © 2014 Wiley Periodicals, Inc.

  15. Role of gene polymorphisms in gastric cancer and its precursor lesions: Current knowledge and perspectives in Latin American countries

    PubMed Central

    Chiurillo, Miguel Angel

    2014-01-01

    Latin America shows one of the highest incidence rates of gastric cancer in the world, with variations in mortality rates among nations or even within countries belonging to this region. Gastric cancer is the result of a multifactorial complex process, for which a multistep model of carcinogenesis is currently accepted. In addition to infection with Helicobacter pylori, which plays a major role, environmental factors as well as genetic susceptibility factors are significant players at different stages in the gastric cancer process. The differences in population origin, demographic structure, socio-economic development, and the impact of globalization lifestyles experienced in Latin America in the last decades together offer opportunities for studying in this context the influence of genetic polymorphisms on susceptibility to gastric cancer. The aim of this article is to discuss current trends on gastric cancer in Latin American countries and to review the available published information about studies of association of gene polymorphisms involved in gastric cancer susceptibility from this region of the world. A total of 40 genes or genomic regions and 69 genetic variants, 58% representing markers involved in inflammatory response, have been used in a number of studies, most of which included a low number of individuals (cases and controls). Polymorphisms of IL-1B (-511 C/T, 14 studies; -31 T/C, 10 studies) and IL-1RN (variable number of tandem repeats, 17 studies) are the most represented ones in the reviewed studies. Other genetic variants recently evaluated in large meta-analyses and associated with gastric cancer risk were also analyzed in a few studies [e.g., prostate stem cell antigen (PSCA), CDH1, Survivin]. Further and better analysis centered on gene polymorphisms linked to other covariates, epidemiological studies and the information provided by meta-analyses and genome-wide association studies should help to improve our understanding of gastric cancer etiology in order to develop appropriate health programs in Latin America. PMID:24782603

  16. Characterization of polyploid wheat genomic diversity using a high-density 90 000 single nucleotide polymorphism array

    PubMed Central

    Wang, Shichen; Wong, Debbie; Forrest, Kerrie; Allen, Alexandra; Chao, Shiaoman; Huang, Bevan E; Maccaferri, Marco; Salvi, Silvio; Milner, Sara G; Cattivelli, Luigi; Mastrangelo, Anna M; Whan, Alex; Stephen, Stuart; Barker, Gary; Wieseke, Ralf; Plieske, Joerg; International Wheat Genome Sequencing Consortium; Lillemo, Morten; Mather, Diane; Appels, Rudi; Dolferus, Rudy; Brown-Guedira, Gina; Korol, Abraham; Akhunova, Alina R; Feuillet, Catherine; Salse, Jerome; Morgante, Michele; Pozniak, Curtis; Luo, Ming-Cheng; Dvorak, Jan; Morell, Matthew; Dubcovsky, Jorge; Ganal, Martin; Tuberosa, Roberto; Lawley, Cindy; Mikoulitch, Ivan; Cavanagh, Colin; Edwards, Keith J; Hayden, Matthew; Akhunov, Eduard

    2014-01-01

    High-density single nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships between individuals in populations and studying marker–trait associations in mapping experiments. We developed a genotyping array including about 90 000 gene-associated SNPs and used it to characterize genetic variation in allohexaploid and allotetraploid wheat populations. The array includes a significant fraction of common genome-wide distributed SNPs that are represented in populations of diverse geographical origin. We used density-based spatial clustering algorithms to enable high-throughput genotype calling in complex data sets obtained for polyploid wheat. We show that these model-free clustering algorithms provide accurate genotype calling in the presence of multiple clusters including clusters with low signal intensity resulting from significant sequence divergence at the target SNP site or gene deletions. Assays that detect low-intensity clusters can provide insight into the distribution of presence–absence variation (PAV) in wheat populations. A total of 46 977 SNPs from the wheat 90K array were genetically mapped using a combination of eight mapping populations. The developed array and cluster identification algorithms provide an opportunity to infer detailed haplotype structure in polyploid wheat and will serve as an invaluable resource for diversity studies and investigating the genetic basis of trait variation in wheat. PMID:24646323

  17. Candidate gene association studies in syndromic and non-syndromic cleft lip and palate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daack-Hirsch, S.; Basart, A.; Frischmeyer, P.

    1994-09-01

    Using ongoing case ascertainment through a birth defects registry, we have collected 219 nuclear families with non-syndromic cleft lip and/or palate and 111 families with a collection of syndromic forms. Syndromic cases include 24 with recognized forms and 72 with unrecognized syndromes. Candidate gene studies as well as genome-wide searches for evidence of microdeletions and isodisomy are currently being carried out. Candidate gene association studies, to date, have made use of PCR-based polymorphisms for TGFA, MSX1, CLPG13 (a CA repeat associated with a human homologue of a locus that results in craniofacial dysmorphogenesis in the mouse) and an STRP found in a Van der Woude syndrome microdeletion. Control tetranucleotide repeats, which ensure that population-based differences are not responsible for any observed associations, are also tested. Studies of the syndromic cases have included the same list of candidate genes searching for evidence of microdeletions and a genome-wide search using tri- and tetranucleotide polymorphic markers to search for isodisomy or structural rearrangements. Significant associations have previously been identified for TGFA, and, in this report, identified for MSX1 and nonsyndromic cleft palate only (p = 0.04, uncorrected). Preliminary results of the genome-wide scan for isodisomy have returned no true positives and there has been no evidence for microdeletion cases.

  18. A massively parallel strategy for STR marker development, capture, and genotyping.

    PubMed

    Kistler, Logan; Johnson, Stephen M; Irwin, Mitchell T; Louis, Edward E; Ratan, Aakrosh; Perry, George H

    2017-09-06

    Short tandem repeat (STR) variants are highly polymorphic markers that facilitate powerful population genetic analyses. STRs are especially valuable in conservation and ecological genetic research, yielding detailed information on population structure and short-term demographic fluctuations. Massively parallel sequencing has not previously been leveraged for scalable, efficient STR recovery. Here, we present a pipeline for developing STR markers directly from high-throughput shotgun sequencing data without a reference genome, and an approach for highly parallel target STR recovery. We employed our approach to capture a panel of 5000 STRs from a test group of diademed sifakas (Propithecus diadema, n = 3), endangered Malagasy rainforest lemurs, and we report extremely efficient recovery of targeted loci: 97.3-99.6% of STRs characterized with ≥10x non-redundant sequence coverage. We then tested our STR capture strategy on P. diadema fecal DNA, and report robust initial results and suggestions for future implementations. In addition to STR targets, this approach also generates large, genome-wide single nucleotide polymorphism (SNP) panels from flanking regions. Our method provides a cost-effective and scalable solution for rapid recovery of large STR and SNP datasets in any species without needing a reference genome, and can be used even with suboptimal DNA more easily acquired in conservation and ecological studies. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
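
    The abstract does not describe how candidate STRs are recognised in shotgun reads. As an illustration only, and not the authors' pipeline, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The function name, the unit-length range and the minimum repeat count are all assumptions chosen for the example.

```python
import re


def find_strs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Yield (start, unit, repeat_count) for perfect tandem repeats in seq.

    The lazy quantifier tries the shortest repeat unit first, so a run of
    'CACACACA' is reported with unit 'CA' rather than 'CACA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as 'AAAAAAAA'
        yield match.start(), unit, len(match.group(0)) // len(unit)


# Hypothetical read containing a (CA)6 and a (GATA)5 repeat.
read = "GGTT" + "CA" * 6 + "TTG" + "GATA" * 5 + "CC"
for start, unit, count in find_strs(read):
    print(f"pos {start}: ({unit}){count}")
```

    A real marker-development pipeline would also need to handle imperfect repeats, sequencing errors and flanking-sequence uniqueness, none of which this sketch attempts.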

  19. Characterization of the breakpoints of a polymorphic inversion complex detects strict and broad breakpoint reuse at the molecular level.

    PubMed

    Puerma, Eva; Orengo, Dorcas J; Salguero, David; Papaceit, Montserrat; Segarra, Carmen; Aguadé, Montserrat

    2014-09-01

    Inversions are an integral part of structural variation within species, and they play a leading role in genome reorganization across species. Work at both the cytological and genome sequence levels has revealed heterogeneity in the distribution of inversion breakpoints, with some regions being recurrently used. Breakpoint reuse at the molecular level has mostly been assessed for fixed inversions through genome sequence comparison, and therefore rather broadly. Here, we have identified and sequenced the breakpoints of two polymorphic inversions (E1 and E2, which share a breakpoint) in the extant Est and E1 + 2 chromosomal arrangements of Drosophila subobscura. The breakpoints are two medium-sized repeated motifs that mediated the inversions by two different mechanisms: E1 via staggered breaks and subsequent repair and E2 via repeat-mediated ectopic recombination. The fine delimitation of the shared breakpoint revealed its strict reuse at the molecular level regardless of which was the intermediate arrangement. The occurrence of other rearrangements in the most proximal and distal extended breakpoint regions reveals the broad reuse of these regions. This differential degree of fragility might be related to their sharing the presence outside the inverted region of snoRNA-encoding genes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.

  20. Single nucleotide polymorphism discovery in cutthroat trout subspecies using genome reduction, barcoding, and 454 pyro-sequencing

    PubMed Central

    2012-01-01

    Background Salmonids are popular sport fishes, and as such have been subjected to widespread stocking throughout western North America. Historically, stocking was done with little regard for genetic variation among populations and has resulted in genetic mixing among species and subspecies in many areas, thus putting the genetic integrity of native salmonid populations at risk and creating a need to assess the genetic constitution of native salmonid populations. Cutthroat trout is a salmonid species with pronounced geographic structure (there are 10 extant subspecies) and a recent history of hybridization with introduced rainbow trout in many populations. Genetic admixture has also occurred among cutthroat trout subspecies in areas where introductions have brought two or more subspecies into contact. Consequently, management agencies have increased their efforts to evaluate the genetic composition of cutthroat trout populations to identify populations that remain uncompromised and manage them accordingly, but additional genetic markers are needed to do so effectively. Here we used genome reduction, MID-barcoding, and 454-pyrosequencing to discover single nucleotide polymorphisms that differentiate cutthroat trout subspecies and can be used as a rapid, cost-effective method to characterize the genetic composition of cutthroat trout populations. Results Thirty cutthroat and six rainbow trout individuals were subjected to genome reduction and next-generation sequencing. A total of 1,499,670 reads averaging 379 base pairs in length were generated by 454-pyrosequencing, resulting in 569,060,077 total base pairs sequenced. A total of 43,558 putative SNPs were identified, and of those, 125 SNP primers were developed that successfully amplified 96 cutthroat trout and rainbow trout individuals. These SNP loci were able to differentiate most cutthroat trout subspecies using distance methods and Structure analyses. 
    Conclusions Genomic and bioinformatic protocols were successfully implemented to identify 125 nuclear SNPs that are capable of differentiating most subspecies of cutthroat trout from one another. The ability to use this suite of SNPs to identify individuals of unknown genetic background to subspecies can be a valuable tool for management agencies in their efforts to evaluate the genetic structure of cutthroat trout populations prior to constructing and implementing conservation plans. PMID:23259499

  1. Development of a 690K SNP array in catfish and its application for genetic mapping and validation of the reference genome sequence

    USDA-ARS's Scientific Manuscript database

    Single nucleotide polymorphisms (SNPs) are capable of providing the highest level of genome coverage for genomic and genetic analysis because of their abundance and relatively even distribution in the genome. Such a capacity, however, cannot be achieved without an efficient genotyping platform such ...

  2. Genome Comparisons Reveal a Dominant Mechanism of Chromosome Number Reduction in Grasses and Accelerated Genome Evolution in Triticeae

    USDA-ARS's Scientific Manuscript database

    Single nucleotide polymorphism was employed in the construction of a high-resolution, expressed sequence tag (EST) map of Aegilops tauschii, the diploid source of the wheat D genome. Comparison of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and...

  3. Survival analysis of infected mice reveals pathogenic variations in the genome of avian H1N1 viruses.

    PubMed

    Koçer, Zeynep A; Fan, Yiping; Huether, Robert; Obenauer, John; Webby, Richard J; Zhang, Jinghui; Webster, Robert G; Wu, Gang

    2014-12-12

    Most influenza pandemics have been caused by H1N1 viruses of purely or partially avian origin. Here, using a Cox proportional hazards model, we attempt to identify the genetic variations in the whole genome of wild-type North American avian H1N1 influenza A viruses that are associated with their virulence in mice by residue variations, host origins of virus (Anseriformes-ducks or Charadriiformes-shorebirds), and host-residue interactions. In addition, through structural modeling, we predicted that several polymorphic sites associated with pathogenicity were located in structurally important sites, especially in the polymerase complex and NS genes. Our study introduces a new approach to identify pathogenic variations in wild-type viruses circulating in the natural reservoirs and ultimately to understand their infectious risks to humans as part of risk assessment efforts towards the emergence of future pandemic strains.

  4. 11,670 whole-genome sequences representative of the Han Chinese population from the CONVERGE project.

    PubMed

    Cai, Na; Bigdeli, Tim B; Kretzschmar, Warren W; Li, Yihan; Liang, Jieqin; Hu, Jingchu; Peterson, Roseann E; Bacanu, Silviu; Webb, Bradley Todd; Riley, Brien; Li, Qibin; Marchini, Jonathan; Mott, Richard; Kendler, Kenneth S; Flint, Jonathan

    2017-02-14

    The China, Oxford and Virginia Commonwealth University Experimental Research on Genetic Epidemiology (CONVERGE) project on Major Depressive Disorder (MDD) sequenced 11,670 female Han Chinese at low-coverage (1.7X), providing the first large-scale whole genome sequencing resource representative of the largest ethnic group in the world. Samples are collected from 58 hospitals from 23 provinces around China. We are able to call 22 million high quality single nucleotide polymorphisms (SNP) from the nuclear genome, representing the largest SNP call set from an East Asian population to date. We use these variants for imputation of genotypes across all samples, and this has allowed us to perform a successful genome wide association study (GWAS) on MDD. The utility of these data can be extended to studies of genetic ancestry in the Han Chinese and evolutionary genetics when integrated with data from other populations. Molecular phenotypes, such as copy number variations and structural variations can be detected, quantified and analysed in similar ways.

  5. Application of Nexus copy number software for CNV detection and analysis.

    PubMed

    Darvishi, Katayoon

    2010-04-01

    Among human structural genomic variation, copy number variants (CNVs) are the most frequently known component, comprised of gains/losses of DNA segments that are generally 1 kb in length or longer. Array-based comparative genomic hybridization (aCGH) has emerged as a powerful tool for detecting genomic copy number variants (CNVs). With the rapid increase in the density of array technology and with the adaptation of new high-throughput technology, a reliable and computationally scalable method for accurate mapping of recurring DNA copy number aberrations has become a main focus in research. Here we introduce Nexus Copy Number software, a platform-independent tool, to analyze the output files of all types of commercial and custom-made comparative genomic hybridization (CGH) and single-nucleotide polymorphism (SNP) arrays, such as those manufactured by Affymetrix, Agilent Technologies, Illumina, and Roche NimbleGen. It also supports data generated by various array image-analysis software tools such as GenePix, ImaGene, and BlueFuse. (c) 2010 by John Wiley & Sons, Inc.

  6. Microsatellite Interruptions Stabilize Primate Genomes and Exist as Population-Specific Single Nucleotide Polymorphisms within Individual Human Genomes

    PubMed Central

    Ananda, Guruprasad; Hile, Suzanne E.; Breski, Amanda; Wang, Yanli; Kelkar, Yogeshwar; Makova, Kateryna D.; Eckert, Kristin A.

    2014-01-01

    Interruptions of microsatellite sequences impact genome evolution and can alter disease manifestation. However, human polymorphism levels at interrupted microsatellites (iMSs) are not known at a genome-wide scale, and the pathways for gaining interruptions are poorly understood. Using the 1000 Genomes Phase-1 variant call set, we interrogated mono-, di-, tri-, and tetranucleotide repeats up to 10 units in length. We detected ∼26,000–40,000 iMSs within each of four human population groups (African, European, East Asian, and American). We identified population-specific iMSs within exonic regions, and discovered that known disease-associated iMSs contain alleles present at differing frequencies among the populations. By analyzing longer microsatellites in primate genomes, we demonstrate that single interruptions result in a genome-wide average two- to six-fold reduction in microsatellite mutability, as compared with perfect microsatellites. Centrally located interruptions lowered mutability dramatically, by two to three orders of magnitude. Using a biochemical approach, we tested directly whether the mutability of a specific iMS is lower because of decreased DNA polymerase strand slippage errors. Modeling the adenomatous polyposis coli tumor suppressor gene sequence, we observed that a single base substitution interruption reduced strand slippage error rates five- to 50-fold, relative to a perfect repeat, during synthesis by DNA polymerases α, β, or η. Computationally, we demonstrate that iMSs arise primarily by base substitution mutations within individual human genomes. Our biochemical survey of human DNA polymerase α, β, δ, κ, and η error rates within certain microsatellites suggests that interruptions are created most frequently by low fidelity polymerases. Our combined computational and biochemical results demonstrate that iMSs are abundant in human genomes and are sources of population-specific genetic variation that may affect genome stability. 
    The genome-wide identification of iMSs in human populations presented here has important implications for current models describing the impact of microsatellite polymorphisms on gene expression. PMID:25033203

  7. Genetic diversity and population structure of Musa accessions in ex situ conservation

    PubMed Central

    2013-01-01

    Background Banana cultivars are mostly derived from hybridization between wild diploid subspecies of Musa acuminata (A genome) and M. balbisiana (B genome), and they exhibit various levels of ploidy and genomic constitution. The Embrapa ex situ Musa collection contains over 220 accessions, of which only a few have been genetically characterized. Knowledge regarding the genetic relationships and diversity between modern cultivars and wild relatives would assist in conservation and breeding strategies. Our objectives were to determine the genomic constitution based on Internal Transcribed Spacer (ITS) regions polymorphism and the ploidy of all accessions by flow cytometry and to investigate the population structure of the collection using Simple Sequence Repeat (SSR) loci as co-dominant markers based on Structure software, not previously performed in Musa. Results From the 221 accessions analyzed by flow cytometry, the correct ploidy was confirmed or established for 212 (95.9%), whereas digestion of the ITS region confirmed the genomic constitution of 209 (94.6%). Neighbor-joining clustering analysis derived from SSR binary data allowed the detection of two major groups, essentially distinguished by the presence or absence of the B genome, while subgroups were formed according to the genomic composition and commercial classification. The co-dominant nature of SSR was explored to analyze the structure of the population based on a Bayesian approach, detecting 21 subpopulations. Most of the subpopulations were in agreement with the clustering analysis. Conclusions The data generated by flow cytometry, ITS and SSR supported the hypothesis about the occurrence of homeologue recombination between A and B genomes, leading to discrepancies in the number of sets or portions from each parental genome. These phenomena have been largely disregarded in the evolution of banana, as the “single-step domestication” hypothesis had long predominated.
    These findings will have an impact in future breeding approaches. Structure analysis enabled the efficient detection of ancestry of recently developed tetraploid hybrids by breeding programs, and for some triploids. However, for the main commercial subgroups, Structure appeared to be less efficient to detect the ancestry in diploid groups, possibly due to sampling restrictions. The possibility of inferring the membership among accessions to correct the effects of genetic structure opens possibilities for its use in marker-assisted selection by association mapping. PMID:23497122

  8. Development and characterization of 32 microsatellite loci in Genipa americana (Rubiaceae)

    PubMed Central

    Manoel, Ricardo O.; Freitas, Miguel L. M.; Barreto, Mariana A.; Moraes, Mário L. T.; Souza, Anete P.; Sebbenn, Alexandre M.

    2014-01-01

    • Premise of the study: Microsatellite primers were developed for the tree species Genipa americana (Rubiaceae) for further population genetic studies. • Methods and Results: We identified 144 clones containing 65 repeat motifs from a genomic library enriched for (CT)8 and (GT)8 motifs. Primer pairs were developed for 32 microsatellite loci and validated in 40 individuals of two natural G. americana populations. Seventeen loci were polymorphic, revealing from three to seven alleles per locus. The observed and expected heterozygosities ranged from 0.24 to 1.00 and from 0.22 to 0.78, respectively. • Conclusions: The 17 primers identified as polymorphic loci are suitable to study the genetic diversity and structure, mating system, and gene flow in G. americana. PMID:25202610
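
    The observed and expected heterozygosities reported above can be illustrated with a minimal sketch. It assumes the plain (uncorrected) gene-diversity formula He = 1 - sum(p_i^2); the record does not say which estimator the authors used, and the genotypes below are hypothetical allele sizes, not data from the study.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Gene diversity He = 1 - sum(p_i^2) from diploid genotypes (no sample-size correction)."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical genotypes at one microsatellite locus (allele sizes in bp).
genotypes = [(150, 152), (150, 150), (152, 154), (150, 154)]
he = expected_heterozygosity(genotypes)
ho = observed_heterozygosity(genotypes)
print(f"He = {he:.3f}, Ho = {ho:.3f}")
```

    An observed value above the expected one, as at the upper end of the ranges in this record (1.00 versus 0.78), indicates an excess of heterozygotes relative to Hardy-Weinberg proportions.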

  9. Single nucleotide polymorphisms in the bovine MHC region of Japanese Black cattle are associated with bovine leukemia virus proviral load.

    PubMed

    Takeshima, Shin-Nosuke; Sasaki, Shinji; Meripet, Polat; Sugimoto, Yoshikazu; Aida, Yoko

    2017-04-04

    Bovine leukemia virus (BLV) is the causative agent of enzootic bovine leukosis, a malignant B cell lymphoma that has spread worldwide and causes serious problems for the cattle industry. The BLV proviral load, which represents the BLV genome integrated into host genome, is a useful index for estimating disease progression and transmission risk. Here, we conducted a genome-wide association study to identify single nucleotide polymorphisms (SNPs) associated with BLV proviral load in Japanese Black cattle. The study examined 93 cattle with a high proviral load and 266 with a low proviral load. Three SNPs showed a significant association with proviral load. One SNP was detected in the CNTN3 gene on chromosome 22, and two (which were not in linkage disequilibrium) were detected in the bovine major histocompatibility complex region on chromosome 23. These results suggest that polymorphisms in the major histocompatibility complex region affect proviral load. This is the first report to detect SNPs associated with BLV proviral load in Japanese Black cattle using whole genome association study, and understanding host factors may provide important clues for controlling the spread of BLV in Japanese Black cattle.

  10. Analysis for complete genomic sequence of HLA-B and HLA-C alleles in the Chinese Han population.

    PubMed

    Zhu, F; He, Y; Zhang, W; He, J; He, J; Xu, X; Lv, H; Yan, L

    2011-08-01

    In the present study, we have determined the complete genomic sequence and analysed the intron polymorphism of partial HLA-B and HLA-C alleles in the Chinese Han population. Over 3.0 kb DNA fragments of HLA-B and HLA-C loci were amplified by polymerase chain reaction from partial 5' untranslated region to 3' noncoding region respectively, and then the amplified products were sequenced. Full-length nucleotide sequences of 14 HLA-B alleles and 10 HLA-C alleles were obtained and have been submitted to GenBank and IMGT/HLA database. Two novel alleles, HLA-B*52:01:01:02 and HLA-B*59:01:01:02, were identified, and the complete genomic sequence of HLA-B*52:01:01:01 was reported for the first time. In total, 157 and 167 polymorphic positions were found in the full-length genomic sequences of the HLA-B and HLA-C loci, respectively. Our results suggest that many single nucleotide polymorphisms exist in the exon and intron regions, and the data can provide useful information for understanding the evolution of HLA-B and HLA-C alleles. © 2011 Blackwell Publishing Ltd.

  11. Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

    PubMed Central

    Desikan, Srinidhi; Narayanan, Sujatha

    2015-01-01

    Molecular epidemiology (ME) is one of the main areas in tuberculosis research which is widely used to study the transmission epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect towards ME was introduced with the development of whole genome sequencing (WGS) and the next generation sequencing (NGS) methods, where the entire genome is sequenced that not only helps in pointing out minute differences between the various sequences but also saves time and the cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), comparative genomics and also various aspects about transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019

  12. Validation of the high-throughput marker technology DArT using the model plant Arabidopsis thaliana.

    PubMed

    Wittenberg, Alexander H J; van der Lee, Theo; Cayla, Cyril; Kilian, Andrzej; Visser, Richard G F; Schouten, Henk J

    2005-08-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds of restriction site based polymorphisms between genotypes and does not require DNA sequence information or site-specific oligonucleotides. This paper demonstrates the potential of DArT for genetic mapping by validating the quality and molecular basis of the markers, using the model plant Arabidopsis thaliana. Restriction fragments from a genomic representation of the ecotype Landsberg erecta (Ler) were amplified by PCR, individualized by cloning and spotted onto glass slides. The arrays were then hybridized with labeled genomic representations of the ecotypes Columbia (Col) and Ler and of individuals from an F(2) population obtained from a Col x Ler cross. The scoring of markers with specialized software was highly reproducible and 107 markers could unambiguously be ordered on a genetic linkage map. The marker order on the genetic linkage map coincided with the order on the DNA sequence map. Sequencing of the Ler markers and alignment with the available Col genome sequence confirmed that the polymorphism in DArT markers is largely a result of restriction site polymorphisms.

  13. Chromosomal localization and partial genomic structure of the human peroxisome proliferator activated receptor-gamma (hPPAR gamma) gene.

    PubMed

    Beamer, B A; Negri, C; Yen, C J; Gavrilova, O; Rumberger, J M; Durcan, M J; Yarnall, D P; Hawkins, A L; Griffin, C A; Burns, D K; Roth, J; Reitman, M; Shuldiner, A R

    1997-04-28

    We determined the chromosomal localization and partial genomic structure of the coding region of the human PPAR gamma gene (hPPAR gamma), a nuclear receptor important for adipocyte differentiation and function. Sequence analysis and long PCR of human genomic DNA with primers that span putative introns revealed that intron positions and sizes of hPPAR gamma are similar to those previously determined for the mouse PPAR gamma gene[13]. Fluorescent in situ hybridization localized hPPAR gamma to chromosome 3, band 3p25. Radiation hybrid mapping with two independent primer pairs was consistent with hPPAR gamma being within 1.5 Mb of marker D3S1263 on 3p25-p24.2. These sequences of the intron/exon junctions of the 6 coding exons shared by hPPAR gamma 1 and hPPAR gamma 2 will facilitate screening for possible mutations. Furthermore, D3S1263 is a suitable polymorphic marker for linkage analysis to evaluate PPAR gamma's potential contribution to genetic susceptibility to obesity, lipoatrophy, insulin resistance, and diabetes.

  14. First Insights into the Genetic Diversity of the Pinewood Nematode in Its Native Area Using New Polymorphic Microsatellite Loci

    PubMed Central

    Mallez, Sophie; Castagnone, Chantal; Espada, Margarida; Vieira, Paulo; Eisenback, Jonathan D.; Mota, Manuel; Guillemaud, Thomas; Castagnone-Sereno, Philippe

    2013-01-01

    The pinewood nematode, Bursaphelenchus xylophilus, native to North America, is the causative agent of pine wilt disease and among the most important invasive forest pests in the East-Asian countries, such as Japan and China. Since 1999, it has been found in Europe in the Iberian Peninsula, where it also causes significant damage. In a previous study, 94 pairs of microsatellite primers have been identified in silico in the pinewood nematode genome. In the present study, specific PCR amplifications and polymorphism tests to validate these loci were performed and 17 microsatellite loci that were suitable for routine analysis of B. xylophilus genetic diversity were selected. The polymorphism of these markers was evaluated on nematodes from four field origins and one laboratory collection strain, all originating from the native area. The number of alleles and the expected heterozygosity varied between 2 and 11 and between 0.039 and 0.777, respectively. First insights into the population genetic structure of B. xylophilus were obtained using clustering and multivariate methods on the genotypes obtained from the field samples. The results showed that the pinewood nematode genetic diversity is spatially structured at the scale of the pine tree and probably at larger scales. The role of dispersal by the insect vector versus human activities in shaping this structure is discussed. PMID:23554990

  15. The Population History of Endogenous Retroviruses in Mule Deer (Odocoileus hemionus)

    PubMed Central

    2014-01-01

    Mobile elements are powerful agents of genomic evolution and can be exceptionally informative markers for investigating species and population-level evolutionary history. While several studies have utilized retrotransposon-based insertional polymorphisms to resolve phylogenies, few population studies exist outside of humans. Endogenous retroviruses are LTR-retrotransposons derived from retroviruses that have become stably integrated in the host genome during past infections and transmitted vertically to subsequent generations. They offer valuable insight into host-virus co-evolution and a unique perspective on host evolutionary history because they integrate into the genome at a discrete point in time. We examined the evolutionary history of a cervid endogenous gammaretrovirus (CrERVγ) in mule deer (Odocoileus hemionus). We sequenced 14 CrERV proviruses (CrERV-in1 to -in14), and examined the prevalence and distribution of 13 proviruses in 262 deer among 15 populations from Montana, Wyoming, and Utah. CrERV absence in white-tailed deer (O. virginianus), identical 5′ and 3′ long terminal repeat (LTR) sequences, insertional polymorphism, and CrERV divergence time estimates indicated that most endogenization events occurred within the last 200,000 years. Population structure inferred from CrERVs (FST = 0.008) and microsatellites (θ = 0.01) was low, but significant, with Utah, northwestern Montana, and a Helena herd being particularly differentiated. Clustering analyses indicated regional structuring, and non-contiguous clustering could often be explained by known translocations. Cluster ensemble results indicated spatial localization of viruses, specifically in deer from northeastern and western Montana. This study demonstrates the utility of endogenous retroviruses to elucidate and provide novel insight into both ERV evolutionary history and the history of contemporary host populations. PMID:24336966

  16. Balancing Selection on a Regulatory Region Exhibiting Ancient Variation That Predates Human–Neandertal Divergence

    PubMed Central

    Iskow, Rebecca C.; Austermann, Christian; Scharer, Christopher D.; Raj, Towfique; Boss, Jeremy M.; Sunyaev, Shamil; Price, Alkes; Stranger, Barbara; Simon, Viviana; Lee, Charles

    2013-01-01

    Ancient population structure shaping contemporary genetic variation has been recently appreciated and has important implications regarding our understanding of the structure of modern human genomes. We identified a ∼36-kb DNA segment in the human genome that displays an ancient substructure. The variation at this locus exists primarily as two highly divergent haplogroups. One of these haplogroups (the NE1 haplogroup) aligns with the Neandertal haplotype and contains a 4.6-kb deletion polymorphism in perfect linkage disequilibrium with 12 single nucleotide polymorphisms (SNPs) across diverse populations. The other haplogroup, which does not contain the 4.6-kb deletion, aligns with the chimpanzee haplotype and is likely ancestral. Africans have higher overall pairwise differences with the Neandertal haplotype than Eurasians do for this NE1 locus (p < 10^−15). Moreover, the nucleotide diversity at this locus is higher in Eurasians than in Africans. These results mimic signatures of recent Neandertal admixture contributing to this locus. However, an in-depth assessment of the variation in this region across multiple populations reveals that African NE1 haplotypes, albeit rare, harbor more sequence variation than NE1 haplotypes found in Europeans, indicating an ancient African origin of this haplogroup and refuting recent Neandertal admixture. Population genetic analyses of the SNPs within each of these haplogroups, along with genome-wide comparisons revealed significant FST (p = 0.00003) and positive Tajima's D (p = 0.00285) statistics, pointing to non-neutral evolution of this locus. The NE1 locus harbors no protein-coding genes, but contains transcribed sequences as well as sequences with putative regulatory function based on bioinformatic predictions and in vitro experiments. We postulate that the variation observed at this locus predates Human–Neandertal divergence and is evolving under balancing selection, especially among European populations. PMID:23593015

  17. The population history of endogenous retroviruses in mule deer (Odocoileus hemionus)

    USGS Publications Warehouse

    Kamath, Pauline L.; Elleder, Daniel; Bao, Le; Cross, Paul C.; Powell, John H.; Poss, Mary

    2013-01-01

    Mobile elements are powerful agents of genomic evolution and can be exceptionally informative markers for investigating species and population-level evolutionary history. While several studies have utilized retrotransposon-based insertional polymorphisms to resolve phylogenies, few population studies exist outside of humans. Endogenous retroviruses are LTR-retrotransposons derived from retroviruses that have become stably integrated in the host genome during past infections and transmitted vertically to subsequent generations. They offer valuable insight into host-virus co-evolution and a unique perspective on host evolutionary history because they integrate into the genome at a discrete point in time. We examined the evolutionary history of a cervid endogenous gammaretrovirus (CrERVγ) in mule deer (Odocoileus hemionus). We sequenced 14 CrERV proviruses (CrERV-in1 to -in14), and examined the prevalence and distribution of 13 proviruses in 262 deer among 15 populations from Montana, Wyoming, and Utah. CrERV absence in white-tailed deer (O. virginianus), identical 5′ and 3′ long terminal repeat (LTR) sequences, insertional polymorphism, and CrERV divergence time estimates indicated that most endogenization events occurred within the last 200,000 years. Population structure inferred from CrERVs (FST = 0.008) and microsatellites (θ = 0.01) was low, but significant, with Utah, northwestern Montana, and a Helena herd being particularly differentiated. Clustering analyses indicated regional structuring, and non-contiguous clustering could often be explained by known translocations. Cluster ensemble results indicated spatial localization of viruses, specifically in deer from northeastern and western Montana. This study demonstrates the utility of endogenous retroviruses to elucidate and provide novel insight into both ERV evolutionary history and the history of contemporary host populations.

  18. Genetic Diversity and Population Structure of Whitebark Pine (Pinus albicaulis Engelm.) in Western North America

    PubMed Central

    Liu, Jun-Jun; Sniezko, Richard; Murray, Michael; Wang, Ning; Chen, Hao; Zamany, Arezoo; Sturrock, Rona N.; Savin, Douglas; Kegley, Angelia

    2016-01-01

    Whitebark pine (WBP, Pinus albicaulis Engelm.) is an endangered conifer species due to heavy mortality from white pine blister rust (WPBR, caused by Cronartium ribicola) and mountain pine beetle (Dendroctonus ponderosae). Information about genetic diversity and population structure is of fundamental importance for its conservation and restoration. However, current knowledge on the genetic constitution and genomic variation is still limited for WBP. In this study, an integrated genomics approach was applied to characterize seed collections from WBP breeding programs in western North America. RNA-seq analysis was used for de novo assembly of the WBP needle transcriptome, which contains 97,447 protein-coding transcripts. Within the transcriptome, single nucleotide polymorphisms (SNPs) were discovered, and more than 22,000 of them were non-synonymous SNPs (ns-SNPs). Following the annotation of genes with ns-SNPs, 216 ns-SNPs within candidate genes with putative functions in disease resistance and plant defense were selected to design SNP arrays for high-throughput genotyping. Among these SNP loci, 71 were highly polymorphic, with sufficient variation to identify a unique genotype for each of the 371 individuals originating from British Columbia (Canada), Oregon and Washington (USA). A clear genetic differentiation was evident among seed families. Analyses of genetic spatial patterns revealed varying degrees of diversity and the existence of several genetic subgroups in the WBP breeding populations. Genetic components were associated with geographic variables and phenotypic rating of WPBR disease severity across landscapes, which may facilitate further identification of WBP genotypes and gene alleles contributing to local adaptation and quantitative resistance to WPBR. 
    The WBP genomic resources developed here provide an invaluable tool for further studies and for exploitation and utilization of the genetic diversity preserved within this endangered conifer and other five-needle pines. PMID:27992468

  19. Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

    PubMed Central

    2011-01-01

    Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336

  20. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    PubMed Central

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515
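
    The reported average marker interval can be checked against the other figures in this record with a few lines of arithmetic. This is only a consistency sketch: the number of linkage groups (9) is an assumption based on the haploid chromosome number of Brassica oleracea and is not stated in the record, and the variable names are illustrative.

```python
# Figures taken from the record above.
total_length_cm = 890.01   # total genetic length of the map
n_markers = 1776           # high-quality SLAF markers on the map
n_snps = 2741              # SNPs carried by those markers

# Assumption (not stated in the record): one linkage group per chromosome.
n_linkage_groups = 9

# Two common conventions: length per marker, or length per gap between
# adjacent markers (each linkage group has one fewer gap than markers).
cm_per_marker = total_length_cm / n_markers
cm_per_interval = total_length_cm / (n_markers - n_linkage_groups)
snps_per_marker = n_snps / n_markers

print(f"{cm_per_marker:.2f} cM per marker")
print(f"{cm_per_interval:.2f} cM per interval")
print(f"{snps_per_marker:.2f} SNPs per SLAF marker")
```

    Both conventions round to the 0.50 cM quoted in the record, so the choice between them does not affect the reported value.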

  1. Estimation of linkage disequilibrium and interspecific gene flow in Ficedula flycatchers by a newly developed 50k single-nucleotide polymorphism array

    PubMed Central

    Kawakami, Takeshi; Backström, Niclas; Burri, Reto; Husby, Arild; Olason, Pall; Rice, Amber M; Ålund, Murielle; Qvarnström, Anna; Ellegren, Hans

    2014-01-01

    With the access to draft genome sequence assemblies and whole-genome resequencing data from population samples, molecular ecology studies will be able to take truly genome-wide approaches. This now applies to an avian model system in ecological and evolutionary research: Old World flycatchers of the genus Ficedula, for which we recently obtained a 1.1 Gb collared flycatcher genome assembly and identified 13 million single-nucleotide polymorphisms (SNPs) in population resequencing of this species and its sister species, pied flycatcher. Here, we developed a custom 50K Illumina iSelect flycatcher SNP array with markers covering 30 autosomes and the Z chromosome. Using a number of selection criteria for inclusion in the array, both genotyping success rate and polymorphism information content (mean marker heterozygosity = 0.41) were high. We used the array to assess linkage disequilibrium (LD) and hybridization in flycatchers. Linkage disequilibrium declined quickly to the background level at an average distance of 17 kb, but the extent of LD varied markedly within the genome and was more than 10-fold higher in ‘genomic islands’ of differentiation than in the rest of the genome. Genetic ancestry analysis identified 33 F1 hybrids but no later-generation hybrids from sympatric populations of collared flycatchers and pied flycatchers, contradicting earlier reports of backcrosses identified from a much smaller number of markers. With an estimated divergence time as recently as <1 Ma, this suggests strong selection against F1 hybrids and unusually rapid evolution of reproductive incompatibility in an avian system. PMID:24784959
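
    Linkage disequilibrium between two SNPs is commonly summarised as r^2, the squared correlation between alleles at the two loci. The sketch below computes it from phased haplotypes with alleles coded 0/1. It is an illustration only: the record does not state which LD measure or software the authors used, array genotypes are unphased in practice, and the haplotypes shown are hypothetical.

```python
def r_squared(haplotypes):
    """Pairwise LD (r^2) from phased haplotypes given as (allele_A, allele_B), coded 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical haplotypes: complete association vs. independent loci.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(linked), r_squared(unlinked))
```

    An LD decay curve such as the one summarised in this record is obtained by computing a statistic of this kind for many SNP pairs and averaging it within bins of physical distance.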

  2. Polymorphic integrations of an endogenous gammaretrovirus in the mule deer genome.

    PubMed

    Elleder, Daniel; Kim, Oekyung; Padhi, Abinash; Bankert, Jason G; Simeonov, Ivan; Schuster, Stephan C; Wittekindt, Nicola E; Motameny, Susanne; Poss, Mary

    2012-03-01

    Endogenous retroviruses constitute a significant genomic fraction in all mammalian species. Typically they are evolutionarily old and fixed in the host species population. Here we report on a novel endogenous gammaretrovirus (CrERVγ; for cervid endogenous gammaretrovirus) in the mule deer (Odocoileus hemionus) that is insertionally polymorphic among individuals from the same geographical location, suggesting that it has a more recent evolutionary origin. Using PCR-based methods, we identified seven CrERVγ proviruses and demonstrated that they show various levels of insertional polymorphism in mule deer individuals. One CrERVγ provirus was detected in all mule deer sampled but was absent from white-tailed deer, indicating that this virus originally integrated after the split of the two species, which occurred approximately one million years ago. There are, on average, 100 CrERVγ copies in the mule deer genome based on quantitative PCR analysis. A CrERVγ provirus was sequenced and contained intact open reading frames (ORFs) for three virus genes. Transcripts were identified covering the entire provirus. CrERVγ forms a distinct branch of the gammaretrovirus phylogeny, with the closest relatives of CrERVγ being endogenous gammaretroviruses from sheep and pig. We demonstrated that white-tailed deer (Odocoileus virginianus) and elk (Cervus canadensis) DNA contain proviruses that are closely related to mule deer CrERVγ in a conserved region of pol; more distantly related sequences can be identified in the genome of another member of the Cervidae, the muntjac (Muntiacus muntjak). The discovery of a novel transcriptionally active and insertionally polymorphic retrovirus in mammals could provide a useful model system to study the dynamic interaction between the host genome and an invading retrovirus.

  3. A selective sweep of >8 Mb on chromosome 26 in the Boxer genome.

    PubMed

    Quilez, Javier; Short, Andrea D; Martínez, Verónica; Kennedy, Lorna J; Ollier, William; Sanchez, Armand; Altet, Laura; Francino, Olga

    2011-07-01

    Modern dog breeds display traits that are either breed-specific or shared by a few breeds as a result of genetic bottlenecks during the breed creation process and artificial selection for breed standards. Selective sweeps in the genome result from strong selection and can be detected as a reduction or elimination of polymorphism in a given region of the genome. Extended regions of homozygosity, indicative of selective sweeps, were identified in a genome-wide scan dataset of 25 Boxers from the United Kingdom genotyped at ~20,000 single-nucleotide polymorphisms (SNPs). These regions were further examined in a second dataset of Boxers collected from a different geographical location and genotyped using higher density SNP arrays (~170,000 SNPs). A selective sweep previously associated with canine brachycephaly was detected on chromosome 1. A novel selective sweep of over 8 Mb was observed on chromosome 26 in Boxer and for a shorter region in English and French bulldogs. It was absent in 171 samples from eight other dog breeds and 7 Iberian wolf samples. A region of extended increased heterozygosity on chromosome 9 overlapped with a previously reported copy number variant (CNV) which was polymorphic in multiple dog breeds. A selective sweep of more than 8 Mb on chromosome 26 was identified in the Boxer genome. This sweep is likely caused by strong artificial selection for a trait of interest and could have inadvertently led to undesired health implications for this breed. Furthermore, we provide supporting evidence for two previously described regions: a selective sweep on chromosome 1 associated with canine brachycephaly and a CNV on chromosome 9 polymorphic in multiple dog breeds.
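
    The kind of homozygosity scan described in this record can be illustrated with a minimal sketch. It assumes one individual's genotypes, coded as two-character allele pairs and ordered along a single chromosome. The function name `longest_homozygous_run`, the positions and the genotypes are hypothetical and are not taken from the study, which compared many dogs and used its own thresholds.

```python
from typing import List, Tuple


def longest_homozygous_run(positions: List[int], genotypes: List[str]) -> Tuple[int, int, int]:
    """Return (start_bp, end_bp, n_snps) for the longest run of consecutive
    homozygous SNPs. Genotypes are strings such as "AA" or "AG".
    Returns (0, 0, 0) if no SNP is homozygous."""
    best = (0, 0, 0)
    run_start = None
    run_len = 0
    for pos, gt in zip(positions, genotypes):
        if gt[0] == gt[1]:
            # Homozygous call: open a new run or extend the current one.
            if run_start is None:
                run_start = pos
                run_len = 0
            run_len += 1
            if run_len > best[2]:
                best = (run_start, pos, run_len)
        else:
            # A heterozygous call ends the current run.
            run_start = None
            run_len = 0
    return best


# Hypothetical toy data: six SNPs on one chromosome.
toy_positions = [100, 200, 300, 400, 500, 600]
toy_genotypes = ["AA", "AG", "CC", "TT", "GG", "AC"]
print(longest_homozygous_run(toy_positions, toy_genotypes))
```

    A real scan would also tolerate occasional genotyping errors and would look for runs shared across individuals of the breed, which this sketch does not attempt.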

  4. Polymorphisms of the Tissue Inhibitor of Metalloproteinase 3 Gene Are Associated with Resistance to High-Altitude Pulmonary Edema (HAPE) in a Japanese Population: A Case Control Study Using Polymorphic Microsatellite Markers

    PubMed Central

    Kobayashi, Nobumitsu; Hanaoka, Masayuki; Droma, Yunden; Ito, Michiko; Katsuyama, Yoshihiko; Kubo, Keishi; Ota, Masao

    2013-01-01

    Introduction High-altitude pulmonary edema (HAPE) is a hypoxia-induced, life-threatening, high permeability type of edema attributable to pulmonary capillary stress failure. Genome-wide association analysis is necessary to better understand how genetics influence the outcome of HAPE. Materials and Methods DNA samples were collected from 53 subjects susceptible to HAPE (HAPE-s) and 67 elite Alpinists resistant to HAPE (HAPE-r). The genome scan was carried out using 400 polymorphic microsatellite markers throughout the whole genome in all subjects. In addition, six single nucleotide polymorphisms (SNPs) of the gene encoding the tissue inhibitor of metalloproteinase 3 (TIMP3) were genotyped by Taqman® SNP Genotyping Assays. Results The results were analyzed using case-control comparisons. Whole genome scanning revealed that allele frequencies in nine markers were statistically different between HAPE-s and HAPE-r subjects. The SNP genotyping of the TIMP3 gene revealed that the derived allele C of rs130293 was associated with resistance to HAPE [odds ratio (OR) = 0.21, P = 0.0012] and recessive inheritance of the phenotype of HAPE-s (P = 0.0012). A haplotype CAC carrying allele C of rs130293 was associated with resistance to HAPE. Discussion This genome-wide association study revealed several novel candidate genes associated with susceptibility or resistance to HAPE in a Japanese population. Among those, the minor allele C of rs130293 (C/T) in the TIMP3 gene was linked to resistance to HAPE, while the ancestral allele T was associated with susceptibility to HAPE. PMID:23991023

  5. Genomic profiling of plastid DNA variation in the Mediterranean olive tree

    PubMed Central

    2011-01-01

    Background Characterisation of plastid genome (or cpDNA) polymorphisms is commonly used for phylogeographic, population genetic and forensic analyses in plants, but detecting cpDNA variation is sometimes challenging, limiting the applications of such an approach. In the present study, we screened cpDNA polymorphism in the olive tree (Olea europaea L.) by sequencing the complete plastid genome of trees with a distinct cpDNA lineage. Our objective was to develop new markers for a rapid genomic profiling (by Multiplex PCRs) of cpDNA haplotypes in the Mediterranean olive tree. Results Eight complete cpDNA genomes of Olea were sequenced de novo. The nucleotide divergence between olive cpDNA lineages was low, not exceeding 0.07%. Based on these sequences, markers were developed for studying two single nucleotide substitutions and length polymorphism of 62 regions (with variable microsatellite motifs or other indels). They were then used to genotype the cpDNA variation in cultivated and wild Mediterranean olive trees (315 individuals). Forty polymorphic loci were detected on this sample, allowing the distinction of 22 haplotypes belonging to the three Mediterranean cpDNA lineages known as E1, E2 and E3. The discriminating power of cpDNA variation was particularly low for the cultivated olive tree with one predominating haplotype, but more diversity was detected in wild populations. Conclusions We propose a method for a rapid characterisation of the Mediterranean olive germplasm. The low variation in the cultivated olive tree indicated that the utility of cpDNA variation for forensic analyses is limited to rare haplotypes. In contrast, the high cpDNA variation in wild populations demonstrated that our markers may be useful for phylogeographic and population genetic studies in O. europaea. PMID:21569271

  6. The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation.

    PubMed

    Schwartz, John C; Gibson, Mark S; Heimeier, Dorothea; Koren, Sergey; Phillippy, Adam M; Bickhart, Derek M; Smith, Timothy P L; Medrano, Juan F; Hammond, John A

    2017-04-01

    Natural killer (NK) cells are a diverse population of lymphocytes with a range of biological roles including essential immune functions. NK cell diversity is in part created by the differential expression of cell surface receptors which modulate activation and function, including multiple subfamilies of C-type lectin receptors encoded within the NK complex (NKC). Little is known about the gene content of the NKC beyond rodent and primate lineages, other than it appears to be extremely variable between mammalian groups. We compared the NKC structure between mammalian species using new high-quality draft genome assemblies for cattle and goat; re-annotated sheep, pig, and horse genome assemblies; and the published human, rat, and mouse lemur NKC. The major NKC genes are largely in the equivalent positions in all eight species, with significant independent expansions and deletions between species, allowing us to propose a model for NKC evolution during mammalian radiation. The ruminant species, cattle and goats, have independently evolved a second KLRC locus flanked by KLRA and KLRJ, and a novel KLRH-like gene has acquired an activating tail. This novel gene has duplicated several times within cattle, while other activating receptor genes have been selectively disrupted. Targeted genome enrichment in cattle identified varying levels of allelic polymorphism between the NKC genes concentrated in the predicted extracellular ligand-binding domains. This novel recombination and allelic polymorphism is consistent with NKC evolution under balancing selection, suggesting that this diversity influences individual immune responses and may impact on differential outcomes of pathogen infection and vaccination.

  7. Intragenomic polymorphisms among high-copy loci: a genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae)

    PubMed Central

    Straub, Shannon C.K.; Fishbein, Mark; Liston, Aaron

    2015-01-01

    Despite knowledge that concerted evolution of high-copy loci is often imperfect, studies that investigate the extent of intragenomic polymorphisms and compare them across a large number of species are rare. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual’s consensus and at each position reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the “noncoding” ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than in any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming). PMID:25653903
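
    The tallying step mentioned in this record (counting, at each position, the reads that differ from the individual's consensus) might look roughly like the sketch below. It is written in Python rather than Perl and is not the authors' script. It assumes ungapped alignments given as (0-based start, read sequence) pairs; the name `tally_nonconsensus` and the toy reads are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tally_nonconsensus(consensus: str, reads: List[Tuple[int, str]]) -> Dict[int, float]:
    """Return, for each position with at least one mismatching read, the
    fraction of covering reads whose base differs from the consensus."""
    depth = Counter()  # reads covering each position
    diff = Counter()   # reads disagreeing with the consensus at each position
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(consensus):
                break  # ignore read bases hanging past the consensus end
            depth[pos] += 1
            if base != consensus[pos]:
                diff[pos] += 1
    return {pos: diff[pos] / depth[pos] for pos in sorted(diff)}


# Hypothetical toy alignment: one read carries an A where the consensus has T.
toy_consensus = "ACGTACGT"
toy_reads = [(0, "ACGT"), (2, "GTAC"), (2, "GAAC"), (4, "ACGT")]
print(tally_nonconsensus(toy_consensus, toy_reads))
```

    In practice the input would come from a read-mapping file and would need handling of indels and base qualities, which this sketch leaves out.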

  8. Analysis of Genetic Diversity and Population Structure of Rice Germplasm from North-Eastern Region of India and Development of a Core Germplasm Set

    PubMed Central

    Singh, Amit Kumar; Kumar, Sundeep; Srinivasan, Kalyani; Tyagi, R. K.; Ahmad, Altaf; Singh, N. K.; Singh, Rakesh

    2014-01-01

    The North-Eastern region (NER) of India, comprising Arunachal Pradesh, Assam, Manipur, Meghalaya, Mizoram, Nagaland and Tripura, is a hot spot for genetic diversity and the most probable origin of rice. North-east rice collections are known to possess various agronomically important traits like biotic and abiotic stress tolerance, unique grain and cooking quality. The genetic diversity and associated population structure of 6,984 rice accessions, originating from NER, were assessed using 36 genome wide unlinked single nucleotide polymorphism (SNP) markers distributed across the 12 rice chromosomes. All of the 36 SNP loci were polymorphic and bi-allelic, contained five types of base substitutions and together produced nine types of alleles. The polymorphic information content (PIC) ranged from 0.004 for Tripura to 0.375 for Manipur and major allele frequency ranged from 0.50 for Assam to 0.99 for Tripura. Heterozygosity ranged from 0.002 in Nagaland to 0.42 in Mizoram and gene diversity ranged from 0.006 in Arunachal Pradesh to 0.50 in Manipur. The genetic relatedness among the rice accessions was evaluated using an unrooted phylogenetic tree analysis, which grouped all accessions into three major clusters. For determining population structure, populations K = 1 to K = 20 were tested and population K = 3 was present in all the states, with the exception of Meghalaya and Manipur, where K = 5 and K = 4 populations were present, respectively. Principal Coordinate Analysis (PCoA) showed that accessions were distributed according to their population structure. AMOVA analysis showed that maximum diversity was partitioned at the individual accession level (73% for Nagaland, 58% for Arunachal Pradesh and 57% for Tripura). Using POWERCORE software, a core set of 701 accessions was obtained, which accounted for approximately 10% of the total NE India collections, representing 99.9% of the allelic diversity. The rice core set developed will be a valuable resource for future genomic studies and crop improvement strategies. PMID:25412256
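
    The upper bounds reported in this record (PIC 0.375, gene diversity 0.50, major allele frequency 0.50) are what a bi-allelic locus gives when both alleles are equally frequent. The sketch below shows this, assuming the widely used Botstein-style definition of PIC and gene diversity as expected heterozygosity; the abstract does not state which formulas were used, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def gene_diversity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs: Sequence[float]) -> float:
    """Polymorphism information content (assumed Botstein-style definition):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


# A bi-allelic SNP with equally frequent alleles gives the theoretical maxima.
balanced = [0.5, 0.5]
print(gene_diversity(balanced), pic(balanced))
```

    For a bi-allelic marker neither statistic can exceed these values, so a reported PIC of 0.375 indicates loci at or near equal allele frequencies.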

  9. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

    PubMed

    Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

    2015-08-29

    The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Genetic Structures of Copy Number Variants Revealed by Genotyping Single Sperm

    PubMed Central

    Luo, Minjie; Cui, Xiangfeng; Fredman, David; Brookes, Anthony J.; Azaro, Marco A.; Greenawalt, Danielle M.; Hu, Guohong; Wang, Hui-Yun; Tereshchenko, Irina V.; Lin, Yong; Shentu, Yue; Gao, Richeng; Shen, Li; Li, Honghua

    2009-01-01

    Background Copy number variants (CNVs) occupy a significant portion of the human genome and may have important roles in meiotic recombination, human genome evolution and gene expression. Many genetic diseases may be underlain by CNVs. However, because of the presence of their multiple copies, variability in copy numbers and the diploidy of the human genome, detailed genetic structure of CNVs cannot be readily studied by available techniques. Methodology/Principal Findings Single sperm samples were used as the primary subjects for the study so that CNV haplotypes in the sperm donors could be studied individually. Forty-eight CNVs characterized in a previous study were analyzed using a microarray-based high-throughput genotyping method after multiplex amplification. Seventeen single nucleotide polymorphisms (SNPs) were also included as controls. Two single-base variants, either allelic or paralogous, could be discriminated for all markers. Microarray data were used to resolve SNP alleles and CNV haplotypes, to quantitatively assess the numbers and compositions of the paralogous segments in each CNV haplotype. Conclusions/Significance This is the first study of the genetic structure of CNVs on a large scale. Resulting information may help understand evolution of the human genome, gain insight into many genetic processes, and discriminate between CNVs and SNPs. The highly sensitive high-throughput experimental system with haploid sperm samples as subjects may be used to facilitate detailed large-scale CNV analysis. PMID:19384415

  11. Nucleotide Substitution in 3' Arm of Bovine MIR-2467 in Five Cattle Breeds.

    PubMed

    Łukaszewicz, Aneta; Basiak, Szymon; Proskura, Witold Stanisław; Dybus, Andrzej

    2015-01-01

    The T > C single nucleotide polymorphism (SNP) in the MIR2467 gene was investigated in order to confirm its presence in the cattle genome and to check for possible differences in its genotype distribution among different breeds. An additional purpose of the study was to investigate in silico the potential effect of that substitution on the structure and stability of precursor mir-2467. The study involved 634 individuals of five cattle breeds: Angus, Hereford, Holstein-Friesian, Jersey, and Limousin, which were genotyped using a PCR-RFLP assay. In this study, the presence of the T > C polymorphism at position 24 was observed in all the cattle breeds except Hereford. In addition, differences in the genotype distribution among the analyzed breeds were indicated. On the basis of minimum free energy structure prediction, the C allele was indicated to have a possible impact on decreasing the stability of the pre-mir-2467, thus altering its ability to regulate target gene expression.

  12. Genetic diversity and structure of Capparis spinosa L. in Iran as revealed by ISSR markers.

    PubMed

    Ahmadi, Maryam; Saeidi, Hojjatollah

    2018-05-01

    Capparis spinosa L. (caper bush) is an economically and ecologically important perennial shrub that grows across different regions of Iran. In this study, the genetic diversity and population structure of the Iranian gene pool of C. spinosa were evaluated using Inter Simple Sequence Repeat (ISSR) markers. Using 10 ISSR primers, 387 DNA fragments (bands) were amplified from the genomic DNA of 92 individuals belonging to twenty-one populations of C. spinosa, of which 378 (97.7%) were polymorphic. A high level of genetic diversity (percentage of polymorphic loci = 98.2%, h = 0.1382, I = 0.243), high genetic differentiation (Gst = 0.5234) and low gene flow (Nm = 0.4553) among populations were observed. Caper bush populations were divided into 4 groups in the dendrogram, PCoA plot and Bayesian clustering results, which mostly corresponded to their geographic regions. The results showed that there is value in sampling Iranian caper bush populations to look for valuable alleles for use in plant breeding programs.
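
    The gene flow value in this record is consistent with the commonly used estimate Nm = 0.5(1 - Gst)/Gst. The abstract does not name the formula, so its use here is an assumption, and the function name `gene_flow_from_gst` is hypothetical. A minimal sketch:

```python
def gene_flow_from_gst(gst: float) -> float:
    """Estimate gene flow Nm from Gst using Nm = 0.5 * (1 - Gst) / Gst.
    The choice of this estimator is an assumption, not stated in the record."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("Gst must be in the interval (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# Gst as reported in the record above.
nm = gene_flow_from_gst(0.5234)
print(round(nm, 4))
```

    Under this estimator, values of Nm below 1 are usually read as limited gene flow among populations, which matches the record's description.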

  13. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes

    USDA-ARS's Scientific Manuscript database

    In this Genomics Era, vast amounts of next generation sequencing data have become publicly-available for multiple genomes across hundreds of species. Analysis of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset...

  14. Maize HapMap2 identifies extant variation from a genome in flux

    USDA-ARS's Scientific Manuscript database

    The maize genome is the largest, most diverse and complex plant genome sequenced to date. Using high-throughput sequencing to access genetic variation and a population genetics model to score the polymorphisms, we characterize and unite the diversity of the world’s key breeding germplasm, wild rela...

  15. Genetic associations with lipoprotein subfraction measures differ by ethnicity in the multi-ethnic study of atherosclerosis (MESA)

    USDA-ARS's Scientific Manuscript database

    A recent genome-wide association study associated 62 single nucleotide polymorphisms (SNPs) from 43 genomic loci, with fasting lipoprotein subfractions in European–Americans (EAs) at genome-wide levels of significance across three independent samples. Whether these associations are consistent across...

  16. Structure, inheritance, and expression of hybrid poplar (Populus trichocarpa x Populus deltoides) phenylalanine ammonia-lyase genes.

    PubMed Central

    Subramaniam, R; Reinold, S; Molitor, E K; Douglas, C J

    1993-01-01

    A heterologous probe encoding phenylalanine ammonia-lyase (PAL) was used to identify PAL clones in cDNA libraries made with RNA from young leaf tissue of two Populus deltoides x P. trichocarpa F1 hybrid clones. Sequence analysis of a 2.4-kb cDNA confirmed its identity as a full-length PAL clone. The predicted amino acid sequence is conserved in comparison with that of PAL genes from several other plants. Southern blot analysis of poplar genomic DNA from parental and hybrid individuals, restriction site polymorphism in PAL cDNA clones, and sequence heterogeneity in the 3' ends of several cDNA clones suggested that PAL is encoded by at least two genes that can be distinguished by HindIII restriction site polymorphisms. Clones containing each type of PAL gene were isolated from a poplar genomic library. Analysis of the segregation of PAL-specific HindIII restriction fragment-length polymorphisms demonstrated the existence of two independently segregating PAL loci, one of which was mapped to a linkage group of the poplar genetic map. Developmentally regulated PAL expression in poplar was analyzed using RNA blots. Highest expression was observed in young stems, apical buds, and young leaves. Expression was lower in older stems and undetectable in mature leaves. Cellular localization of PAL expression by in situ hybridization showed very high levels of expression in subepidermal cells of leaves early during leaf development. In stems and petioles, expression was associated with subepidermal cells and vascular tissues. PMID:8108506

  17. Detecting the Population Structure and Scanning for Signatures of Selection in Horses (Equus caballus) From Whole-Genome Sequencing Data

    PubMed Central

    Zhang, Cheng; Ni, Pan; Ahmad, Hafiz Ishfaq; Gemingguli, M; Baizilaitibei, A; Gulibaheti, D; Fang, Yaping; Wang, Haiyang; Asif, Akhtar Rasool; Xiao, Changyi; Chen, Jianhai; Ma, Yunlong; Liu, Xiangdong; Du, Xiaoyong; Zhao, Shuhong

    2018-01-01

    Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.

  18. Wheat CBF gene family: identification of polymorphisms in the CBF coding sequence.

    PubMed

    Mohseni, Sara; Che, Hua; Djillali, Zakia; Dumont, Estelle; Nankeu, Joseph; Danyluk, Jean

    2012-12-01

    Expression of cold-regulated genes needed for protection against freezing stress is mediated, in part, by the CBF transcription factor family. Previous studies with temperate cereals suggested that the CBF gene family in wheat was large, and that CBF genes were at the base of an important low temperature tolerance trait. Therefore, the goal of our study was to identify the CBF repertoire in the freezing-tolerant hexaploid wheat cultivar Norstar, and then to examine if the coding region of CBF genes in two spring cultivars contain polymorphisms that could affect the protein sequence and structure. Our analyses reveal that hexaploid wheat contains a complex CBF family consisting of at least 65 CBF genes of which 60 are known to be expressed in the cultivar Norstar. They represent 27 paralogous genes with 1-3 homeologous copies for the A, B, and D genomes. The cultivar Norstar contains two pseudogenes and at least 24 additional proteins having sequences and (or) structures that deviate from the consensus in the conserved AP2 DNA-binding and (or) C-terminal activation-domains. This suggests that in cultivars such as Norstar, low temperature tolerance may be increased through breeding of additional optimal alleles. The examination of the CBF repertoire present in the two spring cultivars, Chinese Spring and Manitou, reveals that they have additional polymorphisms affecting conserved positions in these domains. Understanding the effects of these polymorphisms will provide additional information for the selection of optimum CBF alleles in Triticeae breeding programs.

  19. Development of Cymbidium ensifolium genic-SSR markers and their utility in genetic diversity and population structure analysis in cymbidiums.

    PubMed

    Li, Xiaobai; Jin, Feng; Jin, Liang; Jackson, Aaron; Huang, Cheng; Li, Kehu; Shu, Xiaoli

    2014-12-05

    Cymbidium is a genus of 68 species in the orchid family, with extremely high ornamental value. Marker-assisted selection has proven to be an effective strategy in accelerating plant breeding for many plant species. Analysis of the genetic background of cymbidiums by molecular markers can be of great value in assisting parental selection and breeding strategy design; however, in plants such as cymbidiums limited genomic resources exist. In order to obtain efficient markers, we deep sequenced the C. ensifolium transcriptome to identify simple sequence repeats derived from gene regions (genic-SSR). A total of 7,936 genic-SSR markers were identified. A total of 80 genic-SSRs were selected, and primers were designed according to their flanking sequences. Of the 80 genic-SSR primer sets, 62 were amplified in C. ensifolium successfully, and 55 showed polymorphism when cross-tested among 9 Cymbidium species comprising 59 accessions. Unigenes containing the 62 genic-SSRs were searched against Non-redundant (Nr), Gene Ontology database (GO), eukaryotic orthologous groups (KOGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The search resulted in 53 matching Nr sequences, of which 39 had GO terms, 18 were assigned to KOGs, and 15 were annotated with KEGG. Genetic diversity and population structure were analyzed based on 55 polymorphic genic-SSR data among 59 accessions. The genetic distance averaged 0.3911, ranging from 0.016 to 0.618. The polymorphic index content (PIC) of 55 polymorphic markers averaged 0.407, ranging from 0.033 to 0.863. A model-based clustering analysis revealed that five genetic groups existed in the collection. Accessions from the same species were typically grouped together; however, C. goeringii accessions did not always form a separate cluster, suggesting that C. goeringii accessions were polyphyletic. The genic-SSRs identified in this study constitute a set of markers that can be applied across multiple Cymbidium species and used for the evaluation of genetic relationships as well as qualitative and quantitative trait mapping studies. Genic-SSRs coupled with the functional annotations provided by the unigenes will aid in mapping candidate genes of specific function.

  20. Genetic mutation underlying orthostatic intolerance and diagnostic and therapeutic methods relating thereto

    NASA Technical Reports Server (NTRS)

    Blakely, Randy D. (Inventor); Robertson, David (Inventor)

    2006-01-01

    Isolated polynucleotide molecules and peptides encoded by these molecules are used in the analysis of human norepinephrine (NE) transporter variants, as well as in diagnostic and therapeutic applications, relating to a human NE transporter polymorphism. By analyzing genomic DNA or amplified genomic DNA, or amplified cDNA derived from mRNA, it is possible to type a human NE transporter with regard to the human NE transporter polymorphism, for example, in the context of diagnosing and treating NE transport impairments, and disorders associated with NE transport impairments, such as orthostatic intolerance.

  1. Genetic diversity and population structure of the endangered marsupial Sarcophilus harrisii (Tasmanian devil)

    PubMed Central

    Miller, Webb; Hayes, Vanessa M.; Ratan, Aakrosh; Petersen, Desiree C.; Wittekindt, Nicola E.; Miller, Jason; Walenz, Brian; Knight, James; Qi, Ji; Zhao, Fangqing; Wang, Qingyu; Bedoya-Reina, Oscar C.; Katiyar, Neerja; Tomsho, Lynn P.; Kasson, Lindsay McClellan; Hardie, Rae-Anne; Woodbridge, Paula; Tindall, Elizabeth A.; Bertelsen, Mads Frost; Dixon, Dale; Pyecroft, Stephen; Helgen, Kristofer M.; Lesk, Arthur M.; Pringle, Thomas H.; Patterson, Nick; Zhang, Yu; Kreiss, Alexandre; Woods, Gregory M.; Jones, Menna E.; Schuster, Stephan C.

    2011-01-01

    The Tasmanian devil (Sarcophilus harrisii) is threatened with extinction because of a contagious cancer known as Devil Facial Tumor Disease. The inability to mount an immune response and to reject these tumors might be caused by a lack of genetic diversity within a dwindling population. Here we report a whole-genome analysis of two animals originating from extreme northwest and southeast Tasmania, the maximal geographic spread, together with the genome from a tumor taken from one of them. A 3.3-Gb de novo assembly of the sequence data from two complementary next-generation sequencing platforms was used to identify 1 million polymorphic genomic positions, roughly one-quarter of the number observed between two genetically distant human genomes. Analysis of 14 complete mitochondrial genomes from current and museum specimens, as well as mitochondrial and nuclear SNP markers in 175 animals, suggests that the observed low genetic diversity in today's population preceded the Devil Facial Tumor Disease outbreak by at least 100 y. Using a genetically characterized breeding stock based on the genome sequence will enable preservation of the extant genetic diversity in future Tasmanian devil populations. PMID:21709235

  2. [Acute lymphoblastic leukemia: a genomic perspective].

    PubMed

    Jiménez-Morales, Silvia; Hidalgo-Miranda, Alfredo; Ramírez-Bello, Julián

    In parallel to the human genome sequencing project, several technological platforms have been developed that let us gain insight into the genome structure of human entities, as well as evaluate their usefulness in the clinical approach of the patient. Thus, in acute lymphoblastic leukemia (ALL), the most common pediatric malignancy, genomic tools promise to be useful to detect patients at high risk of relapse, either at diagnosis or during treatment (minimal residual disease), and they also increase the possibility to identify cases at risk of adverse reactions to chemotherapy. Therefore, the physician could offer patient-tailored therapeutic schemes. A clear example of the useful genomic tools is the identification of single nucleotide polymorphisms (SNPs) in the thiopurine methyl transferase (TPMT) gene, where the presence of two null alleles (homozygous or compound heterozygous) indicates the need to reduce the dose of mercaptopurine by up to 90% to avoid toxic effects which could lead to the death of the patient. In this review, we provide an overview of the genomic perspective of ALL, describing some strategies that contribute to the identification of biomarkers with potential clinical application. Copyright © 2017 Hospital Infantil de México Federico Gómez. Published by Masson Doyma México S.A. All rights reserved.

  3. The Complete Chloroplast Genome of Banana (Musa acuminata, Zingiberales): Insight into Plastid Monocotyledon Evolution

    PubMed Central

    Martin, Guillaume; Baurens, Franc-Christophe; Cardi, Céline; Aury, Jean-Marc; D’Hont, Angélique

    2013-01-01

    Background: Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. Methodology/Principal Findings: The Musa acuminata chloroplast genome was assembled with chloroplast reads extracted from whole-genome-shotgun sequence data. The Musa chloroplast genome is a circular molecule of 169,972 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC, 88,338 bp) and a Small Single Copy region (SSC, 10,768 bp) separated by Inverted Repeat regions (IRs, 35,433 bp). Two forms of the chloroplast genome relative to the orientation of SSC versus LSC were found. The Musa chloroplast genome shows an extreme IR expansion at the IR/SSC boundary relative to the most common structures found in angiosperms. This expansion consists of the integration of three additional complete genes (rps15, ndhH and ycf1) and part of the ndhA gene. No such expansion has been observed in monocots so far. Simple Sequence Repeats were identified in the Musa chloroplast genome and a new set of Musa chloroplastic markers was designed. Conclusion: The complete sequence of M. acuminata ssp malaccensis chloroplast we reported here is the first one for the Zingiberales order. As such it provides new insight in the evolution of the chloroplast of monocotyledons. In particular, it reinforces that IR/SSC expansion has occurred independently several times within monocotyledons. The discovery of new polymorphic markers within Musa chloroplast opens new perspectives to better understand the origin of cultivated triploid bananas. PMID:23840670
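
    The region lengths quoted in this abstract can be checked against the reported genome size. The sketch below assumes that the figure given for the Inverted Repeat regions (35,433 bp) is the length of each of the two IR copies, which the abstract does not state explicitly; the function name is illustrative only.

```python
# Region lengths in base pairs, as quoted in the abstract above.
LSC = 88_338   # Large Single Copy region
SSC = 10_768   # Small Single Copy region
IR = 35_433    # assumed to be the length of ONE Inverted Repeat copy
REPORTED_TOTAL = 169_972


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length of a circular plastid genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)  # 169972 True
```

    Under that assumption the four regions sum exactly to the reported 169,972 bp.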

  4. The complete chloroplast genome of banana (Musa acuminata, Zingiberales): insight into plastid monocotyledon evolution.

    PubMed

    Martin, Guillaume; Baurens, Franc-Christophe; Cardi, Céline; Aury, Jean-Marc; D'Hont, Angélique

    2013-01-01

    Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. The Musa acuminata chloroplast genome was assembled with chloroplast reads extracted from whole-genome-shotgun sequence data. The Musa chloroplast genome is a circular molecule of 169,972 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC, 88,338 bp) and a Small Single Copy region (SSC, 10,768 bp) separated by Inverted Repeat regions (IRs, 35,433 bp). Two forms of the chloroplast genome relative to the orientation of SSC versus LSC were found. The Musa chloroplast genome shows an extreme IR expansion at the IR/SSC boundary relative to the most common structures found in angiosperms. This expansion consists of the integration of three additional complete genes (rps15, ndhH and ycf1) and part of the ndhA gene. No such expansion has been observed in monocots so far. Simple Sequence Repeats were identified in the Musa chloroplast genome and a new set of Musa chloroplastic markers was designed. The complete sequence of M. acuminata ssp malaccensis chloroplast we reported here is the first one for the Zingiberales order. As such it provides new insight in the evolution of the chloroplast of monocotyledons. In particular, it reinforces that IR/SSC expansion has occurred independently several times within monocotyledons. The discovery of new polymorphic markers within Musa chloroplast opens new perspectives to better understand the origin of cultivated triploid bananas.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tuskan, Gerald A; Tschaplinski, Timothy J; Chen, Jay

    Genetic determination of gender is a fundamental developmental and evolutionary process in plants. Although it appears that dioecy in Populus is partially genetically controlled, the precise gender-determining systems remain unclear. The recently-released second draft assembly and annotated gene set of the Populus genome provided an opportunity to re-visit this topic. We hypothesized that over evolutionary time, selective pressure has reformed the genome structure and gene composition in the peritelomeric region of the chromosome XIX which has resulted in a distinctive genome structure and cluster of genes contributing to gender determination in Populus. Multiple lines of evidence support this working hypothesis. First, the peritelomeric region of the chromosome XIX contains significantly fewer single nucleotide polymorphisms than the rest of Populus genome and has a distinct evolutionary history. Second, the peritelomeric end of chromosome XIX contains the largest cluster of the nucleotide-binding site-leucine-rich repeat (NBS-LRR) class of disease resistance genes in the entire Populus genome. Third, there is a high occurrence of small microRNAs on chromosome XIX coincident to the region containing the putative gender-determining locus and the major cluster of NBS-LRR genes. Further, by analyzing the metabolomic profiles of floral bud in male and female Populus trees using gas chromatography-mass spectrometry, we found there are gender-specific accumulations of phenolic glycosides. Taken together, these findings provide new insights into the genetic control of gender determination in Populus.

  6. Population structure of eleven Spanish ovine breeds and detection of selective sweeps with BayeScan and hapFLK

    PubMed Central

    Manunza, A.; Cardoso, T. F.; Noce, A.; Martínez, A.; Pons, A.; Bermejo, L. A.; Landi, V.; Sànchez, A.; Jordana, J.; Delgado, J. V.; Adán, S.; Capote, J.; Vidal, O.; Ugarte, E.; Arranz, J. J.; Calvo, J. H.; Casellas, J.; Amills, M.

    2016-01-01

    The goals of the current work were to analyse the population structure of 11 Spanish ovine breeds and to detect genomic regions that may have been targeted by selection. A total of 141 individuals were genotyped with the Infinium 50 K Ovine SNP BeadChip (Illumina). We combined this dataset with Spanish ovine data previously reported by the International Sheep Genomics Consortium (N = 229). Multidimensional scaling and Admixture analyses revealed that Canaria de Pelo and, to a lesser extent, Roja Mallorquina, Latxa and Churra are clearly differentiated populations, while the remaining seven breeds (Ojalada, Castellana, Gallega, Xisqueta, Ripollesa, Rasa Aragonesa and Segureña) share a similar genetic background. A genome scan with BayeScan and hapFLK allowed us to identify three genomic regions that are consistently detected with both methods, i.e. Oar3 (150–154 Mb), Oar6 (4–49 Mb) and Oar13 (68–74 Mb). Neighbor-joining trees based on polymorphisms mapping to these three selective sweeps did not show a clustering of breeds according to their predominant productive specialization (except the local tree based on Oar13 SNPs). Such cryptic signatures of selection have also been found in the bovine genome, posing a considerable challenge to understanding the biological consequences of artificial selection. PMID:27272025

  7. Population structure of eleven Spanish ovine breeds and detection of selective sweeps with BayeScan and hapFLK.

    PubMed

    Manunza, A; Cardoso, T F; Noce, A; Martínez, A; Pons, A; Bermejo, L A; Landi, V; Sànchez, A; Jordana, J; Delgado, J V; Adán, S; Capote, J; Vidal, O; Ugarte, E; Arranz, J J; Calvo, J H; Casellas, J; Amills, M

    2016-06-07

    The goals of the current work were to analyse the population structure of 11 Spanish ovine breeds and to detect genomic regions that may have been targeted by selection. A total of 141 individuals were genotyped with the Infinium 50 K Ovine SNP BeadChip (Illumina). We combined this dataset with Spanish ovine data previously reported by the International Sheep Genomics Consortium (N = 229). Multidimensional scaling and Admixture analyses revealed that Canaria de Pelo and, to a lesser extent, Roja Mallorquina, Latxa and Churra are clearly differentiated populations, while the remaining seven breeds (Ojalada, Castellana, Gallega, Xisqueta, Ripollesa, Rasa Aragonesa and Segureña) share a similar genetic background. A genome scan with BayeScan and hapFLK allowed us to identify three genomic regions that are consistently detected with both methods, i.e. Oar3 (150-154 Mb), Oar6 (4-49 Mb) and Oar13 (68-74 Mb). Neighbor-joining trees based on polymorphisms mapping to these three selective sweeps did not show a clustering of breeds according to their predominant productive specialization (except the local tree based on Oar13 SNPs). Such cryptic signatures of selection have also been found in the bovine genome, posing a considerable challenge to understanding the biological consequences of artificial selection.

  8. Complete mitochondrial genome of Concholepas concholepas inferred by 454 pyrosequencing and mtDNA expression in two mollusc populations.

    PubMed

    Núñez-Acuña, Gustavo; Aguilar-Espinoza, Andrea; Gallardo-Escárate, Cristian

    2013-03-01

    Despite the great relevance of mitochondrial genome analysis in evolutionary studies, there is scarce information on how the transcripts associated with the mitogenome are expressed and their role in the genetic structuring of populations. This work reports the complete mitochondrial genome of the marine gastropod Concholepas concholepas, obtained by 454 pyrosequencing, and an analysis of mitochondrial transcripts of two populations 1000 km apart along the Chilean coast. The mitochondrion of C. concholepas is 15,495 base pairs (bp) in size and contains the 37 subunits characteristic of metazoans, as well as a non-coding region of 330 bp. In silico analysis of mitochondrial gene variability showed significant differences among populations. In terms of levels of relative abundance of transcripts associated with mitochondrion in the two populations (assessed by qPCR), the genes associated with complexes III and IV of the mitochondrial genome had the highest levels of expression in the northern population while transcripts associated with the ATP synthase complex had the highest levels of expression in the southern population. Moreover, fifteen polymorphic SNPs were identified in silico between the mitogenomes of the two populations. Four of these markers implied different amino acid substitutions (non-synonymous SNPs). This work contributes novel information regarding the mitochondrial genome structure and mRNA expression levels of C. concholepas. Copyright © 2012 Elsevier Inc. All rights reserved.

  9. Structural polymorphism at LCR and its role in beta-globin gene regulation.

    PubMed

    Kukreti, Shrikant; Kaur, Harpreet; Kaushik, Mahima; Bansal, Aparna; Saxena, Sarika; Kaushik, Shikha; Kukreti, Ritushree

    2010-09-01

    Information on the secondary structures and conformational manifestations of eukaryotic DNA and their biological significance with reference to gene regulation and expression is limited. The human beta-globin gene Locus Control Region (LCR), a dominant regulator of globin gene expression, is a contiguous piece of DNA with five tissue-specific DNase I-hypersensitive sites (HSs). Since these HSs have a high density of transcription factor binding sites, structural interdependencies between HSs and different promoters may directly or indirectly regulate LCR functions. Mutations and SNPs may stabilize or destabilize the local secondary structures, affecting gene expression through changes in the protein-DNA recognition patterns. Various palindromic or quasi-palindromic segments within the LCR could cause structural polymorphism and geometrical switching of DNA. This emphasizes the importance of understanding the sequence-dependent variations of the DNA structure. Such structural motifs might act as regulatory elements. The local conformational variability of a DNA segment or the action of a DNA-specific protein is key to creating and maintaining active chromatin domains and affecting transcription of various tissue-specific beta-globin genes. We summarize here the current status of beta-globin LCR structure and function. Further structural studies at the molecular level and functional genomics might solve the regulatory puzzles that control the beta-globin gene locus. Copyright (c) 2010 Elsevier Masson SAS. All rights reserved.

  10. Diversity in 113 cowpea [Vigna unguiculata (L) Walp] accessions assessed with 458 SNP markers.

    PubMed

    Egbadzor, Kenneth F; Ofori, Kwadwo; Yeboah, Martin; Aboagye, Lawrence M; Opoku-Agyeman, Michael O; Danquah, Eric Y; Offei, Samuel K

    2014-01-01

    Single Nucleotide Polymorphism (SNP) markers were used in characterization of 113 cowpea accessions comprising 108 from Ghana and 5 from abroad. Leaf tissues from plants cultivated at the University of Ghana were genotyped at KBioscience in the United Kingdom. Data was generated for 477 SNPs, out of which 458 revealed polymorphism. The results were used to analyze genetic dissimilarity among the accessions using Darwin 5 software. The markers discriminated among all of the cowpea accessions, and the dissimilarity values, which ranged from 0.006 to 0.63, were used for a factorial plot. Unexpectedly high levels of heterozygosity were observed in some of the accessions. Accessions known to be closely related clustered together in a dendrogram drawn with the WPGMA method. A maximum length sub-tree comprising 48 core accessions was constructed. The software package structure was used to separate accessions into three groups, and the programme correctly identified varieties that were known hybrids. The hybrids were those accessions with numerous heterozygous loci. The structure plot showed closely related accessions with similar genome patterns. The SNP markers were more efficient in discriminating among the cowpea germplasm than morphological, seed protein polymorphism and simple sequence repeat studies reported earlier on the same collection.

  11. Genomic diversity and introgression in O. sativa reveal the impact of domestication and breeding on the rice genome.

    PubMed

    Zhao, Keyan; Wright, Mark; Kimball, Jennifer; Eizenga, Georgia; McClung, Anna; Kovach, Michael; Tyagi, Wricha; Ali, Md Liakat; Tung, Chih-Wei; Reynolds, Andy; Bustamante, Carlos D; McCouch, Susan R

    2010-05-24

    The domestication of Asian rice (Oryza sativa) was a complex process punctuated by episodes of introgressive hybridization among and between subpopulations. Deep genetic divergence between the two main varietal groups (Indica and Japonica) suggests domestication from at least two distinct wild populations. However, genetic uniformity surrounding key domestication genes across divergent subpopulations suggests cultural exchange of genetic material among ancient farmers. In this study, we utilize a novel 1,536 SNP panel genotyped across 395 diverse accessions of O. sativa to study genome-wide patterns of polymorphism, to characterize population structure, and to infer the introgression history of domesticated Asian rice. Our population structure analyses support the existence of five major subpopulations (indica, aus, tropical japonica, temperate japonica and GroupV) consistent with previous analyses. Our introgression analysis shows that most accessions exhibit some degree of admixture, with many individuals within a population sharing the same introgressed segment due to artificial selection. Admixture mapping and association analysis of amylose content and grain length illustrate the potential for dissecting the genetic basis of complex traits in domesticated plant populations. Genes in these regions control a myriad of traits including plant stature, blast resistance, and amylose content. These analyses highlight the power of population genomics in agricultural systems to identify functionally important regions of the genome and to decipher the role of human-directed breeding in refashioning the genomes of a domesticated species.

  12. [Genome similarity of Baikal omul and sig].

    PubMed

    Bychenko, O S; Sukhanova, L V; Ukolova, S S; Skvortsov, T A; Potapov, V K; Azhikina, T L; Sverdlov, E D

    2009-01-01

    Two members of the Baikal sig family, a lake sig (Coregonus lavaretus baicalensis Dybovsky) and omul (C. autumnalis migratorius Georgi), are close relatives that diverged from the same ancestor 10-20 thousand years ago. In this work, we studied genomic polymorphism of these two fish species. The method of subtraction hybridization (SH) did not reveal the presence of extended sequences in the sig genome and their absence in the omul genome. All the fragments found by SH corresponded to polymorphous noncoding genome regions varying in mononucleotide substitutions and short deletions. Many of them are mapped close to genes of the immune system and have regions identical to the Tc-1-like transposons abundant among fish, whose transcription activity may affect the expression of adjacent genes. Thus, we showed for the first time that genetic differences between Baikal sig family members are extremely small and cannot be revealed by the SH method. This is another endorsement of the hypothesis on the close relationship between Baikal sig and omul and their evolutionarily recent divergence from a common ancestor.

  13. Genome Structural Diversity among 31 Bordetella pertussis Isolates from Two Recent U.S. Whooping Cough Statewide Epidemics.

    PubMed

    Bowden, Katherine E; Weigand, Michael R; Peng, Yanhui; Cassiday, Pamela K; Sammons, Scott; Knipe, Kristen; Rowe, Lori A; Loparev, Vladimir; Sheth, Mili; Weening, Keeley; Tondella, M Lucia; Williams, Margaret M

    2016-01-01

    During 2010 and 2012, California and Vermont, respectively, experienced statewide epidemics of pertussis with differences seen in the demographic affected, case clinical presentation, and molecular epidemiology of the circulating strains. To overcome limitations of the current molecular typing methods for pertussis, we utilized whole-genome sequencing to gain a broader understanding of how current circulating strains are causing large epidemics. Through the use of combined next-generation sequencing technologies, this study compared de novo, single-contig genome assemblies from 31 out of 33 Bordetella pertussis isolates collected during two separate pertussis statewide epidemics and 2 resequenced vaccine strains. Final genome architecture assemblies were verified with whole-genome optical mapping. Sixteen distinct genome rearrangement profiles were observed in epidemic isolate genomes, all of which were distinct from the genome structures of the two resequenced vaccine strains. These rearrangements appear to be mediated by repetitive sequence elements, such as high-copy-number mobile genetic elements and rRNA operons. Additionally, novel and previously identified single nucleotide polymorphisms were detected in 10 virulence-related genes in the epidemic isolates. Whole-genome variation analysis identified state-specific variants, and coding regions bearing nonsynonymous mutations were classified into functional annotated orthologous groups. Comprehensive studies on whole genomes are needed to understand the resurgence of pertussis and develop novel tools to better characterize the molecular epidemiology of evolving B. pertussis populations. IMPORTANCE: Pertussis, or whooping cough, is the most poorly controlled vaccine-preventable bacterial disease in the United States, which has experienced a resurgence for more than a decade. Once viewed as a monomorphic pathogen, B. pertussis strains circulating during epidemics exhibit diversity visible on a genome structural level, previously undetectable by traditional sequence analysis using short-read technologies. For the first time, we combine short- and long-read sequencing platforms with restriction optical mapping for single-contig, de novo assembly of 31 isolates to investigate two geographically and temporally independent U.S. pertussis epidemics. These complete genomes reshape our understanding of B. pertussis evolution and strengthen molecular epidemiology toward one day understanding the resurgence of pertussis.

  14. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics

    PubMed Central

    2012-01-01

    Background: Proteinaceous toxins are observed across all levels of inter-organismal and intra-genomic conflicts. These include recently discovered prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. They are characterized by a remarkable diversity of C-terminal toxin domains generated by recombination with standalone toxin-coding cassettes. Prior analysis revealed a striking diversity of nuclease and deaminase domains among the toxin modules. We systematically investigated polymorphic toxin systems using comparative genomics, sequence and structure analysis. Results: Polymorphic toxin systems are distributed across all major bacterial lineages and are delivered by at least eight distinct secretory systems. In addition to type-II, these include type-V, VI, VII (ESX), and the poorly characterized “Photorhabdus virulence cassettes (PVC)”, PrsW-dependent and MuF phage-capsid-like systems. We present evidence that trafficking of these toxins is often accompanied by autoproteolytic processing catalyzed by HINT, ZU5, PrsW, caspase-like, papain-like, and a novel metallopeptidase associated with the PVC system. We identified over 150 distinct toxin domains in these systems. These span an extraordinary catalytic spectrum to include 23 distinct clades of peptidases, numerous previously unrecognized versions of nucleases and deaminases, ADP-ribosyltransferases, ADP ribosyl cyclases, RelA/SpoT-like nucleotidyltransferases, glycosyltransferases and other enzymes predicted to modify lipids and carbohydrates, and a pore-forming toxin domain. Several of these toxin domains are shared with host-directed effectors of pathogenic bacteria. Over 90 families of immunity proteins might neutralize anywhere between a single to at least 27 distinct types of toxin domains. In some organisms multiple tandem immunity genes or immunity protein domains are organized into polyimmunity loci or polyimmunity proteins. 
Gene-neighborhood-analysis of polymorphic toxin systems predicts the presence of novel trafficking-related components, and also the organizational logic that allows toxin diversification through recombination. Domain architecture and protein-length analysis revealed that these toxins might be deployed as secreted factors, through directed injection, or via inter-cellular contact facilitated by filamentous structures formed by RHS/YD, filamentous hemagglutinin and other repeats. Phyletic pattern and life-style analysis indicate that polymorphic toxins and polyimmunity loci participate in cooperative behavior and facultative ‘cheating’ in several ecosystems such as the human oral cavity and soil. Multiple domains from these systems have also been repeatedly transferred to eukaryotes and their viruses, such as the nucleo-cytoplasmic large DNA viruses. Conclusions: Along with a comprehensive inventory of toxins and immunity proteins, we present several testable predictions regarding active sites and catalytic mechanisms of toxins, their processing and trafficking and their role in intra-specific and inter-specific interactions between bacteria. These systems provide insights regarding the emergence of key systems at different points in eukaryotic evolution, such as ADP ribosylation, interaction of myosin VI with cargo proteins, mediation of apoptosis, hyphal heteroincompatibility, hedgehog signaling, arthropod toxins, cell-cell interaction molecules like teneurins and different signaling messengers. Reviewers: This article was reviewed by AM, FE and IZ. PMID:22731697

  15. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome

    USDA-ARS's Scientific Manuscript database

    One of the key aims of livestock genetics and genomics research is to discover the genetic variants underlying economically important traits such as reproductive performance, feed efficiency, disease susceptibility, and product quality. Next generation sequencing has recently emerged as an economica...

  16. A Genome-Wide Association Study for the Incidence of Persistent Bovine Viral Diarrhea Virus Infection in Cattle.

    USDA-ARS's Scientific Manuscript database

    Bovine Viral Diarrhea Virus (BVDV) is a diverse group of viruses causing disease in ruminants. The objective was to determine genomic regions harboring single nucleotide polymorphisms (SNP) associated with presence or absence of persistent BVDV infections. A genome wide association approach based on...

  17. Characterization of polymorphic SSRs among Prunus chloroplast genomes

    USDA-ARS's Scientific Manuscript database

    An in silico mining process yielded 80, 75, and 78 microsatellites in the chloroplast genome of Prunus persica, P. kansuensis, and P. mume. A and T repeats were predominant in the three genomes, accounting for 67.8% on average and most of them were successful in primer design. For the 80 P. persica ...

  18. Semi-Automatic In Silico Gap Closure Enabled De Novo Assembly of Two Dehalobacter Genomes from Metagenomic Data

    PubMed Central

    Tang, Shuiquan; Gong, Yunchen; Edwards, Elizabeth A.

    2012-01-01

    Typically, the assembly and closure of a complete bacterial genome requires substantial additional effort spent in a wet lab for gap resolution and genome polishing. Assembly is further confounded by subspecies polymorphism when starting from metagenome sequence data. In this paper, we describe an in silico gap-resolution strategy that can substantially improve assembly. This strategy resolves assembly gaps in scaffolds using pre-assembled contigs, followed by verification with read mapping. It is capable of resolving assembly gaps caused by repetitive elements and subspecies polymorphisms. Using this strategy, we realized the de novo assembly of the first two Dehalobacter genomes from the metagenomes of two anaerobic mixed microbial cultures capable of reductive dechlorination of chlorinated ethanes and chloroform. Only four additional PCR reactions were required even though the initial assembly with Newbler v. 2.5 produced 101 contigs within 9 scaffolds belonging to two Dehalobacter strains. By applying this strategy to the re-assembly of a recently published genome of Bacteroides, we demonstrate its potential utility for other sequencing projects, both metagenomic and genomic. PMID:23284863

  19. Identification and characterization of dinucleotide repeat (CA)n markers for genetic mapping in dog

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ostrander, E.A.; Sprague, G.F. Jr.; Rine, J.

    1993-04-01

    A large block of simple sequence repeat (SSR) polymorphisms for the dog genome has been isolated and characterized. Screening of primary libraries by conventional hybridization methods as well as by screening of enriched marker-selected libraries led to the isolation of a large number of genomic clones that contained (CA)n repeats. The sequences of 101 clones showed that the size and complexity of (CA)n repeats in the dog genome were similar to those reported for these markers in the human genome. Detailed analysis of a representative subset of these markers revealed that most markers were moderately to highly polymorphic, with PIC values exceeding 0.70 for 33% of the markers tested. An association between higher PIC values and markers containing longer (CA)n repeats was observed in these studies, as previously noted for similar markers in the human genome. A list of primer sequences that tag each characterized marker is provided, and a comprehensive system of nomenclature for the dog genome is suggested. 28 refs., 4 figs., 2 tabs.
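
    The abstract reports PIC (polymorphism information content) values but does not say how they were computed. A minimal sketch follows, assuming the commonly used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are allele frequencies at one marker. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the standard definition:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `allele_freqs` must be non-negative and sum to 1.
    """
    if any(p < 0 for p in allele_freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical markers: two equally frequent alleles vs. five.
print(round(pic([0.5, 0.5]), 3))   # 0.375
print(round(pic([0.2] * 5), 3))    # 0.768
```

    With this definition a marker needs several alleles at comparable frequencies before PIC exceeds 0.70, which is in line with the abstract's observation that longer, more variable repeats tend to give higher PIC values.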

  20. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    NASA Astrophysics Data System (ADS)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and efficient if it were based on genomic data. Genomic selection (GS) is a new approach to plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore interactions among genes and higher-order nonlinear effects. The deep belief network (DBN), one of the architectures used in deep learning, can model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model using whole-genome Single Nucleotide Polymorphisms (SNPs) as training and testing data. The case study was a set of traits in maize. The maize dataset was acquired from the Global Maize program of CIMMYT (International Maize and Wheat Improvement Center). Based on Pearson correlation, DBN outperformed the other methods, namely reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL) and best linear unbiased predictor (BLUP), for traits presumed to be non-additive. DBN achieved a correlation of 0.579 (on the -1 to 1 scale).
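
    The models in this abstract are compared by the Pearson correlation between predicted and observed trait values. The sketch below shows only that evaluation step, not the DBN itself; the function name and the toy numbers are hypothetical and do not come from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Toy, made-up phenotype values for five hypothetical test lines.
observed = [5.1, 6.3, 4.8, 7.0, 5.9]
predicted = [5.4, 6.0, 5.1, 6.5, 6.2]
print(round(pearson_r(observed, predicted), 3))
```

    The coefficient lies between -1 and 1; a value near 1 means the predictions rank and scale the lines much like the observed phenotypes, which is how a figure such as 0.579 should be read.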

  1. Aquaculture genomics, genetics and breeding in the United States: current status, challenges, and priorities for future research.

    PubMed

    Abdelrahman, Hisham; ElHady, Mohamed; Alcivar-Warren, Acacia; Allen, Standish; Al-Tobasei, Rafet; Bao, Lisui; Beck, Ben; Blackburn, Harvey; Bosworth, Brian; Buchanan, John; Chappell, Jesse; Daniels, William; Dong, Sheng; Dunham, Rex; Durland, Evan; Elaswad, Ahmed; Gomez-Chiarri, Marta; Gosh, Kamal; Guo, Ximing; Hackett, Perry; Hanson, Terry; Hedgecock, Dennis; Howard, Tiffany; Holland, Leigh; Jackson, Molly; Jin, Yulin; Khalil, Karim; Kocher, Thomas; Leeds, Tim; Li, Ning; Lindsey, Lauren; Liu, Shikai; Liu, Zhanjiang; Martin, Kyle; Novriadi, Romi; Odin, Ramjie; Palti, Yniv; Peatman, Eric; Proestou, Dina; Qin, Guyu; Reading, Benjamin; Rexroad, Caird; Roberts, Steven; Salem, Mohamed; Severin, Andrew; Shi, Huitong; Shoemaker, Craig; Stiles, Sheila; Tan, Suxu; Tang, Kathy F J; Thongda, Wilawan; Tiersch, Terrence; Tomasso, Joseph; Prabowo, Wendy Tri; Vallejo, Roger; van der Steen, Hein; Vo, Khoi; Waldbieser, Geoff; Wang, Hanping; Wang, Xiaozhu; Xiang, Jianhai; Yang, Yujia; Yant, Roger; Yuan, Zihao; Zeng, Qifan; Zhou, Tao

    2017-02-20

    Advancing the production efficiency and profitability of aquaculture is dependent upon the ability to utilize a diverse array of genetic resources. The ultimate goals of aquaculture genomics, genetics and breeding research are to enhance aquaculture production efficiency, sustainability, product quality, and profitability in support of the commercial sector and for the benefit of consumers. In order to achieve these goals, it is important to understand the genomic structure and organization of aquaculture species, and their genomic and phenomic variations, as well as the genetic basis of traits and their interrelationships. In addition, it is also important to understand the mechanisms of regulation and evolutionary conservation at the levels of genome, transcriptome, proteome, epigenome, and systems biology. With genomic information and information between the genomes and phenomes, technologies for marker/causal mutation-assisted selection, genome selection, and genome editing can be developed for applications in aquaculture. A set of genomic tools and resources must be made available including reference genome sequences and their annotations (including coding and non-coding regulatory elements), genome-wide polymorphic markers, efficient genotyping platforms, high-density and high-resolution linkage maps, and transcriptome resources including non-coding transcripts. Genomic and genetic control of important performance and production traits, such as disease resistance, feed conversion efficiency, growth rate, processing yield, behaviour, reproductive characteristics, and tolerance to environmental stressors like low dissolved oxygen, high or low water temperature and salinity, must be understood. QTL need to be identified, validated across strains, lines and populations, and their mechanisms of control understood. Causal gene(s) need to be identified. 
Genetic and epigenetic regulation of important aquaculture traits needs to be determined, and technologies for marker-assisted selection, causal gene/mutation-assisted selection, genome selection, and genome editing using CRISPR and other technologies must be developed, demonstrated, and applied to aquaculture industries. Major progress has been made in aquaculture genomics for dozens of fish and shellfish species including the development of genetic linkage maps, physical maps, microarrays, single nucleotide polymorphism (SNP) arrays, transcriptome databases and various stages of genome reference sequences. This paper provides a general review of the current status, challenges and future research needs of aquaculture genomics, genetics, and breeding, with a focus on major aquaculture species in the United States: catfish, rainbow trout, Atlantic salmon, tilapia, striped bass, oysters, and shrimp. While the overall research priorities and the practical goals are similar across various aquaculture species, the current status in each species should dictate the next priority areas within the species. This paper is an output of the USDA Workshop for Aquaculture Genomics, Genetics, and Breeding held in late March 2016 in Auburn, Alabama, with participants from all parts of the United States.

  2. Consequences of Asexuality in Natural Populations: Insights from Stick Insects.

    PubMed

    Bast, Jens; Parker, Darren J; Dumas, Zoé; Jalvingh, Kirsten M; Tran Van, Patrick; Jaron, Kamil S; Figuet, Emeric; Brandt, Alexander; Galtier, Nicolas; Schwander, Tanja

    2018-07-01

    Recombination is a fundamental process with significant impacts on genome evolution. Predicted consequences of the loss of recombination include a reduced effectiveness of selection, changes in the amount of neutral polymorphisms segregating in populations, and an arrest of GC-biased gene conversion. Although these consequences are empirically well documented for nonrecombining genome portions, it remains largely unknown if they extend to the whole genome scale in asexual organisms. We identify the consequences of asexuality using de novo transcriptomes of five independently derived, obligately asexual lineages of stick insects, and their sexual sister-species. We find strong evidence for higher rates of deleterious mutation accumulation, lower levels of segregating polymorphisms and arrested GC-biased gene conversion in asexuals as compared with sexuals. Taken together, our study conclusively shows that predicted consequences of genome evolution under asexuality can indeed be found in natural populations.

  3. Development and Molecular Characterization of Novel Polymorphic Genomic DNA SSR Markers in Lentinula edodes.

    PubMed

    Moon, Suyun; Lee, Hwa-Yong; Shim, Donghwan; Kim, Myungkil; Ka, Kang-Hyeon; Ryoo, Rhim; Ko, Han-Gyu; Koo, Chang-Duck; Chung, Jong-Wook; Ryu, Hojin

    2017-06-01

    Sixteen genomic DNA simple sequence repeat (SSR) markers of Lentinula edodes were developed from 205 SSR motifs present in 46.1-Mb long L. edodes genome sequences. The number of alleles ranged from 3 to 14 and the major allele frequency ranged from 0.17 to 0.96. The values of observed and expected heterozygosity ranged from 0.00 to 0.76 and 0.07 to 0.90, respectively. The polymorphic information content value ranged from 0.07 to 0.89. A dendrogram, based on 16 SSR markers clustered by the paired hierarchical clustering method, showed that 33 shiitake cultivars could be divided into three major groups and successfully identified. These SSR markers will contribute to the efficient breeding of this species by providing diversity in shiitake varieties. Furthermore, the genomic information covered by the markers can provide a valuable resource for genetic linkage map construction, molecular mapping, and marker-assisted selection in the shiitake mushroom.
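
    The summary statistics reported for these markers (major allele frequency, expected heterozygosity, and polymorphic information content) can all be derived from the allele frequencies at a locus. The sketch below is illustrative only: it assumes the standard textbook definitions (expected heterozygosity as 1 minus the sum of squared allele frequencies, and PIC as defined by Botstein et al.), which the abstract does not spell out. The function names and the example frequencies are hypothetical.

```python
def major_allele_frequency(freqs):
    """Frequency of the most common allele at a locus."""
    return max(freqs)


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), assuming random mating."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    sum_sq = sum(squares)
    # sum_{i<j} 2*a_i*a_j equals (sum a)^2 - sum(a^2), with a_i = p_i^2
    cross = sum_sq * sum_sq - sum(s * s for s in squares)
    return 1.0 - sum_sq - cross


# Hypothetical locus with three alleles (frequencies must sum to 1).
example_freqs = [0.5, 0.3, 0.2]
print(major_allele_frequency(example_freqs))
print(round(expected_heterozygosity(example_freqs), 4))
print(round(polymorphic_information_content(example_freqs), 4))
```

    PIC is never larger than the expected heterozygosity for the same locus, which is consistent with the slightly lower upper bound reported for PIC in the abstract.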

  4. A compositional segmentation of the human mitochondrial genome is related to heterogeneities in the guanine mutation rate

    PubMed Central

    Samuels, David C.; Boys, Richard J.; Henderson, Daniel A.; Chinnery, Patrick F.

    2003-01-01

    We applied a hidden Markov model segmentation method to the human mitochondrial genome to identify patterns in the sequence, to compare these patterns to the gene structure of mtDNA and to see whether these patterns reveal additional characteristics important for our understanding of genome evolution, structure and function. Our analysis identified three segmentation categories based upon the sequence transition probabilities. Category 2 segments corresponded to the tRNA and rRNA genes, with a greater strand-symmetry in these segments. Category 1 and 3 segments covered the protein-coding genes and almost all of the non-coding D-loop. Compared to category 1, the mtDNA segments assigned to category 3 had much lower guanine abundance. A comparison to two independent databases of mitochondrial mutations and polymorphisms showed that the high substitution rate of guanine in human mtDNA is largest in the category 3 segments. Analysis of synonymous mutations showed the same pattern. This suggests that this heterogeneity in the mutation rate is partly independent of respiratory chain function and is a direct property of the genome sequence itself. This has important implications for our understanding of mtDNA evolution and its use as a ‘molecular clock’ to determine the rate of population and species divergence. PMID:14530452

  5. Search for methylation-sensitive amplification polymorphisms in mutant figs.

    PubMed

    Rodrigues, M G F; Martins, A B G; Bertoni, B W; Figueira, A; Giuliatti, S

    2013-07-08

    Fig (Ficus carica) breeding programs that use conventional approaches to develop new cultivars are rare, owing to limited genetic variability and the difficulty in obtaining plants via gamete fusion. Cytosine methylation in plants leads to gene repression, thereby affecting transcription without changing the DNA sequence. Previous studies using random amplification of polymorphic DNA and amplified fragment length polymorphism markers revealed no polymorphisms among select fig mutants that originated from gamma-irradiated buds. Therefore, we conducted methylation-sensitive amplified polymorphism analysis to verify the existence of variability due to epigenetic DNA methylation among these mutant selections compared to the main cultivar 'Roxo-de-Valinhos'. Samples of genomic DNA were double-digested with either HpaII (methylation sensitive) or MspI (methylation insensitive) and with EcoRI. Fourteen primer combinations were tested, and on average, non-methylated CCGG, symmetrically methylated CmCGG, and hemimethylated hmCCGG sites accounted for 87.9, 10.1, and 2.0%, respectively. MSAP analysis was effective in detecting differentially methylated sites in the genomic DNA of fig mutants, and methylation may be responsible for the phenotypic variation between treatments. Further analyses such as polymorphic DNA sequencing are necessary to validate these differences, standardize the regions of methylation, and analyze reads using bioinformatic tools.
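
    The three site classes reported above are typically inferred by comparing, fragment by fragment, whether a band appears in the EcoRI/HpaII digest and in the EcoRI/MspI digest. The sketch below assumes the commonly used MSAP scoring convention (band in both digests: non-methylated; band with MspI only: internal cytosine methylated; band with HpaII only: hemimethylated; no band in either: uninformative). The abstract does not state the exact scoring rules used in the study, so the mapping, the function names, and the example data are all assumptions for illustration.

```python
from collections import Counter


def classify_msap_site(hpaii_band, mspi_band):
    """Classify one CCGG site from presence/absence of a band in each digest."""
    if hpaii_band and mspi_band:
        return "non-methylated (CCGG)"
    if mspi_band:
        return "internal cytosine methylated (CmCGG)"
    if hpaii_band:
        return "hemimethylated (hmCCGG)"
    return "uninformative"


def methylation_percentages(band_pairs):
    """Percentage of each class among informative sites only."""
    counts = Counter(classify_msap_site(h, m) for h, m in band_pairs)
    informative = sum(n for k, n in counts.items() if k != "uninformative")
    if informative == 0:
        return {}
    return {
        k: 100.0 * n / informative
        for k, n in counts.items()
        if k != "uninformative"
    }


# Hypothetical scores: (HpaII band present, MspI band present) per fragment.
scores = [(True, True)] * 8 + [(False, True), (True, False), (False, False)]
for label, pct in sorted(methylation_percentages(scores).items()):
    print(f"{label}: {pct:.1f}%")
```

    Sites with no band in either digest are excluded from the denominator here because absence can reflect either heavy methylation or sequence variation; a real analysis would need to state how such sites are handled.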

  6. Full-Genome Sequence of Infectious Laryngotracheitis Virus (Gallid Alphaherpesvirus 1) Strain VFAR-043, Isolated in Peru

    PubMed Central

    Bendezu Eguis, Jorge; Montesinos, Ricardo; Fernández-Díaz, Manolo

    2018-01-01

    We report here the first genome sequence of infectious laryngotracheitis virus isolated in Peru from tracheal tissues of layer chickens. The genome showed 99.98% identity to the J2 strain genome sequence. Single nucleotide polymorphisms were detected in five gene-coding sequences related to vaccine development, virus attachment, and viral immune evasion. PMID:29519822

  7. Diseases and Molecular Diagnostics: A Step Closer to Precision Medicine.

    PubMed

    Dwivedi, Shailendra; Purohit, Purvi; Misra, Radhieka; Pareek, Puneet; Goel, Apul; Khattri, Sanjay; Pant, Kamlesh Kumar; Misra, Sanjeev; Sharma, Praveen

    2017-10-01

    The advent of molecular technologies, together with the multidisciplinary interplay of several fields, led to the development of genomics, which concentrates on the detection of pathogenic events at the genome level. Structural and functional genomics approaches have now pinpointed the technical challenges in exploring disease-related genes and in recognizing their structural alterations or elucidating gene function. Various promising technologies and diagnostic applications of structural genomics are currently building a large database of disease genes, genetic alterations, etc., by mutation scanning and DNA chip technology. Functional genomics is also exploring expression genetics (hybridization-, PCR- and sequence-based technologies), two-hybrid technology, and next-generation sequencing with bioinformatics and computational biology. Advances in microarray "chip" technology have allowed the parallel analysis of gene expression patterns of thousands of genes simultaneously. Sequence information collected from the genomes of many individuals is leading to the rapid discovery of single nucleotide polymorphisms (SNPs). Further advances in genetic engineering have also revolutionized immunoassay biotechnology via engineering of antibody-encoding genes and phage display technology. Biotechnology plays an important role in the development of diagnostic assays in response to an outbreak or critical disease response need. However, there is also a need to pinpoint various obstacles and issues related to the commercialization and widespread dispersal of genetic knowledge derived from the exploitation of the biotechnology industry and the development and marketing of diagnostic services. Implementation of genetic criteria for patient selection and individual assessment of the risks and benefits of treatment emerges as a major challenge to the pharmaceutical industry. Thus, this field is revolutionizing the current era and may open new vistas in the field of disease management.

  8. Microsatellite marker development by partial sequencing of the sour passion fruit genome (Passiflora edulis Sims).

    PubMed

    Araya, Susan; Martins, Alexandre M; Junqueira, Nilton T V; Costa, Ana Maria; Faleiro, Fábio G; Ferreira, Márcio E

    2017-07-21

    The Passiflora genus comprises hundreds of wild and cultivated species of passion fruit used for food, industrial, ornamental and medicinal purposes. Efforts to develop genomic tools for genetic analysis of P. edulis, the most important commercial Passiflora species, are still incipient. In spite of many recognized applications of microsatellite markers in genetics and breeding, their availability for passion fruit research remains restricted. Microsatellite markers in P. edulis are usually limited in number, show reduced polymorphism, and are mostly based on compound or imperfect repeats. Furthermore, they are confined to only a few Passiflora species. We describe the use of NGS technology to partially assemble the P. edulis genome in order to develop hundreds of new microsatellite markers. A total of 14.11 Gbp of Illumina paired-end sequence reads were analyzed to detect simple sequence repeat sites in the sour passion fruit genome. A sample of 1300 contigs containing perfect repeat microsatellite sequences was selected for PCR primer development. Panels of di- and tri-nucleotide repeat markers were then tested in P. edulis germplasm accessions for validation. DNA polymorphism was detected in 74% of the markers (PIC = 0.16 to 0.77; number of alleles/locus = 2 to 7). A core panel of highly polymorphic markers (PIC = 0.46 to 0.77) was used to cross-amplify PCR products in 79 species of Passiflora (including P. edulis), belonging to four subgenera (Astrophea, Decaloba, Distephana and Passiflora). Approximately 71% of the marker/species combinations resulted in positive amplicons in all species tested. DNA polymorphism was detected in germplasm accessions of six closely related Passiflora species (P. edulis, P. alata, P. maliformis, P. nitida, P. quadrangularis and P. setacea) and the data used for accession discrimination and species assignment. A database of P. edulis DNA sequences obtained by NGS technology was examined to identify microsatellite repeats in the sour passion fruit genome. Markers were submitted to evaluation using accessions of cultivated and wild Passiflora species. The new microsatellite markers detected high levels of DNA polymorphism in sour passion fruit and can potentially be used in genetic analysis of P. edulis and other Passiflora species.
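
    Detecting perfect di- and tri-nucleotide repeats in assembled contigs, as described above, can be illustrated with a simple regular-expression scan. This is a minimal sketch and not the pipeline used in the study: the function name, the minimum repeat count of five, and the example sequence are hypothetical choices made for illustration.

```python
import re


def find_perfect_ssrs(seq, motif_lengths=(2, 3), min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    A motif of length k must be repeated at least `min_repeats` times
    without interruption. Homopolymer runs (e.g. AAAAAA) are skipped.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # ([ACGT]{k}) captures the motif; \1{n,} requires n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # ignore single-base runs
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical contig containing an (AT)6 and a (GAA)5 repeat.
contig = "GGC" + "AT" * 6 + "CCGT" + "GAA" * 5 + "TT"
print(find_perfect_ssrs(contig))
```

    Primer design would then target the flanking sequence on either side of each reported start position; compound or imperfect repeats are deliberately not matched by this pattern.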

  9. The Role of Genetic Polymorphisms as Related to One-Carbon Metabolism, Vitamin B6, and Gene-Nutrient Interactions in Maintaining Genomic Stability and Cell Viability in Chinese Breast Cancer Patients.

    PubMed

    Wu, Xiayu; Xu, Weijiang; Zhou, Tao; Cao, Neng; Ni, Juan; Zou, Tianning; Liang, Ziqing; Wang, Xu; Fenech, Michael

    2016-06-24

    Folate-mediated one-carbon metabolism (FMOCM) is linked to DNA synthesis, methylation, and cell proliferation. Vitamin B6 (B6) is a cofactor, and genetic polymorphisms of related key enzymes, such as serine hydroxymethyltransferase (SHMT), methionine synthase reductase (MTRR), and methionine synthase (MS), in FMOCM may govern the bioavailability of metabolites and play important roles in the maintenance of genomic stability and cell viability (GSACV). To evaluate the influences of B6, genetic polymorphisms of these enzymes, and gene-nutrient interactions on GSACV, we utilized the cytokinesis-block micronucleus assay (CBMN) and PCR-restriction fragment length polymorphism (PCR-RFLP) techniques in the lymphocytes from female breast cancer cases and controls. GSACV showed a significantly positive correlation with B6 concentration, and 48 nmol/L of B6 was the most suitable concentration for maintaining GSACV in vitro. The GSACV indexes showed significantly different sensitivity to B6 deficiency between cases and controls; the B6 effect on the GSACV variance contribution of each index was significantly higher than that of genetic polymorphisms and the sample state (tumor state). SHMT C1420T mutations may reduce breast cancer susceptibility, whereas MTRR A66G and MS A2756G mutations may increase breast cancer susceptibility. The role of SHMT, MS, and MTRR genotype polymorphisms in GSACV is reduced compared with that of B6. The results appear to suggest that the long-term lack of B6 under these conditions may increase genetic damage and cell injury and that individuals with various genotypes have different sensitivities to B6 deficiency. FMOCM metabolic enzyme gene polymorphism may be related to breast cancer susceptibility to a certain extent due to the effect of other factors such as stress, hormones, cancer therapies, psychological conditions, and diet. Adequate B6 intake may be good for maintaining genome health and preventing breast cancer.

  10. Exploiting rice-sorghum synteny for targeted development of EST-SSRs to enrich the sorghum genetic linkage map.

    PubMed

    Ramu, P; Kassahun, B; Senthilvel, S; Ashok Kumar, C; Jayashree, B; Folkertsma, R T; Reddy, L Ananda; Kuruvinashetti, M S; Haussmann, B I G; Hash, C T

    2009-11-01

    The sequencing and detailed comparative functional analysis of genomes of a number of select botanical models open new doors into comparative genomics among the angiosperms, with potential benefits for improvement of many orphan crops that feed large populations. In this study, a set of simple sequence repeat (SSR) markers was developed by mining the expressed sequence tag (EST) database of sorghum. Among the SSR-containing sequences, only those sharing considerable homology with rice genomic sequences across the lengths of the 12 rice chromosomes were selected. Thus, 600 SSR-containing sorghum EST sequences (50 homologous sequences on each of the 12 rice chromosomes) were selected, with the intention of providing coverage for corresponding homologous regions of the sorghum genome. Primer pairs were designed and polymorphism detection ability was assessed using parental pairs of two existing sorghum mapping populations. About 28% of these new markers detected polymorphism in this 4-entry panel. A subset of 55 polymorphic EST-derived SSR markers were mapped onto the existing skeleton map of a recombinant inbred population derived from cross N13 x E 36-1, which is segregating for Striga resistance and the stay-green component of terminal drought tolerance. These new EST-derived SSR markers mapped across all 10 sorghum linkage groups, mostly to regions expected based on prior knowledge of rice-sorghum synteny. The ESTs from which these markers were derived were then mapped in silico onto the aligned sorghum genome sequence, and 88% of the best hits corresponded to linkage-based positions. This study demonstrates the utility of comparative genomic information in targeted development of markers to fill gaps in linkage maps of related crop species for which sufficient genomic tools are not available.

  11. Pan-genome multilocus sequence typing and outbreak-specific reference-based single nucleotide polymorphism analysis to resolve two concurrent Staphylococcus aureus outbreaks in neonatal services.

    PubMed

    Roisin, S; Gaudin, C; De Mendonça, R; Bellon, J; Van Vaerenbergh, K; De Bruyne, K; Byl, B; Pouseele, H; Denis, O; Supply, P

    2016-06-01

    We used a two-step whole genome sequencing analysis for resolving two concurrent outbreaks in two neonatal services in Belgium, caused by exfoliative toxin A-encoding-gene-positive (eta+) methicillin-susceptible Staphylococcus aureus with an otherwise sporadic spa-type t209 (ST-109). Outbreak A involved 19 neonates and one healthcare worker in a Brussels hospital from May 2011 to October 2013. After a first episode interrupted by decolonization procedures applied over 7 months, the outbreak resumed concomitantly with the onset of outbreak B in a hospital in Asse, comprising 11 neonates and one healthcare worker from mid-2012 to January 2013. Pan-genome multilocus sequence typing, defined on the basis of 42 core and accessory reference genomes, and single-nucleotide polymorphisms mapped on an outbreak-specific de novo assembly were used to compare 28 available outbreak isolates and 19 eta+/spa-type t209 isolates identified by routine or nationwide surveillance. Pan-genome multilocus sequence typing showed that the outbreaks were caused by independent clones not closely related to any of the surveillance isolates. Isolates from only ten cases with overlapping stays in outbreak A, including four pairs of twins, showed no or only a single nucleotide polymorphism variation, indicating limited sequential transmission. Detection of larger genomic variation, even from the start of the outbreak, pointed to sporadic seeding from a pre-existing exogenous source, which persisted throughout the whole course of outbreak A. Whole genome sequencing analysis can provide unique fine-tuned insights into transmission pathways of complex outbreaks even at their inception, which, with timely use, could valuably guide efforts for early source identification. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
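
    The transmission inference above rests on counting SNP differences between pairs of isolates mapped to a common reference. The sketch below shows that pairwise comparison under simplifying assumptions: sequences are already aligned to equal length, ambiguous or missing bases are ignored, and the threshold of one SNP simply mirrors the "no or only a single" difference mentioned in the abstract. The function names and isolate data are hypothetical, and the threshold is not a general rule for other organisms or outbreaks.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in "ACGT" and b in "ACGT"
    )


def closely_related_pairs(isolates, max_snps=1):
    """Return (id_1, id_2, distance) for isolate pairs within max_snps."""
    pairs = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(isolates.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= max_snps:
            pairs.append((id_a, id_b, distance))
    return pairs


# Hypothetical aligned variant sites for three isolates.
isolates = {
    "A1": "ACGTACGT",
    "A2": "ACGTACGA",
    "B1": "TCGAACCA",
}
print(closely_related_pairs(isolates))
```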

  12. The location of a disease-associated polymorphism and genomic structure of the human 52-kDa Ro/SSA locus (SSA1)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsugu, H.; Horowitz, R.; Gibson, N.

    1994-12-01

    Sera from approximately 30% of patients with systemic lupus erythematosus (SLE) contain high titers of autoantibodies that bind to the 52-kDa Ro/SSA protein. We previously detected polymorphisms in the 52-kDa Ro/SSA gene (SSA1) with restriction enzymes, one of which is strongly associated with the presence of SLE (P < 0.0005) in African Americans. A higher disease frequency and more severe forms of the disease are commonly noted among these female patients. To determine the location and nature of this polymorphism, we obtained two clones that span 8.5 kb of the 52-kDa Ro/SSA locus including its upstream regulatory region. Six exons were identified, and their nucleotide sequences plus adjacent noncoding regions were determined. No differences were found between these exons and the coding region of one of the reported cDNAs. The disease-associated polymorphic site suggested by a restriction enzyme map and confirmed by DNA amplification and nucleotide sequencing was present upstream of exon 1. This polymorphism may be a genetic marker for a disease-related variation in the coding region for the protein or in the upstream regulatory region of this gene. Although this RFLP is present in Japanese, it is not associated with lupus in this race. 41 refs., 4 figs., 2 tabs.

  13. Microsatellite markers for the native Texas perennial grass, Panicum hallii (Poaceae).

    PubMed

    Lowry, David B; Purmal, Colin T; Meyer, Eli; Juenger, Thomas E

    2012-03-01

    We developed microsatellites for Panicum hallii for studies of gene flow, population structure, breeding experiments, and genetic mapping. Next-generation (454) genomic sequence data were used to design markers. Eighteen robust markers were discovered, 15 of which were polymorphic across six accessions of P. hallii var. hallii. Fourteen of the markers cross-amplified in a P. capillare accession. For the 15 polymorphic markers, the total number of alleles per locus ranged from two to 26 (mean: 11.0) across six populations (11-19 individuals per population). Observed heterozygosity (mean: 0.031) was 13.7 times lower than the expected heterozygosity (mean: 0.426). The deficit of heterozygous individuals is consistent with P. hallii having a high rate of self-fertilization. These markers will be useful for studies in P. hallii and related species.
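
    The heterozygote deficit reported above can be turned into a rough estimate of inbreeding and selfing. The sketch below is a back-of-the-envelope calculation, not the authors' analysis: it assumes the inbreeding coefficient F = 1 - Ho/He and the equilibrium relationship F = s / (2 - s) between F and the selfing rate s under mixed mating, and it applies them to the reported means, which only approximates a per-locus or per-population estimate. The function names are hypothetical.

```python
def inbreeding_coefficient(h_obs, h_exp):
    """F = 1 - Ho/He: proportional deficit of heterozygotes."""
    return 1.0 - h_obs / h_exp


def selfing_rate_from_f(f):
    """Invert F = s / (2 - s), the equilibrium value under mixed mating."""
    return 2.0 * f / (1.0 + f)


# Mean values reported in the abstract.
h_obs, h_exp = 0.031, 0.426

fold_deficit = h_exp / h_obs
f_value = inbreeding_coefficient(h_obs, h_exp)
selfing = selfing_rate_from_f(f_value)

print(f"He/Ho fold difference: {fold_deficit:.1f}")
print(f"Inbreeding coefficient F: {f_value:.3f}")
print(f"Implied equilibrium selfing rate: {selfing:.3f}")
```

    Under these assumptions the reported means imply F above 0.9, which is in line with the abstract's conclusion of a high rate of self-fertilization; population substructure or null alleles could also contribute to the deficit and are not modelled here.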

  14. Motif mismatches in microsatellites: insights from genome-wide investigation among 20 insect species.

    PubMed

    Behura, Susanta K; Severson, David W

    2015-02-01

    We present a detailed genome-wide comparative study of motif mismatches of microsatellites among 20 insect species representing five taxonomic orders. The results show that varying proportions (∼15-46%) of microsatellites identified in these species are imperfect in motif structure, and that they also vary in chromosomal distribution within genomes. It was observed that the genomic abundance of imperfect repeats is significantly associated with the length and number of motif mismatches of microsatellites. Furthermore, microsatellites with a higher number of mismatches tend to have lower abundance in the genome, suggesting that sequence heterogeneity of repeat motifs is a key determinant of genomic abundance of microsatellites. This relationship seems to be a general feature of microsatellites even in unrelated species such as yeast, roundworm, mouse and human. We provide a mechanistic explanation of the evolutionary link between motif heterogeneity and genomic abundance of microsatellites by examining the patterns of motif mismatches and allele sequences of single-nucleotide polymorphisms identified within microsatellite loci. Using Drosophila Reference Genetic Panel data, we further show that pattern of allelic variation modulates motif heterogeneity of microsatellites, and provide estimates of allele age of specific imperfect microsatellites found within protein-coding genes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  15. An SNP resource for rice genetics and breeding based on subspecies indica and japonica genome alignments.

    PubMed

    Feltus, F Alex; Wan, Jun; Schulze, Stefan R; Estill, James C; Jiang, Ning; Paterson, Andrew H

    2004-09-01

    Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned drafts of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering multiple copy and low quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. These filters have the consequence that our data set includes only a subset of the available SNPs (in particular excluding large numbers of SNPs that may occur between repetitive DNA alleles) but increase the likelihood that this subset is useful: Direct sequencing suggests that 79.8% +/- 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, and possibly reflecting introgression between the respective gene pools that may have occurred hundreds of years ago. Although 46.2% +/- 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp.

  16. An SNP Resource for Rice Genetics and Breeding Based on Subspecies Indica and Japonica Genome Alignments

    PubMed Central

    Feltus, F. Alex; Wan, Jun; Schulze, Stefan R.; Estill, James C.; Jiang, Ning; Paterson, Andrew H.

    2004-01-01

    Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned drafts of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering multiple copy and low quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. These filters have the consequence that our data set includes only a subset of the available SNPs (in particular excluding large numbers of SNPs that may occur between repetitive DNA alleles) but increase the likelihood that this subset is useful: Direct sequencing suggests that 79.8% ± 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, and possibly reflecting introgression between the respective gene pools that may have occurred hundreds of years ago. Although 46.2% ± 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp. PMID:15342564

  17. Genome-wide analysis of intraspecific DNA polymorphism in 'Micro-Tom', a model cultivar of tomato (Solanum lycopersicum).

    PubMed

    Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh

    2014-02-01

    Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.

  18. Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp) using a soybean genome array.

    PubMed

    Das, Sayan; Bhat, Prasanna R; Sudhakar, Chinta; Ehlers, Jeffrey D; Wanamaker, Steve; Roberts, Philip A; Cui, Xinping; Close, Timothy J

    2008-02-28

    Cowpea (Vigna unguiculata L. Walp) is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP) is a microarray-based marker which can be used for high throughput genotyping and high density mapping. Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max) genome array. Robustified projection pursuit (RPP) was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL) population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded a SFP probe set validation rate of 68%. We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some 1000's of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.

  19. Retrotransposon Capture Sequencing (RC-Seq): A Targeted, High-Throughput Approach to Resolve Somatic L1 Retrotransposition in Humans.

    PubMed

    Sanchez-Luque, Francisco J; Richardson, Sandra R; Faulkner, Geoffrey J

    2016-01-01

    Mobile genetic elements (MGEs) are of critical importance in genomics and developmental biology. Polymorphic and somatic MGE insertions have the potential to impact the phenotype of an individual, depending on their genomic locations and functional consequences. However, the identification of polymorphic and somatic insertions among the plethora of copies residing in the genome presents a formidable technical challenge. Whole genome sequencing has the potential to address this problem; however, its efficacy depends on the abundance of cells carrying the new insertion. Robust detection of somatic insertions present in only a subset of cells within a given sample can also be prohibitively expensive due to a requirement for high sequencing depth. Here, we describe retrotransposon capture sequencing (RC-seq), a sequence capture approach in which Illumina libraries are enriched for fragments containing the 5' and 3' termini of specific MGEs. RC-seq allows the detection of known polymorphic insertions present in an individual, as well as the identification of rare or private germline insertions not previously described. Furthermore, RC-seq can be used to detect and characterize somatic insertions, providing a valuable tool to elucidate the extent and characteristics of MGE activity in healthy tissues and in various disease states.

  20. Characterization of polymorphic chloroplast microsatellites in Prunus species and maternal lineages in peach genotypes

    USDA-ARS's Scientific Manuscript database

    Several available Prunus chloroplast genomes have not been exploited to develop polymorphic chloroplast microsatellites that could be useful in Prunus maternal lineage and phylogenetic analysis. In this study, using available bioinformatics tools, 80, 75, and 78 microsatellites were identified from ...

  1. [Single nucleotide polymorphism and its application in allogeneic hematopoietic stem cell transplantation--review].

    PubMed

    Li, Su-Xia

    2004-12-01

    Single nucleotide polymorphism (SNP) is the third-generation genetic marker, following restriction fragment length polymorphism (RFLP) and short tandem repeat (STR). It is the most densely distributed form of genetic variation in the human genome and has been widely used in gene mapping, cloning, and studies of heritable variation, as well as in parentage testing in forensic medicine. As a stably inherited polymorphism, SNP is becoming a focus of attention for monitoring chimerism and minimal residual disease in patients after allogeneic hematopoietic stem cell transplantation. This article reviews the hereditary characteristics of SNPs, techniques for their analysis, and their applications in allogeneic stem cell transplantation and other fields.

  2. 2b-RAD genotyping for population genomic studies of Chagas disease vectors: Rhodnius ecuadoriensis in Ecuador.

    PubMed

    Hernandez-Castro, Luis E; Paterno, Marta; Villacís, Anita G; Andersson, Björn; Costales, Jaime A; De Noia, Michele; Ocaña-Mayorga, Sofía; Yumiseva, Cesar A; Grijalva, Mario J; Llewellyn, Martin S

    2017-07-01

    Rhodnius ecuadoriensis is the main triatomine vector of Chagas disease, American trypanosomiasis, in Southern Ecuador and Northern Peru. Genomic approaches and next generation sequencing technologies have become powerful tools for investigating population diversity and structure which is a key consideration for vector control. Here we assess the effectiveness of three different 2b restriction site-associated DNA (2b-RAD) genotyping strategies in R. ecuadoriensis to provide sufficient genomic resolution to tease apart microevolutionary processes and undertake some pilot population genomic analyses. The 2b-RAD protocol was carried out in-house at a non-specialized laboratory using 20 R. ecuadoriensis adults collected from the central coast and southern Andean region of Ecuador, from June 2006 to July 2013. 2b-RAD sequencing was performed on an Illumina MiSeq instrument and the data were analyzed with the STACKS de novo pipeline for loci assembly and Single Nucleotide Polymorphism (SNP) discovery. Preliminary population genomic analyses (global AMOVA and Bayesian clustering) were implemented. Our results showed that the 2b-RAD genotyping protocol is effective for R. ecuadoriensis and likely for other triatomine species. However, only BcgI and CspCI restriction enzymes provided a number of markers suitable for population genomic analysis at the read depth we generated. Our preliminary genomic analyses detected a signal of genetic structuring across the study area. Our findings suggest that 2b-RAD genotyping is both a cost-effective and methodologically simple approach for generating high resolution genomic data for Chagas disease vectors with the power to distinguish between different vector populations at epidemiologically relevant scales. As such, 2b-RAD represents a powerful tool in the hands of medical entomologists with limited access to specialized molecular biological equipment.

  3. 2b-RAD genotyping for population genomic studies of Chagas disease vectors: Rhodnius ecuadoriensis in Ecuador

    PubMed Central

    Villacís, Anita G.; Andersson, Björn; Costales, Jaime A.; De Noia, Michele; Ocaña-Mayorga, Sofía; Yumiseva, Cesar A.; Grijalva, Mario J.; Llewellyn, Martin S.

    2017-01-01

    Background Rhodnius ecuadoriensis is the main triatomine vector of Chagas disease, American trypanosomiasis, in Southern Ecuador and Northern Peru. Genomic approaches and next generation sequencing technologies have become powerful tools for investigating population diversity and structure which is a key consideration for vector control. Here we assess the effectiveness of three different 2b restriction site-associated DNA (2b-RAD) genotyping strategies in R. ecuadoriensis to provide sufficient genomic resolution to tease apart microevolutionary processes and undertake some pilot population genomic analyses. Methodology/Principal findings The 2b-RAD protocol was carried out in-house at a non-specialized laboratory using 20 R. ecuadoriensis adults collected from the central coast and southern Andean region of Ecuador, from June 2006 to July 2013. 2b-RAD sequencing was performed on an Illumina MiSeq instrument and the data were analyzed with the STACKS de novo pipeline for loci assembly and Single Nucleotide Polymorphism (SNP) discovery. Preliminary population genomic analyses (global AMOVA and Bayesian clustering) were implemented. Our results showed that the 2b-RAD genotyping protocol is effective for R. ecuadoriensis and likely for other triatomine species. However, only BcgI and CspCI restriction enzymes provided a number of markers suitable for population genomic analysis at the read depth we generated. Our preliminary genomic analyses detected a signal of genetic structuring across the study area. Conclusions/Significance Our findings suggest that 2b-RAD genotyping is both a cost-effective and methodologically simple approach for generating high resolution genomic data for Chagas disease vectors with the power to distinguish between different vector populations at epidemiologically relevant scales. As such, 2b-RAD represents a powerful tool in the hands of medical entomologists with limited access to specialized molecular biological equipment. PMID:28723901

  4. Extensive sequence-influenced DNA methylation polymorphism in the human genome

    PubMed Central

    2010-01-01

    Background Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also interesting patterns of CpG methylation found outside of CpG islands. Results We compared DNA methylation patterns on both alleles between many pairs (and larger groups) of related and unrelated individuals. Direct observation and simulation experiments revealed that around 10% of common single nucleotide polymorphisms (SNPs) reside in regions with differences in the propensity for local DNA methylation between the two alleles. We further showed that for the most common form of SNP, a polymorphism at a CpG dinucleotide, the presence of the CpG at the SNP positively affected local DNA methylation in cis. Conclusions Taken together with the known effect of DNA methylation on mutation rate, our results suggest an interesting interdependence between genetics and epigenetics underlying diversity in the human genome. PMID:20497546

  5. Development and validation of 697 novel polymorphic genomic and EST-SSR markers in the American cranberry (Vaccinium macrocarpon Ait.).

    PubMed

    Schlautman, Brandon; Fajardo, Diego; Bougie, Tierney; Wiesman, Eric; Polashock, James; Vorsa, Nicholi; Steffan, Shawn; Zalapa, Juan

    2015-01-27

    The American cranberry, Vaccinium macrocarpon Ait., is an economically important North American fruit crop that is consumed because of its unique flavor and potential health benefits. However, a lack of abundant, genome-wide molecular markers has limited the adoption of modern molecular assisted selection approaches in cranberry breeding programs. To increase the number of available markers in the species, this study identified, tested, and validated microsatellite markers from existing nuclear and transcriptome sequencing data. In total, new primers were designed, synthesized, and tested for 979 SSR loci; 697 of the markers amplified allele patterns consistent with single locus segregation in a diploid organism and were considered polymorphic. Of the 697 polymorphic loci, 507 were selected for additional genetic diversity and segregation analyses in 29 cranberry genotypes. More than 95% of the 507 loci did not display segregation distortion at the p < 0.05 level, and contained moderate to high levels of polymorphism with a polymorphic information content >0.25. This comprehensive collection of developed and validated microsatellite loci represents a substantial addition to the molecular tools available for geneticists, genomicists, and breeders in cranberry and Vaccinium.
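
    The polymorphic information content (PIC) cut-off of 0.25 quoted above can be made concrete with a short calculation. The abstract does not say which PIC definition was used; the sketch below assumes the widely used definition of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the function name and example frequencies are illustrative, not taken from the study.

```python
from itertools import combinations


def polymorphic_information_content(allele_freqs):
    """Return PIC for one locus from its allele frequencies.

    Assumed definition (Botstein et al. 1980):
        PIC = 1 - sum_i p_i^2 - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if any(p < 0 for p in allele_freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    # Expected homozygosity: probability of drawing the same allele twice.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for uninformative matings between identical heterozygotes.
    correction = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


if __name__ == "__main__":
    # Hypothetical loci, for illustration only.
    for freqs in ([0.5, 0.5], [0.9, 0.1], [0.4, 0.3, 0.2, 0.1]):
        print(freqs, round(polymorphic_information_content(freqs), 4))
```

    Under this definition a biallelic locus cannot exceed a PIC of 0.375 (reached at equal allele frequencies), so loci with PIC above 0.25 are either fairly balanced biallelic markers or multi-allelic ones.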

  6. RExPrimer: an integrated primer designing tool increases PCR effectiveness by avoiding 3' SNP-in-primer and mis-priming from structural variation

    PubMed Central

    2009-01-01

    Background Polymerase chain reaction (PCR) is very useful in many areas of molecular biology research. It is commonly observed that PCR success is critically dependent on design of an effective primer pair. Current tools for primer design do not adequately address the problem of PCR failure due to mis-priming on target-related sequences and structural variations in the genome. Methods We have developed an integrated graphical web-based application for primer design, called RExPrimer, which was written in Python. The software uses Primer3 as the primer designing core algorithm. Locally stored sequence information and genomic variant information were hosted on MySQL v5.0 and were incorporated into RExPrimer. Results RExPrimer provides many functionalities for improved PCR primer design. Several databases, namely annotated human SNP databases, insertion/deletion (indel) polymorphisms database, pseudogene database, and structural genomic variation databases were integrated into RExPrimer, enabling an effective without-leaving-the-website validation of the resulting primers. By incorporating these databases, the primers reported by RExPrimer avoid mis-priming to related sequences (e.g. pseudogene, segmental duplication) as well as possible PCR failure because of structural polymorphisms (SNP, indel, and copy number variation (CNV)). To prevent mismatching caused by unexpected SNPs in the designed primers, in particular the 3' end (SNP-in-Primer), several SNP databases covering the broad range of population-specific SNP information are utilized to report SNPs present in the primer sequences. Population-specific SNP information also helps customize primer design for a specific population. Furthermore, RExPrimer offers a graphical user-friendly interface through the use of scalable vector graphic image that intuitively presents resulting primers along with the corresponding gene structure. In this study, we demonstrated the program's effectiveness in successfully generating primers for strongly homologous sequences. Conclusion The improvements for primer design incorporated into RExPrimer were demonstrated to be effective in designing primers for challenging PCR experiments. Integration of SNP and structural variation databases allows for robust primer design for a variety of PCR applications, irrespective of the sequence complexity in the region of interest. This software is freely available at http://www4a.biotec.or.th/rexprimer. PMID:19958502
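
    The SNP-in-Primer idea described above amounts to an interval check: does any known SNP fall inside a primer, and in particular near its 3' end? The following is a minimal sketch of that check only; it is not RExPrimer's code, and the function name, the 1-based forward-strand coordinate convention, and the 5-base 3' window are all assumptions made for illustration.

```python
def snps_in_primer(primer_start, primer_len, strand, snp_positions,
                   three_prime_window=5):
    """Report SNPs that overlap a primer.

    Coordinates are 1-based positions on the forward reference strand and
    primer_start is the leftmost base covered by the primer (assumed
    convention). Returns (all_hits, three_prime_hits), both sorted.
    """
    if primer_len <= 0 or three_prime_window <= 0:
        raise ValueError("lengths must be positive")

    primer_end = primer_start + primer_len - 1
    hits = sorted(p for p in snp_positions if primer_start <= p <= primer_end)

    # A forward primer extends from its rightmost base; a reverse primer
    # extends from its leftmost base on the forward-strand coordinates.
    if strand == "+":
        lo, hi = primer_end - three_prime_window + 1, primer_end
    elif strand == "-":
        lo, hi = primer_start, primer_start + three_prime_window - 1
    else:
        raise ValueError("strand must be '+' or '-'")

    three_prime_hits = [p for p in hits if lo <= p <= hi]
    return hits, three_prime_hits


if __name__ == "__main__":
    # Hypothetical primer covering positions 100..119 and made-up SNP sites.
    print(snps_in_primer(100, 20, "+", [95, 104, 118, 130]))
    print(snps_in_primer(100, 20, "-", [95, 104, 118, 130]))
```

    A primer with any 3'-window hit would normally be rejected or redesigned, while hits elsewhere in the primer might only be reported; where to draw that line is a design choice the abstract does not specify.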

  7. Speciation network in Laurasiatheria: retrophylogenomic signals.

    PubMed

    Doronina, Liliya; Churakov, Gennady; Kuritzin, Andrej; Shi, Jingjing; Baertsch, Robert; Clawson, Hiram; Schmitz, Jürgen

    2017-06-01

    Rapid species radiation due to adaptive changes or occupation of new ecospaces challenges our understanding of ancestral speciation and the relationships of modern species. At the molecular level, rapid radiation with successive speciations over short time periods-too short to fix polymorphic alleles-is described as incomplete lineage sorting. Incomplete lineage sorting leads to random fixation of genetic markers and hence, random signals of relationships in phylogenetic reconstructions. The situation is further complicated when you consider that the genome is a mosaic of ancestral and modern incompletely sorted sequence blocks that leads to reconstructed affiliations to one or the other relative, depending on the fixation of their shared ancestral polymorphic alleles. The laurasiatherian relationships among Chiroptera, Perissodactyla, Cetartiodactyla, and Carnivora present a prime example for such enigmatic affiliations. We performed whole-genome screenings for phylogenetically diagnostic retrotransposon insertions involving the representatives bat (Chiroptera), horse (Perissodactyla), cow (Cetartiodactyla), and dog (Carnivora), and extracted among 162,000 preselected cases 102 virtually homoplasy-free, phylogenetically informative retroelements to draw a complete picture of the highly complex evolutionary relations within Laurasiatheria. All possible evolutionary scenarios received considerable retrotransposon support, leaving us with a network of affiliations. However, the Cetartiodactyla-Carnivora relationship as well as the basal position of Chiroptera and an ancestral laurasiatherian hybridization process did exhibit some very clear, distinct signals. The significant accordance of retrotransposon presence/absence patterns and flanking nucleotide changes suggest an important influence of mosaic genome structures in the reconstruction of species histories. © 2017 Doronina et al.; Published by Cold Spring Harbor Laboratory Press.

  8. Speciation network in Laurasiatheria: retrophylogenomic signals

    PubMed Central

    Doronina, Liliya; Churakov, Gennady; Kuritzin, Andrej; Shi, Jingjing; Baertsch, Robert; Clawson, Hiram; Schmitz, Jürgen

    2017-01-01

    Rapid species radiation due to adaptive changes or occupation of new ecospaces challenges our understanding of ancestral speciation and the relationships of modern species. At the molecular level, rapid radiation with successive speciations over short time periods—too short to fix polymorphic alleles—is described as incomplete lineage sorting. Incomplete lineage sorting leads to random fixation of genetic markers and hence, random signals of relationships in phylogenetic reconstructions. The situation is further complicated when you consider that the genome is a mosaic of ancestral and modern incompletely sorted sequence blocks that leads to reconstructed affiliations to one or the other relative, depending on the fixation of their shared ancestral polymorphic alleles. The laurasiatherian relationships among Chiroptera, Perissodactyla, Cetartiodactyla, and Carnivora present a prime example for such enigmatic affiliations. We performed whole-genome screenings for phylogenetically diagnostic retrotransposon insertions involving the representatives bat (Chiroptera), horse (Perissodactyla), cow (Cetartiodactyla), and dog (Carnivora), and extracted among 162,000 preselected cases 102 virtually homoplasy-free, phylogenetically informative retroelements to draw a complete picture of the highly complex evolutionary relations within Laurasiatheria. All possible evolutionary scenarios received considerable retrotransposon support, leaving us with a network of affiliations. However, the Cetartiodactyla–Carnivora relationship as well as the basal position of Chiroptera and an ancestral laurasiatherian hybridization process did exhibit some very clear, distinct signals. The significant accordance of retrotransposon presence/absence patterns and flanking nucleotide changes suggest an important influence of mosaic genome structures in the reconstruction of species histories. PMID:28298429

  9. Ploidy Variation in Kluyveromyces marxianus Separates Dairy and Non-dairy Isolates

    PubMed Central

    Ortiz-Merino, Raúl A.; Varela, Javier A.; Coughlan, Aisling Y.; Hoshida, Hisashi; da Silveira, Wendel B.; Wilde, Caroline; Kuijpers, Niels G. A.; Geertman, Jan-Maarten; Wolfe, Kenneth H.; Morrissey, John P.

    2018-01-01

    Kluyveromyces marxianus is traditionally associated with fermented dairy products, but can also be isolated from diverse non-dairy environments. Because of thermotolerance, rapid growth and other traits, many different strains are being developed for food and industrial applications but there is, as yet, little understanding of the genetic diversity or population genetics of this species. K. marxianus shows a high level of phenotypic variation but the only phenotype that has been clearly linked to a genetic polymorphism is lactose utilisation, which is controlled by variation in the LAC12 gene. The genomes of several strains have been sequenced in recent years and, in this study, we sequenced a further nine strains from different origins. Analysis of the Single Nucleotide Polymorphisms (SNPs) in 14 strains was carried out to examine genome structure and genetic diversity. SNP diversity in K. marxianus is relatively high, with up to 3% DNA sequence divergence between alleles. It was found that the isolates include haploid, diploid, and triploid strains, as shown by both SNP analysis and flow cytometry. Diploids and triploids contain long genomic tracts showing loss of heterozygosity (LOH). All six isolates from dairy environments were diploid or triploid, whereas 6 out of 7 isolates from non-dairy environments were haploid. This also correlated with the presence of functional LAC12 alleles only in dairy haplotypes. The diploids were hybrids between a non-dairy and a dairy haplotype, whereas triploids included three copies of a dairy haplotype. PMID:29619042

  10. Genome-wide genetic diversity, population structure and admixture analysis in African and Asian cattle breeds.

    PubMed

    Edea, Z; Bhuiyan, M S A; Dessie, T; Rothschild, M F; Dadi, H; Kim, K S

    2015-02-01

    Knowledge about genetic diversity and population structure is useful for designing effective strategies to improve the production, management and conservation of farm animal genetic resources. Here, we present a comprehensive genome-wide analysis of genetic diversity, population structure and admixture based on 244 animals sampled from 10 cattle populations in Asia and Africa and genotyped for 69,903 autosomal single-nucleotide polymorphisms (SNPs) mainly derived from the indicine breed. Principal component analysis, STRUCTURE and distance analysis from high-density SNP data clearly revealed that the largest genetic difference occurred between the two domestic lineages (taurine and indicine), whereas Ethiopian cattle populations represent a mosaic of the humped zebu and taurine. Estimation of the genetic influence of zebu and taurine revealed that Ethiopian cattle were characterized by considerable levels of introgression from South Asian zebu, whereas Bangladeshi populations shared very low taurine ancestry. The relationships among Ethiopian cattle populations reflect their history of origin and admixture rather than phenotype-based distinctions. The high within-individual genetic variability observed in Ethiopian cattle represents an untapped opportunity for adaptation to changing environments and for implementation of within-breed genetic improvement schemes. Our results provide a basis for future applications of genome-wide SNP data to exploit the unique genetic makeup of indigenous cattle breeds and to facilitate their improvement and conservation.

  11. Molecular characterization of the Gossypium Diversity Reference Set of the US National Cotton Germplasm Collection.

    PubMed

    Hinze, Lori L; Fang, David D; Gore, Michael A; Scheffler, Brian E; Yu, John Z; Frelichowski, James; Percy, Richard G

    2015-02-01

    A core marker set containing markers developed to be informative within a single commercial cotton species can elucidate diversity structure within a multi-species subset of the Gossypium germplasm collection. An understanding of the genetic diversity of cotton (Gossypium spp.) as represented in the US National Cotton Germplasm Collection is essential to develop strategies for collecting, conserving, and utilizing these germplasm resources. The US collection is one of the largest world collections and includes not only accessions with improved yield and fiber quality within cultivated species, but also accessions possessing sources of abiotic and biotic stress resistance often found in wild species. We evaluated the genetic diversity of a subset of 272 diploid and 1,984 tetraploid accessions in the collection (designated the Gossypium Diversity Reference Set) using a core set of 105 microsatellite markers. Utility of the core set of markers in differentiating intra-genome variation was much greater in commercial tetraploid genomes (99.7 % polymorphic bands) than in wild diploid genomes (72.7 % polymorphic bands), and may have been influenced by pre-selection of markers for effectiveness in the commercial species. Principal coordinate analyses revealed that the marker set differentiated interspecific variation among tetraploid species, but was only capable of partially differentiating among species and genomes of the wild diploids. Putative species-specific marker bands in G. hirsutum (73) and G. barbadense (81) were identified that could be used for qualitative identification of misclassifications, redundancies, and introgression within commercial tetraploid species. The results of this broad-scale molecular characterization are essential to the management and conservation of the collection and provide insight and guidance in the use of the collection by the cotton research community in their cotton improvement efforts.

  12. Single nucleotide polymorphism and haplotype effects associated with somatic cell score in German Holstein cattle

    PubMed Central

    2014-01-01

    Background To better understand the genetic determination of udder health, we performed a genome-wide association study (GWAS) on a population of 2354 German Holstein bulls for which daughter yield deviations (DYD) for somatic cell score (SCS) were available. For this study, we used genetic information of 44 576 informative single nucleotide polymorphisms (SNPs) and 11 725 inferred haplotype blocks. Results When accounting for the sub-structure of the analyzed population, 16 SNPs and 10 haplotypes in six genomic regions were significant at the Bonferroni threshold of P ≤ 1.14 × 10⁻⁶. The size of the identified regions ranged from 0.05 to 5.62 Mb. Genomic regions on chromosomes 5, 6, 18 and 19 coincided with known QTL affecting SCS, while additional genomic regions were found on chromosomes 13 and X. Of particular interest is the region on chromosome 6 between 85 and 88 Mb, where QTL for mastitis traits and significant SNPs for SCS in different Holstein populations coincide with our results. In all identified regions, except for the region on chromosome X, significant SNPs were present in significant haplotypes. The minor alleles of identified SNPs on chromosomes 18 and 19, and the major alleles of SNPs on chromosomes 6 and X were favorable for a lower SCS. Differences in somatic cell count (SCC) between alternative SNP alleles reached 14 000 cells/mL. Conclusions The results support the polygenic nature of the genetic determination of SCS, confirm the importance of previously reported QTL, and provide evidence for the segregation of additional QTL for SCS in Holstein cattle. The small size of the regions identified here will facilitate the search for causal genetic variations that affect gene functions. PMID:24898131
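
    A Bonferroni threshold like the one reported above is simply the family-wise significance level divided by the number of tests. The abstract states neither the significance level nor the exact number of tests behind its threshold, so the values below are assumptions for illustration: with a level of 0.05 and the 44 576 SNPs mentioned, the result is close to, but not identical to, the reported value. The function names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cut-off that controls the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_markers(p_values, alpha=0.05, n_tests=None):
    """Return the sorted names of markers whose p-value passes the cut-off.

    p_values maps marker name -> p-value. If n_tests is omitted, the number
    of supplied markers is used as the number of tests.
    """
    if n_tests is None:
        n_tests = len(p_values)
    threshold = bonferroni_threshold(alpha, n_tests)
    return sorted(name for name, p in p_values.items() if p <= threshold)


if __name__ == "__main__":
    # Assumed inputs: alpha = 0.05 and one test per SNP.
    print(bonferroni_threshold(0.05, 44576))
    # Made-up p-values, for illustration only.
    demo = {"snp_a": 5e-8, "snp_b": 2e-6, "snp_c": 0.03}
    print(significant_markers(demo, alpha=0.05, n_tests=44576))
```

    Because the correction treats every test as independent, it is conservative for SNPs in linkage disequilibrium, which is one reason a study may count tests differently from the raw number of markers.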

  13. Multilocus patterns of polymorphism and selection across the X chromosome of Caenorhabditis remanei.

    PubMed

    Cutter, Asher D

    2008-03-01

    Natural selection and neutral processes such as demography, mutation, and gene conversion all contribute to patterns of polymorphism within genomes. Identifying the relative importance of these varied components in evolution provides the principal challenge for population genetics. To address this issue in the nematode Caenorhabditis remanei, I sampled nucleotide polymorphism at 40 loci across the X chromosome. The site-frequency spectrum for these loci provides no evidence for population size change, and one locus presents a candidate for linkage to a target of balancing selection. Selection for codon usage bias leads to the non-neutrality of synonymous sites, and despite its weak magnitude of effect (Nₑs ≈ 0.1), is responsible for profound patterns of diversity and divergence in the C. remanei genome. Although gene conversion is evident for many loci, biased gene conversion is not identified as a significant evolutionary process in this sample. No consistent association is observed between synonymous-site diversity and linkage-disequilibrium-based estimators of the population recombination parameter, despite theoretical predictions about background selection or widespread genetic hitchhiking, but genetic map-based estimates of recombination are needed to rigorously test for a diversity-recombination relationship. Coalescent simulations also illustrate how a spurious correlation between diversity and linkage-disequilibrium-based estimators of recombination can occur, due in part to the presence of unbiased gene conversion. These results illustrate the influence that subtle natural selection can exert on polymorphism and divergence, in the form of codon usage bias, and demonstrate the potential of C. remanei for detecting natural selection from genomic scans of polymorphism.
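
    The site-frequency spectrum mentioned above is a tally of how many segregating sites carry their rarer allele in 1, 2, 3, ... of the sampled sequences. The sketch below computes a folded spectrum from a small aligned sample; it is an illustration rather than the study's pipeline, the function name and toy sequences are invented, and it assumes ungapped, equal-length sequences while skipping sites with more than two alleles.

```python
from collections import Counter


def folded_sfs(alignment):
    """Return the folded site-frequency spectrum of an alignment.

    alignment is a list of equal-length sequences. The result is a list in
    which index k holds the number of biallelic sites whose minor allele
    occurs in exactly k sequences (index 0 is always 0).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")

    n = len(alignment)
    sfs = [0] * (n // 2 + 1)
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) != 2:
            continue  # monomorphic or multi-allelic site: not counted
        sfs[min(counts.values())] += 1
    return sfs


if __name__ == "__main__":
    # Toy sample of four sequences, for illustration only.
    print(folded_sfs(["AAAA", "AAAT", "AACT", "AGCT"]))
```

    An excess of sites in the lowest-frequency class relative to neutral expectations is a typical sign of population growth, which is the kind of signal the abstract reports as absent.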

  14. Differences in selection drive olfactory receptor genes in different directions in dogs and wolf.

    PubMed

    Chen, Rui; Irwin, David M; Zhang, Ya-Ping

    2012-11-01

    The olfactory receptor (OR) gene family is the largest gene family found in mammalian genomes. It is known to evolve through a birth-and-death process. Here, we characterized the sequences of 16 segregating OR pseudogenes in the samples of the wolf and the Chinese village dog (CVD) and compared them with the sequences from dogs of different breeds. Our results show that the segregating OR pseudogenes in breed dogs are under strong purifying selection, while evolving neutrally in the CVD, and show a more complicated pattern in the wolf. In the wolf, we found a trend to remove deleterious polymorphisms and accumulate nondeleterious polymorphisms. On the basis of protein structure of the ORs, we found that the distribution of different types of polymorphisms (synonymous, nonsynonymous, tolerated, and untolerated) varied greatly between the wolf and the breed dogs. In summary, our results suggest that different forms of selection have acted on the segregating OR pseudogenes in the CVD since domestication, breed dogs after breed formation, and ancestral wolf population, which has driven the evolution of these genes in different directions.

  15. Extensive 5.8S nrDNA polymorphism in Mammillaria (Cactaceae) with special reference to the identification of pseudogenic internal transcribed spacer regions.

    PubMed

    Harpke, Doerte; Peterson, Angela

    2008-05-01

    The internal transcribed spacer (ITS) region (ITS1, 5.8S rDNA, ITS2) represents the most widely applied nuclear marker in eukaryotic phylogenetics. Although this region has been assumed to evolve in concert, the number of investigations revealing high degrees of intra-individual polymorphism connected with the presence of pseudogenes has risen. The 5.8S rDNA is the most important diagnostic marker for functionality of the ITS region. In Mammillaria, intra-individual 5.8S rDNA polymorphisms of up to 36% and up to nine different types have been found. Twenty-eight of 30 cloned genomic Mammillaria sequences were identified as putative pseudogenes. For the identification of pseudogenic ITS regions, in addition to formal tests based on substitution rates, we attempted to focus on functional features of the 5.8S rDNA (5.8S motif, secondary structure). The importance of functional data for the identification of pseudogenes is outlined and discussed. The identification of pseudogenes is essential, because they may cause erroneous phylogenies and taxonomic problems.

  16. Length and nucleotide sequence polymorphism at the trnL and trnF non-coding regions of chloroplast genomes among Saccharum and Erianthus species

    USDA-ARS's Scientific Manuscript database

    The aneupolyploidy genome of sugarcane (Saccharum hybrids spp.) and lack of a classical genetic linkage map make genetics research most difficult for sugarcane. Whole genome sequencing and genetic characterization of sugarcane and related taxa are far behind other crops. In this study, universal PCR...

  17. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    DTIC Science & Technology

    2011-09-01

    Almasy, L, Blangero, J. (2009) Human QTL linkage mapping. Genetica 136:333-340. Amos, CI. (2007) Successful design and conduct of genome-wide...quantitative trait loci. Genetica 136:237-243. Skol AD, Scott LJ, Abecasis GR, Boehnke M. (2006) Joint analysis is more efficient than replication

  18. Restriction fragment length polymorphism and allozyme linkage map of Cuphea lanceolata.

    PubMed

    Webb, D M; Knapp, S J; Tagliani, L A

    1992-02-01

    Cuphea lanceolata Ait. has had a significant role in the domestication of Cuphea and is a useful experimental organism for investigating how medium-chain lipids are synthesized in developing seeds. To expand the genetics of this species, a linkage map of the C. lanceolata genome was constructed using five allozyme and 32 restriction-fragment-length-polymorphism (RFLP) marker loci. These loci were assigned to six linkage groups that correspond to the six chromosomes of this species. Map length is 288 cM. Levels of polymorphism were estimated for three inbred lines of C. lanceolata and an inbred line of C. viscosissima using 84 random genomic clones and two restriction enzymes, EcoRI and HindIII. Of the probes 29% detected RFLPs between C. lanceolata and C. viscosissima lines. Crosses between these species can be exploited to expand the map.

  19. BrassicaTED - a public database for utilization of miniature transposable elements in Brassica species.

    PubMed

    Murukarthick, Jayakodi; Sampath, Perumal; Lee, Sang Choon; Choi, Beom-Soon; Senthil, Natesan; Liu, Shengyi; Yang, Tae-Jin

    2014-06-20

    MITE, TRIM and SINEs are miniature form transposable elements (mTEs) that are ubiquitous and dispersed throughout entire plant genomes. Tens of thousands of members cause insertion polymorphism at both the inter- and intra- species level. Therefore, mTEs are valuable targets and resources for development of markers that can be utilized for breeding, genetic diversity and genome evolution studies. Taking advantage of the completely sequenced genomes of Brassica rapa and B. oleracea, characterization of mTEs and building a curated database are prerequisite to extending their utilization for genomics and applied fields in Brassica crops. We have developed BrassicaTED as a unique web portal containing detailed characterization information for mTEs of Brassica species. At present, BrassicaTED has datasets for 41 mTE families, including 5894 and 6026 members from 20 MITE families, 1393 and 1639 members from 5 TRIM families, 1270 and 2364 members from 16 SINE families in B. rapa and B. oleracea, respectively. BrassicaTED offers different sections to browse structural and positional characteristics for every mTE family. In addition, we have added data on 289 MITE insertion polymorphisms from a survey of seven Brassica relatives. Genes with internal mTE insertions are shown with detailed gene annotation and microarray-based comparative gene expression data in comparison with their paralogs in the triplicated B. rapa genome. This database also includes a novel tool, K BLAST (Karyotype BLAST), for clear visualization of the locations for each member in the B. rapa and B. oleracea pseudo-genome sequences. BrassicaTED is a newly developed database of information regarding the characteristics and potential utility of mTEs including MITE, TRIM and SINEs in B. rapa and B. oleracea. The database will promote the development of desirable mTE-based markers, which can be utilized for genomics and breeding in Brassica species. BrassicaTED will be a valuable repository for scientists and breeders, promoting efficient research on Brassica species. BrassicaTED can be accessed at http://im-crop.snu.ac.kr/BrassicaTED/index.php.

  20. Relationships among calpastatin single nucleotide polymorphisms, calpastatin expression and tenderness in pork longissimus

    USDA-ARS's Scientific Manuscript database

    Genome scans in the pig have identified a region on chromosome 2 (SSC2) associated with tenderness. Calpastatin is a likely positional candidate gene in this region because of its inhibitory role in the calpain system that is involved in postmortem tenderization. Novel single nucleotide polymorphism...

  1. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

    USDA-ARS's Scientific Manuscript database

    Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...

  2. Large meta-analysis of genome-wide association studies identifies five loci for lean body mass.

    PubMed

    Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang; Yerges-Armstrong, Laura M; Chou, Wen-Chi; Stolk, Lisette; Livshits, Gregory; Broer, Linda; Johnson, Toby; Koller, Daniel L; Kutalik, Zoltán; Luan, Jian'an; Malkin, Ida; Ried, Janina S; Smith, Albert V; Thorleifsson, Gudmar; Vandenput, Liesbeth; Hua Zhao, Jing; Zhang, Weihua; Aghdassi, Ali; Åkesson, Kristina; Amin, Najaf; Baier, Leslie J; Barroso, Inês; Bennett, David A; Bertram, Lars; Biffar, Rainer; Bochud, Murielle; Boehnke, Michael; Borecki, Ingrid B; Buchman, Aron S; Byberg, Liisa; Campbell, Harry; Campos Obanda, Natalia; Cauley, Jane A; Cawthon, Peggy M; Cederberg, Henna; Chen, Zhao; Cho, Nam H; Jin Choi, Hyung; Claussnitzer, Melina; Collins, Francis; Cummings, Steven R; De Jager, Philip L; Demuth, Ilja; Dhonukshe-Rutten, Rosalie A M; Diatchenko, Luda; Eiriksdottir, Gudny; Enneman, Anke W; Erdos, Mike; Eriksson, Johan G; Eriksson, Joel; Estrada, Karol; Evans, Daniel S; Feitosa, Mary F; Fu, Mao; Garcia, Melissa; Gieger, Christian; Girke, Thomas; Glazer, Nicole L; Grallert, Harald; Grewal, Jagvir; Han, Bok-Ghee; Hanson, Robert L; Hayward, Caroline; Hofman, Albert; Hoffman, Eric P; Homuth, Georg; Hsueh, Wen-Chi; Hubal, Monica J; Hubbard, Alan; Huffman, Kim M; Husted, Lise B; Illig, Thomas; Ingelsson, Erik; Ittermann, Till; Jansson, John-Olov; Jordan, Joanne M; Jula, Antti; Karlsson, Magnus; Khaw, Kay-Tee; Kilpeläinen, Tuomas O; Klopp, Norman; Kloth, Jacqueline S L; Koistinen, Heikki A; Kraus, William E; Kritchevsky, Stephen; Kuulasmaa, Teemu; Kuusisto, Johanna; Laakso, Markku; Lahti, Jari; Lang, Thomas; Langdahl, Bente L; Launer, Lenore J; Lee, Jong-Young; Lerch, Markus M; Lewis, Joshua R; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Liu, Tian; Liu, Youfang; Ljunggren, Östen; Lorentzon, Mattias; Luben, Robert N; Maixner, William; McGuigan, Fiona E; Medina-Gomez, Carolina; Meitinger, Thomas; Melhus, Håkan; Mellström, Dan; Melov, Simon; Michaëlsson, Karl; Mitchell, Braxton D; Morris, Andrew P; Mosekilde, 
Leif; Newman, Anne; Nielson, Carrie M; O'Connell, Jeffrey R; Oostra, Ben A; Orwoll, Eric S; Palotie, Aarno; Parker, Stephen C J; Peacock, Munro; Perola, Markus; Peters, Annette; Polasek, Ozren; Prince, Richard L; Räikkönen, Katri; Ralston, Stuart H; Ripatti, Samuli; Robbins, John A; Rotter, Jerome I; Rudan, Igor; Salomaa, Veikko; Satterfield, Suzanne; Schadt, Eric E; Schipf, Sabine; Scott, Laura; Sehmi, Joban; Shen, Jian; Soo Shin, Chan; Sigurdsson, Gunnar; Smith, Shad; Soranzo, Nicole; Stančáková, Alena; Steinhagen-Thiessen, Elisabeth; Streeten, Elizabeth A; Styrkarsdottir, Unnur; Swart, Karin M A; Tan, Sian-Tsung; Tarnopolsky, Mark A; Thompson, Patricia; Thomson, Cynthia A; Thorsteinsdottir, Unnur; Tikkanen, Emmi; Tranah, Gregory J; Tuomilehto, Jaakko; van Schoor, Natasja M; Verma, Arjun; Vollenweider, Peter; Völzke, Henry; Wactawski-Wende, Jean; Walker, Mark; Weedon, Michael N; Welch, Ryan; Wichmann, H-Erich; Widen, Elisabeth; Williams, Frances M K; Wilson, James F; Wright, Nicole C; Xie, Weijia; Yu, Lei; Zhou, Yanhua; Chambers, John C; Döring, Angela; van Duijn, Cornelia M; Econs, Michael J; Gudnason, Vilmundur; Kooner, Jaspal S; Psaty, Bruce M; Spector, Timothy D; Stefansson, Kari; Rivadeneira, Fernando; Uitterlinden, André G; Wareham, Nicholas J; Ossowski, Vicky; Waterworth, Dawn; Loos, Ruth J F; Karasik, David; Harris, Tamara B; Ohlsson, Claes; Kiel, Douglas P

    2017-07-19

    Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10^-8) or suggestively genome wide (p < 2.3 × 10^-6). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass. Lean body mass is a highly heritable trait and is associated with various health conditions. Here, Kiel and colleagues perform a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.

  3. [Mitochondrial DNA polymorphisms shared between modern humans and neanderthals: adaptive convergence or evidence for interspecific hybridization?].

    PubMed

    Maliarchuk, B A

    2013-09-01

    An analysis of the variability of the nucleotide sequences in the mitochondrial genome of modern humans, neanderthals, Denisovans, and other primates has shown that there are shared polymorphisms at positions 2758 and 7146 between modern Homo sapiens (in phylogenetic cluster L2'3'4'5'6) and Homo neanderthalensis (in the group of European neanderthals younger than 48000 years). It is suggested that the convergence may be due to adaptive changes in the mitochondrial genomes of modern humans and neanderthals or interspecific hybridization associated with mtDNA recombination.

  4. Skin score correlates with global DNA methylation and GSTO1 A140D polymorphism in arsenic-affected population of Eastern India.

    PubMed

    Majumder, Moumita; Dasgupta, Uma B; Guha Mazumder, D N; Das, Nilansu

    2017-07-01

    Arsenic is a potent environmental toxicant causing serious public health concerns in India, Bangladesh and other parts of the world. Gene- and promoter-specific hypermethylation has been reported in different arsenic-exposed cell lines, whereas whole genome DNA methylation study suggested genomic hypo- and hypermethylation after arsenic exposure in in vitro and in vivo studies. Along with other characteristic biomarkers, arsenic toxicity leads to typical skin lesions. The present study demonstrates significant correlation between severities of skin manifestations with their whole genome DNA methylation status as well as with a particular polymorphism (Ala 140 Asp) status in arsenic metabolizing enzyme Glutathione S-transferase Omega-1 (GSTO1) in arsenic-exposed population of the district of Nadia, West Bengal, India.

  5. Maternal lineages of peach genotypes

    USDA-ARS's Scientific Manuscript database

    Simple sequence repeats (SSRs) in chloroplast genomes are useful markers to determine maternal lineages. The SSR mining results revealed that most chloroplast SSRs among three Prunus chloroplast genomes were conserved in locations and motif types, but polymorphic in motif and/or amplicon lengths. Fi...

  6. Human pigmentation genes under environmental selection

    PubMed Central

    2012-01-01

    Genome-wide association studies and comparative genomics have established major loci and specific polymorphisms affecting human skin, hair and eye color. Environmental changes have had an impact on selected pigmentation genes as populations have expanded into different regions of the globe. PMID:23110848

  7. Failure to Replicate a Genetic Association May Provide Important Clues About Genetic Architecture

    PubMed Central

    Greene, Casey S.; Penrod, Nadia M.; Williams, Scott M.; Moore, Jason H.

    2009-01-01

    Replication has become the gold standard for assessing statistical results from genome-wide association studies. Unfortunately this replication requirement may cause real genetic effects to be missed. A real result can fail to replicate for numerous reasons including inadequate sample size or variability in phenotype definitions across independent samples. In genome-wide association studies the allele frequencies of polymorphisms may differ due to sampling error or population differences. We hypothesize that some statistically significant independent genetic effects may fail to replicate in an independent dataset when allele frequencies differ and the functional polymorphism interacts with one or more other functional polymorphisms. To test this hypothesis, we designed a simulation study in which case-control status was determined by two interacting polymorphisms with heritabilities ranging from 0.025 to 0.4 with replication sample sizes ranging from 400 to 1600 individuals. We show that the power to replicate the statistically significant independent main effect of one polymorphism can drop dramatically with a change of allele frequency of less than 0.1 at a second interacting polymorphism. We also show that differences in allele frequency can result in a reversal of allelic effects where a protective allele becomes a risk factor in replication studies. These results suggest that failure to replicate an independent genetic effect may provide important clues about the complexity of the underlying genetic architecture. We recommend that polymorphisms that fail to replicate be checked for interactions with other polymorphisms, particularly when samples are collected from groups with distinct ethnic backgrounds or different geographic regions. PMID:19503614
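
    The frequency dependence described in this abstract can be illustrated with a small deterministic calculation rather than a full simulation. The sketch below is not the authors' simulation design: the penetrance values, the binary carrier coding, and the function name `marginal_effect_of_a` are all hypothetical, and the two loci are assumed to be independent. It shows how the marginal effect of one polymorphism (A) can change sign when the carrier frequency of an interacting polymorphism (B) shifts by 0.1.

```python
# Hypothetical penetrance table: P(case | A carrier status, B carrier status).
# A raises risk when B is absent and lowers it when B is present (an interaction).
# These numbers are illustrative only and do not come from the cited study.
PENETRANCE = {
    (0, 0): 0.10,
    (1, 0): 0.20,
    (0, 1): 0.20,
    (1, 1): 0.10,
}


def marginal_effect_of_a(freq_b, penetrance):
    """Risk difference between A carriers and non-carriers, averaged over B.

    Assumes A and B are independent, so B carrier frequency is the same
    in A carriers and non-carriers.
    """
    risk_a1 = (1 - freq_b) * penetrance[(1, 0)] + freq_b * penetrance[(1, 1)]
    risk_a0 = (1 - freq_b) * penetrance[(0, 0)] + freq_b * penetrance[(0, 1)]
    return risk_a1 - risk_a0


if __name__ == "__main__":
    for freq_b in (0.45, 0.50, 0.55):
        effect = marginal_effect_of_a(freq_b, PENETRANCE)
        print(f"B carrier frequency {freq_b:.2f}: marginal effect of A = {effect:+.3f}")
```

    Under these assumed penetrances, A looks like a weak risk factor when B carriers are slightly below 50% of the sample and a weak protective factor when they are slightly above it, which is the kind of reversal the abstract reports.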

  8. Rapid genetic and epigenetic alterations under intergeneric genomic shock in newly synthesized Chrysanthemum morifolium x Leucanthemum paludosum hybrids (Asteraceae).

    PubMed

    Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Fang, Weimin; Guan, Zhiyong; Teng, Nianjun; Liao, Yuan; Chen, Fadi

    2014-01-01

    The Asteraceae family is at the forefront of evolution owing to frequent hybridization. Hybridization is associated with the induction of widespread genetic and epigenetic changes and has played an important role in the evolution of many plant taxa. We attempted the intergeneric cross Chrysanthemum morifolium × Leucanthemum paludosum. To make the cross succeed, we had to resort to ovule rescue. DNA profiles of the amphihaploid and amphidiploid were investigated using amplified fragment length polymorphism, sequence-related amplified polymorphism, start codon targeted polymorphism, and methylation-sensitive amplification polymorphism (MSAP). Hybridization induced rapid changes at the genetic and the epigenetic levels. The genetic changes mainly involved loss of parental fragments and gain of novel fragments, and some eliminated sequences possibly came from the noncoding region of L. paludosum. The MSAP analysis indicated that the level of DNA methylation was lower in the amphiploid (∼45%) than in the parental lines (51.5-50.6%), whereas it increased after amphidiploid formation. Events associated with intergeneric genomic shock were a feature of the C. morifolium × L. paludosum hybrid, given that the genetic relationship between the parental species is relatively distant. Our results provide genetic and epigenetic evidence for understanding genomic shock in wide crosses between species in Asteraceae and suggest a need to expand our current evolutionary framework to encompass a genetic/epigenetic dimension when seeking to understand wide crosses.

  9. Evidence for multiple interspecific hybridization in Saccharomyces sensu stricto species.

    PubMed

    de Barros Lopes, Miguel; Bellon, Jennifer R; Shirley, Neil J; Ganter, Philip F

    2002-01-01

    Fluorescent amplified fragment length polymorphism analysis demonstrates a high level of gene exchange between Saccharomyces sensu stricto species, with some strains having undergone multiple interspecific hybridization events with subsequent changes in genome complexity. Two lager strains were shown to be hybrids between Saccharomyces cerevisiae and the alloploid species Saccharomyces pastorianus. The genome structure of CBS 380(T), the type strain of Saccharomyces bayanus, is also consistent with S. pastorianus gene transfer. The results indicate that the cider yeast, CID1, possesses nuclear DNA from three separate species. Mating experiments show that there are no barriers to interspecific conjugation of haploid cells. Furthermore, the allopolyploid strains were able to undergo further hybridizations with other Saccharomyces sensu stricto yeasts. These results demonstrate that introgression between the Saccharomyces sensu stricto species is likely.

  10. Genome Evolution and Meiotic Maps by Massively Parallel DNA Sequencing: Spotted Gar, an Outgroup for the Teleost Genome Duplication

    PubMed Central

    Amores, Angel; Catchen, Julian; Ferrara, Allyse; Fontenot, Quenton; Postlethwait, John H.

    2011-01-01

    Genomic resources for hundreds of species of evolutionary, agricultural, economic, and medical importance are unavailable due to the expense of well-assembled genome sequences and difficulties with multigenerational studies. Teleost fish provide many models for human disease but possess anciently duplicated genomes that sometimes obfuscate connectivity. Genomic information representing a fish lineage that diverged before the teleost genome duplication (TGD) would provide an outgroup for exploring the mechanisms of evolution after whole-genome duplication. We exploited massively parallel DNA sequencing to develop meiotic maps with thrift and speed by genotyping F1 offspring of a single female and a single male spotted gar (Lepisosteus oculatus) collected directly from nature utilizing only polymorphisms existing in these two wild individuals. Using Stacks, software that automates the calling of genotypes from polymorphisms assayed by Illumina sequencing, we constructed a map containing 8406 markers. RNA-seq on two map-cross larvae provided a reference transcriptome that identified nearly 1000 mapped protein-coding markers and allowed genome-wide analysis of conserved synteny. Results showed that the gar lineage diverged from teleosts before the TGD and its genome is organized more similarly to that of humans than teleosts. Thus, spotted gar provides a critical link between medical models in teleost fish, to which gar is biologically similar, and humans, to which gar is genomically similar. Application of our F1 dense mapping strategy to species with no prior genome information promises to facilitate comparative genomics and provide a scaffold for ordering the numerous contigs arising from next generation genome sequencing. PMID:21828280

  11. Genomic diversity and population structure of three autochthonous Greek sheep breeds assessed with genome-wide DNA arrays.

    PubMed

    Michailidou, S; Tsangaris, G; Fthenakis, G C; Tzora, A; Skoufos, I; Karkabounas, S C; Banos, G; Argiriou, A; Arsenos, G

    2018-06-01

    In the present study, genome-wide genotyping was applied to characterize the genetic diversity and population structure of three autochthonous Greek breeds: Boutsko, Karagouniko and Chios. Dairy sheep are among the most significant livestock species in Greece, numbering approximately 9 million animals, which are characterized by large phenotypic variation and reared under various farming systems. A total of 96 animals were genotyped with the Illumina OvineSNP50K microarray beadchip, to study the population structure of the breeds and develop a specialized panel of single-nucleotide polymorphisms (SNPs) that could distinguish one breed from the others. Quality control on the dataset resulted in 46,125 SNPs, which were used to evaluate the genetic structure of the breeds. Population structure was assessed through principal component analysis (PCA) and admixture analysis, whereas inbreeding was estimated based on runs of homozygosity (ROHs) coefficients, genomic relationship matrix inbreeding coefficients (F_GRM) and patterns of linkage disequilibrium (LD). Associations between SNPs and breeds were analyzed with different inheritance models, to identify SNPs that distinguish among the breeds. Results showed high levels of genetic heterogeneity in the three breeds. Genetic distances among breeds were modest, despite their different ancestries. Chios and Karagouniko breeds were more genetically related to each other than to Boutsko. Analysis revealed 3802 candidate SNPs that can be used to identify two-breed crosses and purebred animals. The present study provides, for the first time, data on the genetic background of three Greek indigenous dairy sheep breeds as well as a specialized marker panel that can be applied for traceability purposes as well as targeted genetic improvement schemes and conservation programs.
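
    A runs-of-homozygosity inbreeding coefficient of the kind mentioned in this abstract is commonly computed as the fraction of the genome covered by ROHs. The sketch below is a simplified illustration, not the procedure used in the study: the function names `find_roh` and `f_roh`, the genotype coding (0/1/2 alternate-allele counts, with 1 meaning heterozygous), and the thresholds `min_snps` and `min_length` are all assumptions. Production tools also allow for occasional heterozygous or missing calls and for SNP density, which are ignored here.

```python
def find_roh(positions, genotypes, min_snps=3, min_length=1000):
    """Return (start, end) positions of runs of consecutive homozygous SNPs.

    positions: SNP coordinates in ascending order on one chromosome.
    genotypes: alternate-allele counts (0, 1, 2); 1 is heterozygous.
    A run is kept only if it has at least min_snps SNPs and spans
    at least min_length base pairs (both thresholds are hypothetical).
    """
    runs = []
    start = None
    last = None
    count = 0
    for pos, genotype in zip(positions, genotypes):
        if genotype != 1:
            if start is None:
                start = pos
                count = 0
            count += 1
            last = pos
        else:
            # A heterozygous call closes the current run, if any.
            if start is not None and count >= min_snps and last - start >= min_length:
                runs.append((start, last))
            start = None
    # Close a run that extends to the final SNP.
    if start is not None and count >= min_snps and last - start >= min_length:
        runs.append((start, last))
    return runs


def f_roh(runs, genome_length):
    """Fraction of the analysed genome length covered by the given runs."""
    return sum(end - start for start, end in runs) / genome_length


if __name__ == "__main__":
    positions = [0, 1000, 2000, 3000, 4000, 5000, 6000]
    genotypes = [0, 0, 2, 1, 2, 2, 0]
    runs = find_roh(positions, genotypes)
    print(runs, f_roh(runs, genome_length=10000))
```

    In this toy input the single heterozygous SNP at position 3000 splits the chromosome into two runs, each spanning 2000 bp.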

  12. Whole-Genome Sequencing of Recent Listeria monocytogenes Isolates from Germany Reveals Population Structure and Disease Clusters.

    PubMed

    Halbedel, Sven; Prager, Rita; Fuchs, Stephan; Trost, Eva; Werner, Guido; Flieger, Antje

    2018-06-01

    Listeria monocytogenes causes foodborne outbreaks with high mortality. For improvement of outbreak cluster detection, the German consiliary laboratory for listeriosis implemented whole-genome sequencing (WGS) in 2015. A total of 424 human L. monocytogenes isolates collected in 2007 to 2017 were subjected to WGS and core-genome multilocus sequence typing (cgMLST). cgMLST grouped the isolates into 38 complexes, reflecting 4 known and 34 unknown disease clusters. Most of these complexes were confirmed by single nucleotide polymorphism (SNP) calling, but some were further differentiated. Interestingly, several cgMLST cluster types were further subtyped by pulsed-field gel electrophoresis, partly due to phage insertions in the accessory genome. Our results highlight the usefulness of cgMLST for routine cluster detection but also show that cgMLST complexes require validation by methods providing higher typing resolution. Twelve cgMLST clusters included recent cases, suggesting activity of the source. Therefore, the cgMLST nomenclature data presented here may support future public health actions. Copyright © 2018 American Society for Microbiology.

  13. The Role of Constitutional Copy Number Variants in Breast Cancer

    PubMed Central

    Walker, Logan C.; Wiggins, George A.R.; Pearson, John F.

    2015-01-01

    Constitutional copy number variants (CNVs) include inherited and de novo deviations from a diploid state at a defined genomic region. These variants contribute significantly to genetic variation and disease in humans, including breast cancer susceptibility. Identification of genetic risk factors for breast cancer in recent years has been dominated by the use of genome-wide technologies, such as single nucleotide polymorphism (SNP)-arrays, with a significant focus on single nucleotide variants. To date, these large datasets have been underutilised for generating genome-wide CNV profiles despite offering a massive resource for assessing the contribution of these structural variants to breast cancer risk. Technical challenges remain in determining the location and distribution of CNVs across the human genome due to the accuracy of computational prediction algorithms and resolution of the array data. Moreover, better methods are required for interpreting the functional effect of newly discovered CNVs. In this review, we explore current and future application of SNP array technology to assess rare and common CNVs in association with breast cancer risk in humans. PMID:27600231

  14. Genetic Diversity of Blumeria graminis f. sp. hordei in Central Europe and Its Comparison with Australian Population

    PubMed Central

    Komínková, Eva; Dreiseitl, Antonín; Malečková, Eva; Doležel, Jaroslav

    2016-01-01

    Population surveys of Blumeria graminis f. sp. hordei (Bgh), a causal agent of more than 50% of barley fungal infections in the Czech Republic, have been traditionally based on virulence tests, at times supplemented with non-specific Restriction fragment length polymorphism or Random amplified polymorphic DNA markers. A genomic sequence of Bgh, which has become available recently, enables identification of potential markers suitable for population genetics studies. Two major strategies relying on transposable elements and microsatellites were employed in this work to develop a set of Repeat junction markers, Simple sequence repeat and Single nucleotide polymorphism markers. The resolving power of the new panel of markers comprising 33 polymorphisms was demonstrated by a phylogenetic analysis of 158 Bgh isolates. A core set of 97 Czech isolates was compared to a set of 50 Australian isolates on the background of 11 diverse isolates collected throughout the world. 73.2% of Czech isolates were found to be genetically unique. The extreme diversity of this collection was in strong contrast with the uniformity of the Australian one. This work paves the way for studies of population structure and dynamics based on genetic variability among different Bgh isolates originating from geographically limited regions. PMID:27875588

  15. Molecular phylogeny and SNP variation of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) derived from genome sequences.

    PubMed

    Cronin, Matthew A; Rincon, Gonzalo; Meredith, Robert W; MacNeil, Michael D; Islas-Trejo, Alma; Cánovas, Angela; Medrano, Juan F

    2014-01-01

    We assessed the relationships of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) with high throughput genomic sequencing data with an average coverage of 25× for each species. A total of 1.4 billion 100-bp paired-end reads were assembled using the polar bear and annotated giant panda (Ailuropoda melanoleuca) genome sequences as references. We identified 13.8 million single nucleotide polymorphisms (SNP) in the 3 species aligned to the polar bear genome. These data indicate that polar bears and brown bears share more SNP with each other than either does with black bears. Concatenation and coalescence-based analysis of consensus sequences of approximately 1 million base pairs of ultraconserved elements in the nuclear genome resulted in a phylogeny with black bears as the sister group to brown and polar bears, and all brown bears are in a separate clade from polar bears. Genotypes for 162 SNP loci of 336 bears from Alaska and Montana showed that the species are genetically differentiated and there is geographic population structure of brown and black bears but not polar bears.

  16. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat.

    PubMed

    Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C

    2014-06-01

    Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection.

  17. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat

    PubMed Central

    Zhao, Y; Mette, M F; Gowda, M; Longin, C F H; Reif, J C

    2014-01-01

    Based on data from field trials with a large collection of 135 elite winter wheat inbred lines and 1604 F1 hybrids derived from them, we compared the accuracy of prediction of marker-assisted selection and current genomic selection approaches for the model traits heading time and plant height in a cross-validation approach. For heading time, the high accuracy seen with marker-assisted selection severely dropped with genomic selection approaches RR-BLUP (ridge regression best linear unbiased prediction) and BayesCπ, whereas for plant height, accuracy was low with marker-assisted selection as well as RR-BLUP and BayesCπ. Differences in the linkage disequilibrium structure of the functional and single-nucleotide polymorphism markers relevant for the two traits were identified in a simulation study as a likely explanation for the different trends in accuracies of prediction. A new genomic selection approach, weighted best linear unbiased prediction (W-BLUP), designed to treat the effects of known functional markers more appropriately, proved to increase the accuracy of prediction for both traits and thus closes the gap between marker-assisted and genomic selection. PMID:24518889

  18. A Genetic Map Between Gossypium hirsutum and the Brazilian Endemic G. mustelinum and Its Application to QTL Mapping

    PubMed Central

    Wang, Baohua; Liu, Limei; Zhang, Dong; Zhuang, Zhimin; Guo, Hui; Qiao, Xin; Wei, Lijuan; Rong, Junkang; May, O. Lloyd; Paterson, Andrew H.; Chee, Peng W.

    2016-01-01

    Among the seven tetraploid cotton species, little is known about transmission genetics and genome organization in Gossypium mustelinum, the species most distant from the source of most cultivated cotton, G. hirsutum. In this research, an F2 population was developed from an interspecific cross between G. hirsutum and G. mustelinum (HM). A genetic linkage map was constructed mainly using simple sequence repeat (SSRs) and restriction fragment length polymorphism (RFLP) DNA markers. The arrangements of most genetic loci along the HM chromosomes were identical to those of other tetraploid cotton species. However, both major and minor structural rearrangements were also observed, for which we propose a parsimony-based model for structural divergence of tetraploid cottons from common ancestors. Sequences of mapped markers were used for alignment with the 26 scaffolds of the G. hirsutum draft genome, and showed high consistency. Quantitative trait locus (QTL) mapping of fiber elongation in advanced backcross populations derived from the same parents demonstrated the value of the HM map. The HM map will serve as a valuable resource for QTL mapping and introgression of G. mustelinum alleles into G. hirsutum, and help clarify evolutionary relationships between the tetraploid cotton genomes. PMID:27172208

  19. Chromosomal translocations and palindromic AT-rich repeats

    PubMed Central

    Kato, Takema; Kurahashi, Hiroki; Emanuel, Beverly S.

    2012-01-01

    Repetitive DNA sequences constitute 30% of the human genome, and are often sites of genomic rearrangement. Recently, it has been found that several constitutional translocations, especially those that involve chromosome 22, take place utilizing palindromic sequences on 22q11 and on the partner chromosome. Analysis of translocation junction fragments shows that the breakpoints of such palindrome-mediated translocations are localized at the center of palindromic AT-rich repeats (PATRRs). The presence of PATRRs at the breakpoints indicates that a palindrome-mediated mechanism is involved in the generation of these constitutional translocations. Identification of these PATRR-mediated translocations suggests a universal pathway for gross chromosomal rearrangement in the human genome. De novo occurrences of PATRR-mediated translocations can be detected by PCR in normal sperm samples but not somatic cells. Polymorphisms of various PATRRs influence their propensity for adopting a secondary structure, which in turn affects de novo translocation frequency. We propose that the PATRRs form an unstable secondary structure, which leads to double-strand breaks at the center of the PATRR. The double-strand breaks appear to be followed by a non-homologous end-joining repair pathway, ultimately leading to the translocations. This review considers recent findings concerning the mechanism of meiosis-specific, PATRR-mediated translocations. PMID:22402448

  20. Effect of gene polymorphisms on periodontal diseases

    PubMed Central

    Tarannum, Fouzia; Faizuddin, Mohamed

    2012-01-01

    Periodontal diseases are inflammatory diseases of the supporting structures of the tooth. They result in the destruction of the supporting structures, and most of the destructive processes involved are host derived. The processes leading to destruction and regeneration of the destroyed tissues are of great interest to both researchers and clinicians. The selective susceptibility of subjects to periodontitis has remained an enigma, and a wide variety of risk factors have been implicated in the manifestation and progression of periodontitis. Genetic factors have been a new addition to the list of risk factors for periodontal diseases. With the availability of the human genome sequence and the knowledge of the complement of the genes, it should be possible to identify the metabolic pathways involved in periodontal destruction and regeneration. Most forms of periodontitis represent a life-long account of interactions between the genome, behaviour, and environment. The current practical utility of genetic knowledge in periodontitis is limited. The information contained within the human genome can potentially lead to a better understanding of the control mechanisms modulating the production of inflammatory mediators as well as provide potential therapeutic targets for periodontal disease. Allelic variants at multiple gene loci probably influence periodontitis susceptibility. PMID:22754216

  1. Association of genome-wide variation with the risk of incident heart failure in adults of European and African ancestry: a prospective meta-analysis from the cohorts for heart and aging research in genomic epidemiology (CHARGE) consortium.

    PubMed

    Smith, Nicholas L; Felix, Janine F; Morrison, Alanna C; Demissie, Serkalem; Glazer, Nicole L; Loehr, Laura R; Cupples, L Adrienne; Dehghan, Abbas; Lumley, Thomas; Rosamond, Wayne D; Lieb, Wolfgang; Rivadeneira, Fernando; Bis, Joshua C; Folsom, Aaron R; Benjamin, Emelia; Aulchenko, Yurii S; Haritunians, Talin; Couper, David; Murabito, Joanne; Wang, Ying A; Stricker, Bruno H; Gottdiener, John S; Chang, Patricia P; Wang, Thomas J; Rice, Kenneth M; Hofman, Albert; Heckbert, Susan R; Fox, Ervin R; O'Donnell, Christopher J; Uitterlinden, Andre G; Rotter, Jerome I; Willerson, James T; Levy, Daniel; van Duijn, Cornelia M; Psaty, Bruce M; Witteman, Jacqueline C M; Boerwinkle, Eric; Vasan, Ramachandran S

    2010-06-01

    Although genetic factors contribute to the onset of heart failure (HF), no large-scale genome-wide investigation of HF risk has been published to date. We have investigated the association of 2,478,304 single-nucleotide polymorphisms with incident HF by meta-analyzing data from 4 community-based prospective cohorts: the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, the Framingham Heart Study, and the Rotterdam Study. Eligible participants for these analyses were of European or African ancestry and free of clinical HF at baseline. Each study independently conducted genome-wide scans and imputed data to the approximately 2.5 million single-nucleotide polymorphisms in HapMap. Within each study, Cox proportional hazards regression models provided age- and sex-adjusted estimates of the association between each variant and time to incident HF. Fixed-effect meta-analyses combined results for each single-nucleotide polymorphism from the 4 cohorts to produce an overall association estimate and P value. A genome-wide significance P value threshold was set a priori at 5.0×10⁻⁷. During a mean follow-up of 11.5 years, 2526 incident HF events (12%) occurred in 20 926 European-ancestry participants. The meta-analysis identified a genome-wide significant locus at chromosomal position 15q22 (1.4×10⁻⁸), which was 58.8 kb from USP3. Among 2895 African-ancestry participants, 466 incident HF events (16%) occurred during a mean follow-up of 13.7 years. One genome-wide significant locus was identified at 12q14 (6.7×10⁻⁸), which was 6.3 kb from LRIG3. We identified 2 loci that were associated with incident HF and exceeded genome-wide significance. The findings merit replication in other community-based settings of incident HF.
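
    The fixed-effect combination step described in the abstract above can be sketched as follows. This is only an illustration: the abstract does not state the weighting scheme, so inverse-variance weighting (a common choice) is assumed, and the function name `fixed_effect_meta` and all input numbers are hypothetical, not data from the study.

```python
import math

# Threshold quoted in the abstract (set a priori by the authors).
GENOME_WIDE_ALPHA = 5.0e-7


def fixed_effect_meta(betas, ses):
    """Combine per-cohort log hazard ratios with inverse-variance weights.

    ASSUMPTION: inverse-variance weighting; the abstract only says
    "fixed-effect meta-analyses".
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-cohort estimates for one variant (four cohorts).
example_betas = [0.40, 0.35, 0.45, 0.38]
example_ses = [0.15, 0.14, 0.16, 0.15]
beta, se, z, p = fixed_effect_meta(example_betas, example_ses)
print(f"pooled beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.2e}")
print("genome-wide significant:", p < GENOME_WIDE_ALPHA)
```

    With equal standard errors the pooled estimate reduces to the plain mean of the cohort estimates, which is a quick sanity check on the weighting.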

  2. DNA octaplex formation with an I-motif of water-mediated A-quartets: reinterpretation of the crystal structure of d(GCGAAAGC).

    PubMed

    Sato, Yoshiteru; Mitomi, Kenta; Sunami, Tomoko; Kondo, Jiro; Takénaka, Akio

    2006-12-01

    The crystal structure of the tetragonal form of d(gcGAAAgc) has been revised and reasonably refined including the disordered residues. The two DNA strands form a base-intercalated duplex, and the four duplexes are assembled according to the crystallographic 222 symmetry to form an octaplex. In the central region, the eight strands are associated by I-motif of double A-quartets. Furthermore, eight hydrated-magnesium cations link the four duplexes to support the octaplex formation. Based on these structural features, a proposal that folding of d(GAAA)n, found in the non-coding region of genomes, into an octaplex can induce slippage during replication to facilitate length polymorphism is presented.

  3. MC1R Genotype and Plumage Colouration in the Zebra Finch (Taeniopygia guttata): Population Structure Generates Artefactual Associations

    PubMed Central

    Hoffman, Joseph I.; Krause, E. Tobias; Lehmann, Katrin; Krüger, Oliver

    2014-01-01

    Polymorphisms at the melanocortin-1 receptor (MC1R) gene have been linked to coloration in many vertebrate species. However, the potentially confounding influence of population structure has rarely been controlled for. We explored the role of the MC1R in a model avian system by sequencing the coding region in 162 zebra finches comprising 79 wild type and 83 white individuals from five stocks. Allelic counts differed significantly between the two plumage morphs at multiple segregating sites, but these were mostly synonymous. To provide a control, the birds were genotyped at eight microsatellites and subjected to Bayesian cluster analysis, revealing two distinct groups. We therefore crossed wild type with white individuals and backcrossed the F1s with white birds. No significant associations were detected in the resulting offspring, suggesting that our original findings were a byproduct of genome-wide divergence. Our results are consistent with a previous study that found no association between MC1R polymorphism and plumage coloration in leaf warblers. They also contribute towards a growing body of evidence suggesting that care should be taken to quantify, and where necessary control for, population structure in association studies. PMID:24489736

  4. Development and Evaluation of a 9K SNP Array for Peach by Internationally Coordinated SNP Detection and Validation in Breeding Germplasm

    PubMed Central

    Scalabrin, Simone; Gilmore, Barbara; Lawley, Cynthia T.; Gasic, Ksenija; Micheletti, Diego; Rosyara, Umesh R.; Cattonaro, Federica; Vendramin, Elisa; Main, Dorrie; Aramini, Valeria; Blas, Andrea L.; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Troggio, Michela; Sosinski, Bryon; Aranzana, Maria José; Arús, Pere; Iezzoni, Amy; Morgante, Michele; Peace, Cameron

    2012-01-01

    Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species. PMID:22536421
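
    The headline figures in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract; the "implied span" is simply the SNP count multiplied by the average spacing, a rough derived figure and not a genome size reported by the authors. Variable names are illustrative.

```python
# Figures quoted in the abstract.
snps_on_array = 8_144      # SNPs on the IPSC peach SNP array v1
polymorphic_snps = 6_869   # SNPs found polymorphic across 709 accessions
avg_spacing_kb = 26.7      # reported average spacing between SNPs

# Share of array SNPs that were polymorphic, rounded as in the abstract.
pct_polymorphic = round(100 * polymorphic_snps / snps_on_array, 1)

# Rough span covered if every SNP sat exactly one average spacing apart.
implied_span_mb = snps_on_array * avg_spacing_kb / 1000

print(f"polymorphic: {pct_polymorphic}%")
print(f"implied span: {implied_span_mb:.1f} Mb")
```

    The computed percentage matches the 84.3% stated in the abstract.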

  5. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.

    PubMed

    Zhang, Wei; Qi, Weihong; Albert, Thomas J; Motiwala, Alifiya S; Alland, David; Hyytia-Trees, Eija K; Ribot, Efrain M; Fields, Patricia I; Whittam, Thomas S; Swaminathan, Bala

    2006-06-01

    Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7×10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary beta-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should provide new insights into the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens.

  6. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms

    PubMed Central

    Zhang, Wei; Qi, Weihong; Albert, Thomas J.; Motiwala, Alifiya S.; Alland, David; Hyytia-Trees, Eija K.; Ribot, Efrain M.; Fields, Patricia I.; Whittam, Thomas S.; Swaminathan, Bala

    2006-01-01

    Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary β-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should provide new insights into the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens. PMID:16606700

  7. Genome-Wide Discovery and Deployment of Insertions and Deletions Markers Provided Greater Insights on Species, Genomes, and Sections Relationships in the Genus Arachis.

    PubMed

    Vishwakarma, Manish K; Kale, Sandip M; Sriswathi, Manda; Naresh, Talari; Shasidhar, Yaduru; Garg, Vanika; Pandey, Manish K; Varshney, Rajeev K

    2017-01-01

    Small insertions and deletions (InDels) are the second most prevalent and the most abundant structural variations in plant genomes. In order to deploy these genetic variations for genetic analysis in genus Arachis, we conducted comparative analysis of the draft genome assemblies of both the diploid progenitor species of cultivated tetraploid groundnut (Arachis hypogaea L.), i.e., Arachis duranensis (A subgenome) and Arachis ipaënsis (B subgenome), and identified 515,223 InDels. These InDels include 269,973 insertions identified in A. ipaënsis against A. duranensis and 245,250 deletions in A. duranensis against A. ipaënsis. The majority of the InDels were of single bp (43.7%) and 2-10 bp (39.9%), while the remaining were >10 bp (16.4%). Phylogenetic analysis using genotyping data for 86 (40.19%) polymorphic markers grouped 96 diverse Arachis accessions into eight clusters, mostly by the affinity of their genome. This study also provided evidence for the existence of a "K" genome that is distinct from both the "A" and "B" genomes but more similar to the "B" genome. The complete homology between A. monticola and A. hypogaea tetraploid taxa showed a very similar genome composition. The above analysis has provided greater insights into the phylogenetic relationship among accessions, genomes, subspecies and sections. These InDel markers are a very useful resource for the groundnut research community for genetic analysis and breeding applications.

  8. Genome Analysis of the Domestic Dog (Korean Jindo) by Massively Parallel Sequencing

    PubMed Central

    Kim, Ryong Nam; Kim, Dae-Soo; Choi, Sang-Haeng; Yoon, Byoung-Ha; Kang, Aram; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Jong-Joo; Ha, Ji-Hong; Toyoda, Atsushi; Fujiyama, Asao; Kim, Aeri; Kim, Min-Young; Park, Kun-Hyang; Lee, Kang Seon; Park, Hong-Seog

    2012-01-01

    Although pioneering sequencing projects have shed light on the boxer and poodle genomes, a number of challenges need to be met before the sequencing and annotation of the dog genome can be considered complete. Here, we present the DNA sequence of the Jindo dog genome, sequenced to 45-fold average coverage using Illumina massively parallel sequencing technology. A comparison of the sequence to the reference boxer genome led to the identification of 4 675 437 single nucleotide polymorphisms (SNPs, including 3 346 058 novel SNPs), 71 642 indels and 8131 structural variations. Of these, 339 non-synonymous SNPs and 3 indels are located within coding sequences (CDS). In particular, 3 non-synonymous SNPs and a 26-bp deletion occur in the TCOF1 locus, implying that the difference observed in craniofacial morphology between Jindo and boxer dogs might be influenced by those variations. Through the annotation of the Jindo olfactory receptor gene family, we found 2 unique olfactory receptor genes and 236 olfactory receptor genes harbouring non-synonymous homozygous SNPs that are likely to affect smelling capability. In addition, we determined the DNA sequence of the Jindo dog mitochondrial genome and identified Jindo dog-specific mtDNA genotypes. These Jindo genome data advance our understanding of dog genomic architecture and will be a very valuable resource for investigating not only dog genetics and genomics but also human and dog disease genetics and comparative genomics. PMID:22474061

  9. Genome-Wide Discovery and Deployment of Insertions and Deletions Markers Provided Greater Insights on Species, Genomes, and Sections Relationships in the Genus Arachis

    PubMed Central

    Vishwakarma, Manish K.; Kale, Sandip M.; Sriswathi, Manda; Naresh, Talari; Shasidhar, Yaduru; Garg, Vanika; Pandey, Manish K.; Varshney, Rajeev K.

    2017-01-01

    Small insertions and deletions (InDels) are the second most prevalent and the most abundant structural variations in plant genomes. In order to deploy these genetic variations for genetic analysis in genus Arachis, we conducted comparative analysis of the draft genome assemblies of both the diploid progenitor species of cultivated tetraploid groundnut (Arachis hypogaea L.), i.e., Arachis duranensis (A subgenome) and Arachis ipaënsis (B subgenome), and identified 515,223 InDels. These InDels include 269,973 insertions identified in A. ipaënsis against A. duranensis and 245,250 deletions in A. duranensis against A. ipaënsis. The majority of the InDels were of single bp (43.7%) and 2–10 bp (39.9%), while the remaining were >10 bp (16.4%). Phylogenetic analysis using genotyping data for 86 (40.19%) polymorphic markers grouped 96 diverse Arachis accessions into eight clusters, mostly by the affinity of their genome. This study also provided evidence for the existence of a “K” genome that is distinct from both the “A” and “B” genomes but more similar to the “B” genome. The complete homology between A. monticola and A. hypogaea tetraploid taxa showed a very similar genome composition. The above analysis has provided greater insights into the phylogenetic relationship among accessions, genomes, subspecies and sections. These InDel markers are a very useful resource for the groundnut research community for genetic analysis and breeding applications. PMID:29312366
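
    The InDel counts and size classes quoted in the abstract above are internally consistent, which a short tally makes explicit. The sketch uses only numbers from the abstract; the per-class counts are approximations derived from the rounded percentages, not values reported by the authors, and the variable names are illustrative.

```python
# Counts quoted in the abstract.
insertions = 269_973   # insertions in A. ipaensis against A. duranensis
deletions = 245_250    # deletions in A. duranensis against A. ipaensis
total_indels = insertions + deletions

# Size-class shares quoted in the abstract (percent of all InDels).
size_class_percent = {"1 bp": 43.7, "2-10 bp": 39.9, ">10 bp": 16.4}

# Approximate per-class counts, limited by the one-decimal rounding
# of the published percentages.
approx_counts = {
    label: round(total_indels * pct / 100)
    for label, pct in size_class_percent.items()
}

print("total InDels:", total_indels)
for label, count in approx_counts.items():
    print(f"{label}: ~{count}")
```

    The insertion and deletion counts sum to the 515,223 InDels reported, and the three size classes account for 100% of them.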

  10. Genetic polymorphisms and protein structures in growth hormone, growth hormone receptor, ghrelin, insulin-like growth factor 1 and leptin in Mehraban sheep.

    PubMed

    Bahrami, A; Behzadi, Sh; Miraei-Ashtiani, S R; Roh, S-G; Katoh, K

    2013-09-15

    The somatotropic axis, the control system for growth hormone (GH) secretion and its endogenous factors involved in the regulation of metabolism and energy partitioning, has promising potentials for producing economically valuable traits in farm animals. Here we investigated single nucleotide polymorphisms (SNPs) of the genes of factors involved in the somatotropic axis for growth hormone (GH1), growth hormone receptor (GHR), ghrelin (GHRL), insulin-like growth factor 1 (IGF-I) and leptin (LEP), using polymerase chain reaction-single-strand conformation polymorphism (PCR-SSCP) and DNA sequencing methods in 452 individual Mehraban sheep. A nonradioactive method to allow SSCP detection was used for genomic DNA and PCR amplification of six fragments: exons 4 and 5 of GH1; exon 10 of GH receptor (GHR); exon 1 of ghrelin (GHRL); exon 1 of insulin-like growth factor-I (IGF-I), and exon 3 of leptin (LEP). Polymorphisms were detected in five of the six PCR products. Two electrophoretic patterns were detected for GH1 exon 4. Five conformational patterns were detected for GH1 exon 5 and LEP exon 3, and three for IGF-I exon 1. Only GHR and GHRL were monomorphic. Changes in protein structures due to variable SNPs were also analyzed. The results suggest that Mehraban sheep, a major breed that is important for the animal industry in Middle East countries, has high genetic variability, opening interesting prospects for future selection programs and preservation strategies. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Single nucleotide polymorphisms in the Mycobacterium bovis genome resolve phylogenetic relationships

    USDA-ARS's Scientific Manuscript database

    Mycobacterium bovis isolates carry restricted allelic variation yet exhibit a range of disease phenotypes and host preferences. Conventional genotyping methods target small hyper-variable regions of their genome and provide anonymous biallelic information insufficient to develop phylogeny. To resolv...

  12. Genome-Wide Sequence Variation Identification and Floral-Associated Trait Comparisons Based on the Re-sequencing of the ‘Nagafu No. 2’ and ‘Qinguan’ Varieties of Apple (Malus domestica Borkh.)

    PubMed Central

    Xing, Libo; Zhang, Dong; Song, Xiaomin; Weng, Kai; Shen, Yawen; Li, Youmei; Zhao, Caiping; Ma, Juanjuan; An, Na; Han, Mingyu

    2016-01-01

    Apple (Malus domestica Borkh.) is a commercially important fruit worldwide. Detailed information on genomic DNA polymorphisms, which are important for understanding phenotypic traits, is lacking for the apple. We re-sequenced two elite apple varieties, ‘Nagafu No. 2’ and ‘Qinguan,’ which have different characteristics. We identified many genomic variations, including 2,771,129 single nucleotide polymorphisms (SNPs), 82,663 structural variations (SVs), and 1,572,803 insertion/deletions (INDELs) in ‘Nagafu No. 2’ and 2,262,888 SNPs, 63,764 SVs, and 1,294,060 INDELs in ‘Qinguan.’ The ‘SNP,’ ‘INDEL,’ and ‘SV’ distributions were non-random, with variation-rich or -poor regions throughout the genomes. In ‘Nagafu No. 2’ and ‘Qinguan’ there were 171,520 and 147,090 non-synonymous SNPs spanning 23,111 and 21,400 genes, respectively; 3,963 and 3,196 SVs in 3,431 and 2,815 genes, respectively; and 1,834 and 1,451 INDELs in 1,681 and 1,345 genes, respectively. Genetic linkage maps of 190 flowering genes associated with multiple flowering pathways in ‘Nagafu No. 2,’ ‘Qinguan,’ and ‘Golden Delicious,’ identified complex regulatory mechanisms involved in floral induction, flower bud formation, and flowering characteristics, which might reflect the genetic variation of the flowering genes. Expression profiling of key flowering genes in buds and leaves suggested that the photoperiod and autonomous flowering pathways are major contributors to the different floral-associated traits between ‘Nagafu No. 2’ and ‘Qinguan.’ The genome variation data provided a foundation for the further exploration of apple diversity and gene–phenotype relationships, and for future research on molecular breeding to improve apple and related species. PMID:27446138

  13. Single-feature polymorphism discovery in the barley transcriptome

    PubMed Central

    Rostoks, Nils; Borevitz, Justin O; Hedley, Peter E; Russell, Joanne; Mudie, Sharon; Morris, Jenny; Cardle, Linda; Marshall, David F; Waugh, Robbie

    2005-01-01

    A probe-level model for analysis of GeneChip gene-expression data is presented which identified more than 10,000 single-feature polymorphisms (SFP) between two barley genotypes. The method has good sensitivity, as 67% of known single-nucleotide polymorphisms (SNP) were called as SFPs. This method is applicable to all oligonucleotide microarray data, accounts for SNP effects in gene-expression data and represents an efficient and versatile approach for highly parallel marker identification in large genomes. PMID:15960806

  14. Analysis of single nucleotide variants of HFE gene and association to survival in The Cancer Genome Atlas GBM data.

    PubMed

    Lee, Sang Y; Zhu, Junjia; Salzberg, Anna C; Zhang, Bo; Liu, Dajiang J; Muscat, Joshua E; Langan, Sara T; Connor, James R

    2017-01-01

    Human hemochromatosis protein (HFE) is involved in iron metabolism. Two major HFE polymorphisms, H63D and C282Y, have been associated with an increased risk of cancers. Previously, we reported decreased gender effects in overall survival based on H63D or C282Y HFE polymorphisms in patients with glioblastoma multiforme (GBM). However, the effect of other single nucleotide variants (SNVs) in the HFE gene on cancer development and progression has not been systematically studied. To expand our finding in a larger sample, and to identify other HFE SNVs, we analyzed the frequency of somatic SNVs in the HFE gene and its relationship to survival in GBM patients using The Cancer Genome Atlas (TCGA) GBM (Caucasian only) database. We found 9 SNVs with increased frequency in blood normal of TCGA GBM patients compared to the 1000Genome. Among the 9 SNVs, 7 were located in introns and 2 (i.e., H63D, C282Y) in exons of the HFE gene. The statistical analysis demonstrated that blood normal samples of TCGA GBM have more H63D (p = 0.0002, 95% Confidence interval (CI): 0.2119-0.3223) or C282Y (p = 0.0129, 95% CI: 0.0474-0.1159) HFE polymorphisms than 1000Genome. The Kaplan-Meier survival curve for the 264 GBM samples revealed no difference between wild type (WT) HFE and H63D, and WT HFE and C282Y GBM patients. In addition, there was no difference in the survival of male/female GBM patients based on HFE genotype. There was no correlation between HFE expression and survival. In conclusion, the current results suggest that somatic HFE polymorphisms do not impact GBM patients' survival in the TCGA data set of GBM.
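
    The Kaplan-Meier survival curves mentioned in the abstract above rest on a simple product-limit calculation, sketched below in plain Python. The function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the TCGA GBM data set; comparing two curves, as the study does, would additionally need a test such as the log-rank test, which is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each subject
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
example_times = [5, 8, 8, 12, 15, 20]
example_events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(example_times, example_events):
    print(f"t={t}: S(t)={s:.3f}")
```

    Censored subjects contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimate from a simple fraction of survivors.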

  15. Genome-wide association study identifies phospholipase C zeta 1 (PLCz1) as a stallion fertility locus in Hanoverian warmblood horses.

    PubMed

    Schrimpf, Rahel; Dierks, Claudia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2014-01-01

    A consistently high level of stallion fertility plays an economically important role in modern horse breeding. We performed a genome-wide association study for estimated breeding values of the paternal component of the pregnancy rate per estrus cycle (EBV-PAT) in Hanoverian stallions. A total of 228 Hanoverian stallions were genotyped using the Equine SNP50 Beadchip. The most significant association was found on horse chromosome 6 for a single nucleotide polymorphism (SNP) within phospholipase C zeta 1 (PLCz1). In the close neighbourhood to PLCz1 is located CAPZA3 (capping protein (actin filament) muscle Z-line, alpha 3). The gene PLCz1 encodes a protein essential for spermatogenesis and oocyte activation through sperm induced Ca2+-oscillation during fertilization. We derived equine gene models for PLCz1 and CAPZA3 based on cDNA and genomic DNA sequences. The equine PLCz1 had four different transcripts of which two contained a premature termination codon. Sequencing all exons and their flanking sequences using genomic DNA samples from 19 Hanoverian stallions revealed 47 polymorphisms within PLCz1 and one SNP within CAPZA3. Validation of these 48 polymorphisms in 237 Hanoverian stallions identified three intronic SNPs within PLCz1 as significantly associated with EBV-PAT. Bioinformatic analysis suggested regulatory effects for these SNPs via transcription factor binding sites or microRNAs. In conclusion, non-coding polymorphisms within PLCz1 were identified as conferring stallion fertility and PLCz1 as candidate locus for male fertility in Hanoverian warmblood. CAPZA3 could be eliminated as candidate gene for fertility in Hanoverian stallions.

  16. Genome-Wide Association Study Identifies Phospholipase C zeta 1 (PLCz1) as a Stallion Fertility Locus in Hanoverian Warmblood Horses

    PubMed Central

    Schrimpf, Rahel; Dierks, Claudia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2014-01-01

    A consistently high level of stallion fertility plays an economically important role in modern horse breeding. We performed a genome-wide association study for estimated breeding values of the paternal component of the pregnancy rate per estrus cycle (EBV-PAT) in Hanoverian stallions. A total of 228 Hanoverian stallions were genotyped using the Equine SNP50 Beadchip. The most significant association was found on horse chromosome 6 for a single nucleotide polymorphism (SNP) within phospholipase C zeta 1 (PLCz1). In the close neighbourhood to PLCz1 is located CAPZA3 (capping protein (actin filament) muscle Z-line, alpha 3). The gene PLCz1 encodes a protein essential for spermatogenesis and oocyte activation through sperm induced Ca2+-oscillation during fertilization. We derived equine gene models for PLCz1 and CAPZA3 based on cDNA and genomic DNA sequences. The equine PLCz1 had four different transcripts of which two contained a premature termination codon. Sequencing all exons and their flanking sequences using genomic DNA samples from 19 Hanoverian stallions revealed 47 polymorphisms within PLCz1 and one SNP within CAPZA3. Validation of these 48 polymorphisms in 237 Hanoverian stallions identified three intronic SNPs within PLCz1 as significantly associated with EBV-PAT. Bioinformatic analysis suggested regulatory effects for these SNPs via transcription factor binding sites or microRNAs. In conclusion, non-coding polymorphisms within PLCz1 were identified as conferring stallion fertility and PLCz1 as candidate locus for male fertility in Hanoverian warmblood. CAPZA3 could be eliminated as candidate gene for fertility in Hanoverian stallions. PMID:25354211

  17. Personalized Medicine in a New Genomic Era: Ethical and Legal Aspects.

    PubMed

    Shoaib, Maria; Rameez, Mansoor Ali Merchant; Hussain, Syed Ather; Madadin, Mohammed; Menezes, Ritesh G

    2017-08-01

    The genome of two completely unrelated individuals is quite similar apart from minor variations called single nucleotide polymorphisms which contribute to the uniqueness of each and every person. These single nucleotide polymorphisms are of great interest clinically as they are useful in figuring out the susceptibility of certain individuals to particular diseases and for recognizing varied responses to pharmacological interventions. This gives rise to the idea of 'personalized medicine' as an exciting new therapeutic science in this genomic era. Personalized medicine suggests a unique treatment strategy based on an individual's genetic make-up. Its key principles revolve around applied pharmaco-genomics, pharmaco-kinetics and pharmaco-proteomics. Herein, the ethical and legal aspects of personalized medicine in a new genomic era are briefly addressed. The ultimate goal is to comprehensively recognize all relevant forms of genetic variation in each individual and be able to interpret this information in a clinically meaningful manner within the ambit of ethical and legal considerations. The authors of this article firmly believe that personalized medicine has the potential to revolutionize the current landscape of medicine as it makes its way into clinical practice.

  18. Genome-wide association and genomic prediction identifies associated loci and predicts the sensitivity of Tobacco ringspot virus in soybean plant introduction

    USDA-ARS's Scientific Manuscript database

    The genome-wide association study (GWAS) is a useful tool for detecting and characterizing traits of interest including those associated with disease resistance in soybean. The availability of 50,000 single nucleotide polymorphism (SNP) markers (SoySNP50K iSelect BeadChip; www.soybase.org) on 19,652...

  19. High-Throughput SNP Discovery through Deep Resequencing of a Reduced Representation Library to Anchor and Orient Scaffolds in the Soybean Whole Genome Sequence

    USDA-ARS's Scientific Manuscript database

    The soybean Consensus Map 4.0 facilitated the anchoring of 95.6% of the soybean whole genome sequence developed by the Joint Genome Institute, Department of Energy but only properly oriented 66% of the sequence scaffolds. To find additional single nucleotide polymorphism (SNP) markers for additiona...

  20. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172
